rstudio / reticulate

R Interface to Python
https://rstudio.github.io/reticulate
Apache License 2.0

R crash when using View() for geopandas.GeoDataFrame #877

Open bohm0072 opened 4 years ago

bohm0072 commented 4 years ago

System details

RStudio Edition : Desktop
RStudio Version : Version 1.4.972
OS Version      : MacOS 10.15.7
R Version       : 4.0.3

Steps to reproduce the problem

library(tidyverse)
library(reticulate)

conda_create("reprex", packages = c("pandas", "geopandas", "shapely"))
use_condaenv(condaenv = "reprex", conda = "auto", required = TRUE)

pd <- import("pandas", convert = FALSE)
gpd <- import("geopandas", convert = FALSE)
shapely <- import("shapely", convert = FALSE)

gd <- list(
  "col1" = c(1, 2),
  "col2" = c(3, 4),
  "geometry" = c(shapely$geometry$Point(1, 2), shapely$geometry$Point(2, 1))
)
gdf <- gpd$GeoDataFrame(gd, crs = "EPSG:4326")

View(gdf)

Describe the problem in detail

When View() is called on a geopandas.GeoDataFrame object, the RStudio browser window returns r error 4 (R code execution error), while the Console returns:

Error in py_get_item_impl(x, key, FALSE) : 
  IndexError: index 2 is out of bounds for axis 0 with size 2

Detailed traceback: 
  File "/Users/Brian/Library/r-miniconda/envs/reprex/lib/python3.8/site-packages/geopandas/array.py", line 331, in __getitem__
    return GeometryArray(self.data[idx], crs=self.crs)

Describe the behavior you expected

The expected behavior is for View() to display a geopandas.GeoDataFrame in the same way as a pandas.DataFrame object (newly supported in RStudio 1.4). I am unsure whether this is truly a bug or just a not-yet-supported feature.

kevinushey commented 4 years ago

Thank you for taking the time to prepare a reproducible example like this -- it is greatly appreciated!

It looks like this is more generally an issue with conversion of these GeoDataFrames back to R:

> reticulate::py_to_r(gdf)
Warning in format.data.frame(if (omit) x[seq_len(n0), , drop = FALSE] else x,  :
  corrupt data frame: columns will be truncated or padded with NAs
  col1 col2                      geometry
1    1    3 <environment: 0x7f82cef057c0>
2    2    4                          <NA>

In particular, it looks like we don't know how to convert the GeometryArray object to an R object, and that gives us trouble.

> df$geometry
<GeometryArray>
[<shapely.geometry.point.Point>, <shapely.geometry.point.Point>]
Length: 2, dtype: geometry
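
Until the GeometryArray can be converted, one possible workaround is to encode the geometry column as text on the Python side before the frame is handed back to R. The following is only a sketch, not a fix in reticulate itself: the helper name `to_plain_frame` is hypothetical, and it assumes WKT strings are an acceptable stand-in for the geometry objects.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

# Same data as the reproducible example in the report.
gdf = gpd.GeoDataFrame(
    {"col1": [1, 2], "col2": [3, 4], "geometry": [Point(1, 2), Point(2, 1)]},
    crs="EPSG:4326",
)


def to_plain_frame(frame):
    """Hypothetical helper: copy a GeoDataFrame into a plain pandas
    DataFrame, replacing the active geometry column with WKT strings."""
    geom_col = frame.geometry.name
    # Drop the geometry column, then force the base pandas.DataFrame class.
    plain = pd.DataFrame(frame.drop(columns=geom_col))
    # Each shapely geometry exposes its WKT text through the .wkt property.
    plain[geom_col] = [geom.wkt for geom in frame.geometry]
    return plain


plain = to_plain_frame(gdf)
print(plain.dtypes)
```

The result holds only numeric and string columns, so it should convert like any ordinary pandas.DataFrame. The CRS and the geometry objects themselves are lost in the process, which is why this is a stopgap for inspecting the data rather than a real conversion.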