"Cannot open data source" error after dropna

kadirsahbaz commented 2 years ago

I use the following lines to open a shapefile as a GeoDataFrame, drop NaN values, and pass the resulting GeoDataFrame to pygeoda. But after dropna, pygeoda throws this error: ValueError: pygeoda can't open current data source. Please use either a file path of an ESRI shapefile or a GeoPandas instance. However, both gdf1 and gdf2 are GeoDataFrames with many rows.

import geopandas as gpd
import pygeoda

gdf1 = gpd.read_file("/home/user/test.shp")
print(type(gdf1))
# OUT <class 'geopandas.geodataframe.GeoDataFrame'>
data1 = pygeoda.open(gdf1) # No Error here

gdf2 = gdf1.dropna()
print(type(gdf2))
# OUT <class 'geopandas.geodataframe.GeoDataFrame'>
data2 = pygeoda.open(gdf2) # ERROR

geopandas v0.9.0 and v0.10.0; pygeoda v0.0.8-1
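One visible difference between gdf1 and gdf2 is the index: dropna() keeps the original row labels, so the index is no longer a contiguous 0..n-1 range. The thread does not confirm that this is what pygeoda trips over, so treat the check below as a diagnostic sketch only. It uses a plain pandas DataFrame as a stand-in for the GeoDataFrame, and has_default_index is a hypothetical helper name, not part of pandas, geopandas, or pygeoda.

```python
import pandas as pd


def has_default_index(df: pd.DataFrame) -> bool:
    """Return True if the index is exactly 0, 1, ..., len(df) - 1."""
    # Hypothetical helper for this sketch; not a pandas or pygeoda function.
    return df.index.equals(pd.RangeIndex(len(df)))


# Stand-in for gdf1: one row contains a missing value.
df1 = pd.DataFrame({"value": [1.0, None, 3.0]})

# dropna() removes the row but keeps the labels of the remaining rows.
df2 = df1.dropna()

print(has_default_index(df1))  # index before dropna
print(has_default_index(df2))  # index after dropna
print(list(df2.index))         # remaining row labels
```

If the second check comes back False for gdf2, the gap in the index is a candidate cause worth ruling out.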

lixun910 commented 2 years ago

From a previous issue: could you try something like new_gdf = gdf2.set_index("geometry").reset_index() after dropna()?

kadirsahbaz commented 2 years ago

But I need to keep the indices to join a column produced in gdf2 to gdf1.

Also, set_index("geometry") doesn't work if the GeoDataFrame has several features with the same geometry, for example POINT (0 0). Am I wrong?
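A possible way to get a fresh index while still being able to join back, sketched under the assumption that the non-contiguous index left by dropna() is what matters (the thread does not confirm this). The idea is to save the original row labels in a column before resetting the index. The example uses a plain pandas DataFrame as a stand-in for the GeoDataFrame, and "orig_index" and "result" are hypothetical column names. On the duplicate question: in plain pandas, set_index accepts duplicate values unless verify_integrity=True is passed; whether the same holds for a geometry column was not tested here.

```python
import pandas as pd

# Stand-in for gdf1: one row contains a missing value.
df1 = pd.DataFrame({"value": [1.0, None, 3.0, 4.0]})

# dropna() keeps the original labels, so the index is now 0, 2, 3.
df2 = df1.dropna()

# Save the original labels in a column and get a fresh 0..n-1 index.
# "orig_index" is a hypothetical column name chosen for this sketch.
df2_clean = df2.rename_axis("orig_index").reset_index()

# Placeholder for a column computed from the cleaned frame.
df2_clean["result"] = df2_clean["value"] * 10

# Join the new column back to the original frame via the saved labels.
# Rows that were dropped get NaN because they have no matching label.
df1["result"] = df2_clean.set_index("orig_index")["result"]

print(df1)
```

With a real GeoDataFrame, df2_clean would be the object to try passing to pygeoda.open, and the last step brings the computed column back to gdf1 without relying on geometry as a key.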

athuler commented 8 months ago

I am having the same issue - did you figure out a workaround?

GeoDaCenter / pygeoda

"Cannot open data source" error after dropna #21