Open tfardet opened 1 year ago
So far I tried not to use geopandas
as a dependency, as it is sometimes not easy to install.
I am not really keen on having it as a dependency.
What would be the advantages of having it as a dependency?
Here it would be an optional dependency, i.e. it is not required to install pynsee
, but if available, it is used.
The advantage is that people that have geopandas
installed directly get a GeoDataFrame
with all its capabilities, without having to do the extra conversion step.
It may seem somewhat minor but the extra maintenance cost is also zero for the minimal implementation.
This also comes from the fact that I would assume that most people getting a DataFrame
with a "geometry" column expect it to be a GeoDataFrame
or a subclass, and thus expect all the GeoDataFrame
/GeoSeries
methods to work.
With this implementation, we do not break any expectations (people not expecting anything do not get less than they expected, people expecting the GeoDataFrame get it).
I would find it a bit confusing to have sometimes an output that is a geopandas.GeoDataFrame
and sometimes an output that is pynsee.GeoFrDataFrame
depending on the dependencies. Such feature would mean that if a bug happens it is not straightforward to know whether it is related to geopandas
or not. As a maintainer I would prefer to not have to answer this question. I think the extra step is not a big one, at least it makes it explicit that we are not handling geopandas
object.
In the long run, if this need is expressed by more users, why not considering it, but tbh for now I think there are already enough issues to tackle.
Hi @hadrilec , I'm not sure your statement about geopandas is still relevant. I found that geopandas
installation was efficiently simplified due to the release of whl files for Fiona
on pypi less than 6 months ago. It sure could be a pain before that (particularly on windows), but I don't think that's the case anymore.
The only problems I'm still encountering with geopandas are linked to conflicts between gdal
(osgeo) and fiona
(which is another GDAL binding for python). There are cases where gdal
is mandatory (for instance, to activate geodjango
) and depending on geopandas
could indeed be problematic. (I still have to see the consequences of using both libraries...)
In any case, this is very specific and I'm not sure we need to concern ourselves with geopandas
's installation anymore...
I'm bumping this issue once more, pushing for the addition of geopandas because:
save_df
and making it faster because we could pass the library instead and load geoparquet directly with geopandas, as GeoDataFrame, rather than having to load via pandas and then cast into another class.@hadrilec could you please check again on the computer from the INSEE so we can move forward with this if you confirm there are no installation issues?
I will give it another try
EDIT: given the current state of geopandas packaging via wheels, I propose adding it as a real dependency and removing
GeoFrDataFrame
I see that
geopandas
was removed on purpose as a dependency frompynsee
in from #6 Though I understand the rationale, I do wonder whether, in the long run, it would not make sense to make it an optional dependency such thatGeoFrDataFrame
is a subset ofGeoDataFrame
ifgeopandas
is not available but is actually aGeoDataFrame
if geopandas is available.In terms of import, one would replace the import in
pynsee/geodata/__init__.py
byThis does not imply any breaking change as far as I can tell: we could keep the
get_geom
/translate
/zoom
functions that do not exist in geopandas (though it might be worth making a customGeoFrSeries
for improved compatibility).Opinions on that?