Closed martinvonk closed 1 week ago
This is the reason why #209 fails
Happened to me after updating pandas to v.2.2.2 Updating pandas (2.2.1 -> 2.2.2)
Not sure if we need to do something with this. #209 is already fixed. Maybe if behavior keeps existing after v2.2.2 we need to adress this.
It has become an issue in the bro_bronhouder notebook now as well. I think it has something to do with this error: https://github.com/pandas-dev/pandas/issues/57032
Geopandas had the same problem: https://github.com/geopandas/geopandas/issues/3060
For now I will pin the pandas version to 2.2.1 or lower because this also gives errors in nlmod.
It seems like a change in pandas behavior that caused this issue can actually help to improve hydropandas.
Behavior pandas <=2.2.1:
oc.loc[:, ["x", "y"]]
returned an ObsCollection. Just like in geopandas:
gdf.loc[:, ['id']]
would return a GeoDataframe
Recent changes to pandas (=2.2.2) changed this behavior in such a way that:
oc.loc[:, ["x", "y"]]
returns a DataFrame. Due to this behavior the ObsCollection constructor would attach an Obs column to it. I was quite surprised that the loc function calls the constructor when it returns an object but apparently it does.
I added an extra check to the constructor to see if it was called using a DataFrame and an obs_list/ObsClass. Only if one of the obs_list/ObsClass is given the 'obs' column is attached.
We should improve hydropandas further by following the behavior of Geopandas. If you slice an ObsCollection without the 'obs' column a DataFrame should be returned. Now an ObsCollection without the 'obs' column is returned which is an invalid ObsCollection.
We could even think about supporting an ObsSeries. Atm when you obtain a single column from an ObsCollection it will always return a pandas Series. For now I don't really like to do this. I think this adds more headache than hooray.
Not sure why this happens but when I use
loc
on the ObsCollection the column with the "obs" is always returnedSame happens with iloc
oc.iloc[:, [0,1]]