Open emiliom opened 5 years ago
@emiliom, I really like this idea, as I think it is fundamental to the purpose of having a functional and performant Python API. I think the extra requirements are a small price to pay for also getting efficient and tested I/O capabilities.
Thanks for continuing to move this critical repo forward.
@emiliom - I don't have a really strong feeling about this other than I think we should be very careful about adding additional requirements and complexity. My feeling is that we never finished the core functionality and so adding additional functionality and dependencies should perhaps be secondary to firming up the foundation.
Utility functions would be nice. Is there ongoing work that's driving this?
@horsburgh, good question. I'm also interested in hearing what is motivating this work!
I agree with the points about managing complexity and the need to better develop core functionality. I also believe that, given that Pandas has become a core part of the standard Python computational science and data science stack, we should consider strong integration with Pandas and GeoPandas as core functionality. This is especially true given that one of the highest priorities we've heard from users and potential users is to improve I/O performance (including data alignment and slicing), which is one of the main purposes and advantages of using Pandas.
For my own future reference, to be moved into new issues when I'm ready to work on this stuff.
From the WaterQualityMeasurements_RetrieveVisualize.ipynb example in the odm2api documentation.
tsValues = read.getResultValues(resultids=[1], lowercols=False)
# Set the index to ValueDateTime for convenience, then sort chronologically.
tsValues.set_index('ValueDateTime', inplace=True)
tsValues.sort_index(inplace=True)
And to conveniently unpack relevant metadata on variable names and units, use something like tsResult.VariableObj.VariableNameCV and tsResult.UnitsObj.UnitsAbbreviation.
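As a rough sketch of what a utility function for the time series step might look like: the helper name to_time_indexed is hypothetical (it is not part of odm2api), and the small DataFrame below is made-up data standing in for what read.getResultValues returns, assuming it has ValueDateTime and DataValue columns. It returns a new frame rather than modifying the input in place.

```python
import pandas as pd


def to_time_indexed(values, time_col="ValueDateTime"):
    """Return a copy of `values` indexed by `time_col` and sorted by time.

    Hypothetical helper, not an odm2api function.
    """
    out = values.copy()
    # Make sure the column holds real timestamps rather than strings.
    out[time_col] = pd.to_datetime(out[time_col])
    return out.set_index(time_col).sort_index()


# Made-up rows standing in for
# read.getResultValues(resultids=[1], lowercols=False)
raw = pd.DataFrame(
    {
        "ValueDateTime": ["2015-06-02 00:00", "2015-06-01 00:00", "2015-06-03 00:00"],
        "DataValue": [7.9, 8.1, 8.0],
    }
)

ts_values = to_time_indexed(raw)
```

Returning a copy keeps the raw query result available for other uses, at the cost of some extra memory for large result sets.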
Starting point for ingesting Sites into a GeoDataFrame. From the WaterQualityMeasurements_RetrieveVisualize.ipynb example in the odm2api documentation.
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

# Get all of the SamplingFeatures from the ODM2 database that are Sites
siteFeatures = read.getSamplingFeatures(sftype='Site')
# Read Sites records into a Pandas DataFrame
# ("if sf.Latitude" is used only to instantiate/read Site attributes)
df = pd.DataFrame.from_records([vars(sf) for sf in siteFeatures if sf.Latitude])
# Create a GeoPandas GeoDataFrame from Sites DataFrame
ptgeom = [Point(xy) for xy in zip(df['Longitude'], df['Latitude'])]
gdf = gpd.GeoDataFrame(df, geometry=ptgeom, crs='EPSG:4326')
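A possible refinement of the DataFrame step, sketched with pandas only. FakeSite is a hypothetical stand-in for the Site objects (it does not reproduce any lazy attribute loading), and sites_to_frame is a made-up helper name. The sketch assumes the real objects are SQLAlchemy-mapped, in which case vars(sf) also carries an internal _sa_instance_state entry that is probably not wanted as a column. It also filters on missing coordinates explicitly, since a plain truthiness test on Latitude would skip a site at exactly 0.0.

```python
import pandas as pd


class FakeSite:
    """Hypothetical stand-in for a Site object returned by the API."""

    def __init__(self, code, latitude, longitude):
        self.SamplingFeatureCode = code
        self.Latitude = latitude
        self.Longitude = longitude
        # Mimics the bookkeeping attribute SQLAlchemy adds to mapped objects.
        self._sa_instance_state = object()


def sites_to_frame(sites):
    """Build a DataFrame of public attributes, dropping sites without coordinates."""
    records = [
        {k: v for k, v in vars(sf).items() if not k.startswith("_")}
        for sf in sites
    ]
    df = pd.DataFrame.from_records(records)
    # Drop rows where either coordinate is missing (None becomes NaN).
    return df.dropna(subset=["Latitude", "Longitude"]).reset_index(drop=True)


sites = [
    FakeSite("A", 41.7, -111.8),
    FakeSite("B", None, None),   # no coordinates: dropped
    FakeSite("C", 0.0, -78.5),   # on the equator: kept
]
site_df = sites_to_frame(sites)
```

The resulting frame could then be passed to the GeoDataFrame construction shown in the snippet.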
@emiliom, thanks for all your work on this!