Closed td928 closed 2 years ago
Okay, it seems I am still having problems installing geopandas inside the container. The error message says a GDAL API version must be specified. I will try to investigate it first.
I think a basemap would be a nice improvement, but I don't think it needs to be implemented in this PR. As is, it would show us if there are any records that are egregiously outside NYC's boundaries. After reviewing some of the records in the table that fall outside the NYC boundaries (water included), though, they should be getting captured as being within NYC. I don't know how much time should be spent on this, as it's only ~40 records and they are mapping in the Capital Planning Explorer. I included a screenshot of a record that I looked at.
@Oysters1874 I didn't have any issues installing geopandas in the container, let me know if you still have issues
Went with a slightly different implementation for the shapefile conflict issue, but the concept should be similar. Add a random feature branch for testing and let me know if it makes sense, @abrieff. Thanks!
thank you so much! I have figured it out. Now it works.
looks good on my side as well
One tiny thing: there are two 'the' at the beginning of the description.
86 at least two reviewers required 🏘️ (I am not keeping the emoji completely straight yet, but this is more than 1, I am guessing?)
Overview
The last piece of the current CPDB QAQC iteration: visualizing the two main geometry files to assess whether something is out of whack, as we saw previously after the snap-to-grid fix. It took longer than I anticipated, not because of the visualization itself but because I took on refactoring the data ingestion process.
geometry_visualization_report
I did not end up streaming the files directly from DO; instead I opted to save the extracted geometry files locally and then read them into geopandas. Once this is done, the visualization itself is pretty straightforward with geopandas' plotting functionality. I do wonder whether adding a basemap would be an improvement, but that requires some thought about which tool the basemap should come from. My go-to is plotly, but plotly does not like plotting shapefiles, so some conversion of the polygons would be required. If my thinking on this point is more convoluted than it needs to be, please let me know.
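For reviewers who want the shape of the save-locally-then-read approach, here is a minimal sketch. It is not the code in this PR: the function names `extract_shapefiles` and `plot_geometries`, the output directory, and the PNG output are all hypothetical, and it assumes the geometries arrive as a zip archive of shapefiles.

```python
import zipfile
from pathlib import Path


def extract_shapefiles(zip_path, out_dir):
    """Extract the archive into out_dir and return the .shp paths found.

    Hypothetical helper: the whole archive is extracted because a shapefile
    needs its sidecar files (.dbf, .shx, .prj) next to the .shp.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
    return sorted(out_dir.rglob("*.shp"))


def plot_geometries(shp_path, png_path):
    """Read one extracted shapefile with geopandas and save a static plot."""
    # Imported lazily so the extraction step works without the GDAL stack.
    import geopandas as gpd
    import matplotlib.pyplot as plt

    gdf = gpd.read_file(shp_path)
    fig, ax = plt.subplots(figsize=(8, 8))
    gdf.plot(ax=ax)  # geopandas' built-in matplotlib plotting, no basemap
    ax.set_axis_off()
    fig.savefig(png_path, dpi=150)
    plt.close(fig)
    return gdf
```

Keeping extraction separate from plotting means the local copy can be inspected or reused even when geopandas fails to install in the container.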
helpers.py
The helper functions are consolidated; they mostly concern the data ingestion process. I think in the end @abrieff's approach won out: getting zip files and getting CSVs now look much more alike, and the functions are dumber in that they take a single URL as their way to find the file in S3. On the other hand, the returned object is stored similarly to my other work and Jingyi's: a dictionary with the table names as keys.
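To illustrate the pattern described above (one URL in, a dictionary keyed by table name out), here is a rough sketch. The names `fetch_bytes`, `tables_from_zip`, and `get_tables` are hypothetical and are not the functions in helpers.py. It uses only the standard library and returns rows as lists of dicts, which is an assumption made to keep the example self-contained.

```python
import csv
import io
import zipfile
from pathlib import PurePosixPath
from urllib.request import urlopen


def fetch_bytes(url):
    """Download the object at a single URL and return its raw bytes."""
    with urlopen(url) as response:
        return response.read()


def tables_from_zip(zip_bytes):
    """Return {table_name: rows} for every CSV inside a zip archive.

    The table name is the CSV file name without its directory or extension.
    """
    tables = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue
            with archive.open(name) as handle:
                text = io.TextIOWrapper(handle, encoding="utf-8", newline="")
                tables[PurePosixPath(name).stem] = list(csv.DictReader(text))
    return tables


def get_tables(url):
    """The 'dumb' entry point: the URL is the only thing the caller supplies."""
    return tables_from_zip(fetch_bytes(url))
```

Splitting the download from the parsing keeps the parsing step testable without network access.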