caseysmithpgh closed this issue 1 year ago
@caseysmithpgh I don't see either of the BBLs you listed in the DTM that's on Digital Ocean, which would explain why they are not in the reports. I'm happy to hop on a call to talk through and visualize your process.
@AmandaDoyle interesting. Ok, I will take a look through the previous DTM and current DTM (and those that fell in between) to try to isolate the issue.
Hey Amanda, we just ran some double checks and are seeing the two BBLs in the latest DTM. We've checked most of the things we can think of on this end, and next steps would probably be to see what DE can dig up or to sit down together and figure out where we're going wrong.
This image is from ArcGIS and shows the data source in Cyberduck, the file pulled up in the map view, and the lots highlighted. Trying to be explicit with a screenshot in case we're looking at different datasets or something.
Also +@croswell81
@jackrosacker @caseysmithpgh I was looking at the dof_dtm dataset in edm-recipes here. I'm happy to meet to talk this through (unfortunately today is not a good day with meetings). If I'm not free, please feel free to troubleshoot with someone in DE.
@damonmcc for awareness and he'll be point of contact
gonna make sure the dof_dtm data used by the build action (in edm-recipes) is identical to the source data (in edm-publishing)
if it isn't, I'll run our archiving and start a ZTLB build to see if that was the root cause 🤞🏾
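For reference, a minimal sketch of one way to do that kind of "is it identical" check: hash both copies and compare digests. It assumes both files have already been downloaded locally and are in the same format (a byte comparison says nothing useful about a shapefile versus a sql dump). The file names in the example are hypothetical placeholders, not real paths in edm-recipes or edm-publishing.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_identical(first: Path, second: Path) -> bool:
    """True when both files have the same size and the same SHA-256 digest."""
    if first.stat().st_size != second.stat().st_size:
        return False  # different sizes can never be identical
    return file_sha256(first) == file_sha256(second)


# Tiny demo with throwaway files (hypothetical names, not real datasets).
with tempfile.TemporaryDirectory() as tmp:
    recipes_copy = Path(tmp) / "recipes_copy.bin"
    publishing_copy = Path(tmp) / "publishing_copy.bin"
    recipes_copy.write_bytes(b"same bytes")
    publishing_copy.write_bytes(b"same bytes")
    print(files_identical(recipes_copy, publishing_copy))
```

The size check is only a shortcut; the digest comparison is what decides the result.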
seeing significant differences between input data, specifically dof_dtm:
- dof_dtm appears to have 858,327 rows
- dof_dtm appears to have 858,317 rows
- dof_dtm appears to have 394,957 rows

from asking Max, the # of rows shouldn't change so much
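A drop like this could be caught automatically before a build runs. Below is a rough sketch of that idea, not something the build currently does as far as this thread shows: compare each version's row count to the previous one and flag any large relative drop. The 5% threshold and the function name are assumptions chosen for illustration.

```python
def flag_row_count_drops(counts, max_drop_fraction=0.05):
    """Return (index, previous, current, drop_fraction) for each suspicious drop.

    A drop is suspicious when the row count falls by more than
    max_drop_fraction relative to the previous version.
    """
    flagged = []
    for i in range(1, len(counts)):
        previous, current = counts[i - 1], counts[i]
        if previous <= 0:
            continue  # nothing meaningful to compare against
        drop_fraction = (previous - current) / previous
        if drop_fraction > max_drop_fraction:
            flagged.append((i, previous, current, drop_fraction))
    return flagged


# Row counts reported above, in the order listed.
dof_dtm_counts = [858_327, 858_317, 394_957]

for index, previous, current, drop in flag_row_count_drops(dof_dtm_counts):
    print(f"version #{index}: {previous:,} -> {current:,} rows ({drop:.1%} drop)")
```

With these three counts, only the last version is flagged: the 10-row change between the first two is far below the threshold, while the fall to 394,957 rows is a drop of roughly half.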
@jackrosacker the screenshot of the recent DOF DTM data you all have been uploading monthly to edm-publishing:
@damonmcc I'm looking at the datasets pre-Digital Ocean upload and I'm seeing the following counts:

| date | row count | column count |
|---|---|---|
| 2022-12-30 | 858,328 | 33 |
| 2023-01-27 | 858,318 | |
| 2023-03-03 | 858,279 | 33 |
Edit: fixed to show 1/27 instead of 2/3
(for GIS internal reference, all three above are from the scrape output locations on M:\DOF_Tax_Maps)
@damonmcc I also confirmed the row counts for the corresponding shapefiles in edm-publishing for the three dates, all counts are the same as for the feature classes in the table above
I'm currently trying to convert the shapefile to a sql file locally, using as little data-library code as possible to minimize the likelihood of losing rows, so we can quickly upload the data needed to build ZTLB
so far, 20220429 appears to be the last source dataset in edm-publishing. 20220603 is the very next dataset and has the "half file" issue we're seeing
use of QGIS to reproject and convert a shapefile seemed feasible, but a build with the result failed because of how QGIS structures the sql file (every line is a long INSERT statement rather than our usual lines of text)
reprojecting with command line tools (ogr2ogr and shp2pgsql) is in progress and hopefully produces a sql file the ZTLB build process can use
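For anyone following along, here is a hedged sketch of what a two-step ogr2ogr then shp2pgsql conversion can look like. It only builds and prints the commands. The target projection (EPSG:4326), the file names, and the table name are all assumptions for illustration; the thread does not say which values were actually used.

```python
import shlex


def build_reproject_cmd(src_shp, dst_shp, target_srs="EPSG:4326"):
    """Build an ogr2ogr command that reprojects a shapefile.

    ogr2ogr takes the destination first, then the source.
    target_srs is an assumed value, not one confirmed in this thread.
    """
    return ["ogr2ogr", "-f", "ESRI Shapefile", "-t_srs", target_srs, dst_shp, src_shp]


def build_shp2pgsql_cmd(shp, table, srid=4326):
    """Build a shp2pgsql command; shp2pgsql writes the SQL to stdout."""
    return ["shp2pgsql", "-s", str(srid), shp, table]


# Hypothetical file and table names, for illustration only.
reproject = build_reproject_cmd("dof_dtm_source.shp", "dof_dtm_reprojected.shp")
to_sql = build_shp2pgsql_cmd("dof_dtm_reprojected.shp", "dof_dtm")

# Print the commands instead of running them, so nothing is executed here.
print(shlex.join(reproject))
print(shlex.join(to_sql) + " > dof_dtm.sql")
```

To actually run them, each list can be passed to `subprocess.run(..., check=True)`, with the stdout of the shp2pgsql step redirected to the sql file.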
update: seems promising!
very promising. here's the current approach
- reprojecting with ogr2ogr, converting to sql with shp2pgsql
- uploading to edm-recipes in two folders: latest and the latest date, since the latter is what the build actually pulls in

most recently, the build failed because some geometries are invalid. fixing by reprojecting with the --makevalid flag
but got this error
ERROR 1: Attempt to write non-polygon (LINESTRING) geometry to POLYGON type shapefile.
ERROR 1: Unable to write feature 394957 from layer dof_dtm_tax_lot_polygon.
@jackrosacker @caseysmithpgh a successful build!
with new exports in edm-publishing/db-zoningtaxlots/latest/, would love to have extra eyes on inspecting the results. perhaps worth clarifying February vs March releases since I'm not sure which one we're on and whether the input data for this build was ideal
@damonmcc @jackrosacker I took a look at edm-publishing/db-zoningtaxlots/latest/qc_bbldiffs.csv and the record count is significantly higher and more in line with what we would typically expect. The missing BBLs that originally tipped us off to the issue are also included, so on that front this output looks good to me!
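A check like the one described above can also be scripted. This is a minimal sketch, assuming the CSV has a column named `bbl` (the real column name in qc_bbldiffs.csv is not stated in this thread). The BBL values below are made-up placeholders, not the two BBLs from this issue.

```python
import csv
import io


def missing_bbls(csv_file, expected_bbls, bbl_column="bbl"):
    """Return the expected BBLs that do not appear in the CSV, sorted.

    csv_file is any open text file object. bbl_column is an assumed name.
    """
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or bbl_column not in reader.fieldnames:
        raise KeyError(f"column {bbl_column!r} not found in CSV header")
    present = {(row[bbl_column] or "").strip() for row in reader}
    return sorted(set(expected_bbls) - present)


# Placeholder data standing in for qc_bbldiffs.csv.
sample = io.StringIO("bbl,zoningdistrict1\n1000010001,R6\n1000010002,C4-4\n")

# Placeholder BBLs: the first is in the sample, the second is not.
print(missing_bbls(sample, ["1000010001", "1000010099"]))
```

An empty list means every expected BBL was found. BBLs are compared as strings so that formatting differences (such as a trailing space) are the only thing normalized.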
Many thanks Damon!
@damonmcc confirming that I'm clear to QA items that are in edm-publishing/db-zoningtaxlots/latest/
@caseysmithpgh yup!
and looks like this is meant to be for the February release so, in case it's helpful, here's a link to the source data versions in the build logs
per Data Update issue https://github.com/NYCPlanning/edm-overview/issues/866, ZTLDB has been QAed and pushed to Bytes. closing this issue as complete
Two BBLs that should be in the Feb. open data qc_bbldiffs file were not included.
Findings from our initial investigation:
The image below shows the new Feb. Zoning Map Amendment (blue highlighted selection). Lots intersecting this ZMA should be included in the qc_bbldiffs layer, but are not. Happy to follow up with a screen share or in-person walk-through of the issue.