CityOfNewYork / nyc-planimetrics

New York City Planimetrics Data

Some layers might not follow OGC Standard? Issues in QGIS #47

Open mtreg opened 5 months ago

mtreg commented 5 months ago

Hi there - I was so excited to hear that the 2022 Planimetric data are available! I've been starting to explore/work with the data (from here), and this is fantastic!

An issue I've encountered (the only one) is that at least some layers do not render correctly when loaded into QGIS. I have not looked at all of the layers in detail, but both the Roadbed and Shoreline layers render okay when zoomed out and fail to render when I zoom in. For both of these layers, running the "Fix Geometries" algorithm with the "Linework" repair method seems to fix the issue, as best I can tell (with the caveat that I haven't taken a good look in ArcPro to compare, but the original layers seem to perform well there).

Here's a screenshot of that menu for easy reference: image

For the Roadbed layer, the "Force right-hand-rule" tool also seems to work (but I believe that is only applicable to polygons).
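For anyone curious what a ring-orientation fix amounts to, here is a minimal pure-Python sketch. It assumes the convention described in the QGIS documentation for this tool (exterior rings oriented clockwise) and uses the shoelace formula to detect orientation. The function names are hypothetical and this is not how QGIS implements the tool.

```python
from typing import List, Tuple

Ring = List[Tuple[float, float]]


def signed_area(ring: Ring) -> float:
    """Shoelace formula: positive for counter-clockwise rings, negative for clockwise."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around at the end.
    # For an already closed ring the wrap-around pair contributes zero.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_clockwise(ring: Ring) -> bool:
    return signed_area(ring) < 0


def force_clockwise(ring: Ring) -> Ring:
    """Return the ring unchanged if clockwise, otherwise reversed."""
    return ring if is_clockwise(ring) else ring[::-1]


# Hypothetical unit square digitized counter-clockwise.
ccw_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
fixed = force_clockwise(ccw_square)
print(is_clockwise(ccw_square), is_clockwise(fixed))
```

Interior rings (holes) would need the opposite orientation under the same convention; the sketch handles a single exterior ring only.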

Given the issues I'm seeing, and what fixes work, it seems like maybe these are not necessarily following the OGC standard geometry definition, but rather the Esri standard? I wonder if these issues could be addressed in the data available through official channels to support a broad base of users who may use a wide variety of tools.

Happy to connect directly or do additional testing if helpful. And in case it's helpful, here is the info on my QGIS install: image

Thanks! Mike

mattyschell commented 5 months ago

@mtreg thanks for reporting this! And also for investigating and resolving it. Resolved issues are the best issues.

I can confirm everything that you report. Downloading the planimetrics_2022.gdb from https://nyc.maps.arcgis.com/home/item.html?id=4b01b78d9eda44819f6c757ec00d0669

  1. ROADBED in the file geodatabase initially renders in QGIS but disappears when I zoom in
  2. All ROADBED geometries are reported valid using the ArcGIS Pro "Check Geometry" tool and Validation Method: ESRI
  3. Many ROADBED geometries are reported not valid using the ArcGIS Pro "Check Geometry" tool and Validation Method: OGC.
  4. Some of the OGC invalid problems are "non-simple" and "self-intersection" which are the types of OGC vs ESRI validity rules we expect to be irreconcilable.
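To illustrate the kind of defect a "self-intersection" report refers to, here is a rough pure-Python sketch that looks for two non-adjacent edges of a closed ring crossing each other (a "bow-tie"). It is only an illustration with hypothetical function names: it checks proper crossings and nothing else, so it is far narrower than the full OGC validity rules or whatever ArcGIS Pro and QGIS actually run.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Cross product of (b - a) and (c - a); the sign tells which side of ab the point c lies on."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _properly_cross(p1: Point, p2: Point, p3: Point, p4: Point) -> bool:
    """True when segments p1p2 and p3p4 cross at a single point interior to both."""
    d1 = _orient(p3, p4, p1)
    d2 = _orient(p3, p4, p2)
    d3 = _orient(p1, p2, p3)
    d4 = _orient(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0


def ring_self_crosses(ring: List[Point]) -> bool:
    """Check every pair of non-adjacent edges of a closed ring (first point == last point)."""
    edges = list(zip(ring, ring[1:]))
    n = len(edges)
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edges are adjacent through the closing vertex
            if _properly_cross(*edges[i], *edges[j]):
                return True
    return False


# Hypothetical rings: a bow-tie and a plain square.
bowtie = [(0, 0), (2, 2), (2, 0), (0, 2), (0, 0)]
square = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
print(ring_self_crosses(bowtie), ring_self_crosses(square))
```

The pairwise loop is quadratic in the number of vertices, which is fine for a demonstration but not for a layer the size of ROADBED.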

Your comment about supporting a broad base of users is a good one. I'll discuss this with my colleagues and report back.

mattyschell commented 5 months ago

The REST services seem OK in QGIS.

Roadbed https://services6.arcgis.com/yG5s3afENB5iO9fj/arcgis/rest/services/Roadbed_2022/FeatureServer/20 image

Shoreline https://services6.arcgis.com/yG5s3afENB5iO9fj/arcgis/rest/services/Shoreline_2022/FeatureServer image
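For scripted access to the same layer, a query URL can be assembled as in the sketch below. The layer URL is the Roadbed one quoted above; `where`, `outFields`, `resultRecordCount`, and `f` are standard ArcGIS REST query parameters, but whether this particular service accepts `f=geojson` and record-count limits is an assumption that has not been checked here. The helper name is hypothetical, and no request is sent.

```python
from urllib.parse import urlencode

# Roadbed layer URL quoted in the comment above.
ROADBED_LAYER = (
    "https://services6.arcgis.com/yG5s3afENB5iO9fj/arcgis/rest/services/"
    "Roadbed_2022/FeatureServer/20"
)


def build_query_url(layer_url: str, where: str = "1=1", max_records: int = 10) -> str:
    """Build a feature-layer query URL (hypothetical helper; sends nothing)."""
    params = {
        "where": where,                    # attribute filter; 1=1 selects everything
        "outFields": "*",                  # return all attribute fields
        "resultRecordCount": max_records,  # assumed to be supported by the service
        "f": "geojson",                    # assumed to be supported by the service
    }
    return f"{layer_url}/query?{urlencode(params)}"


url = build_query_url(ROADBED_LAYER)
print(url)
```

Pasting the printed URL into a browser is a quick way to see whether the service returns features in the requested format.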

mtreg commented 5 months ago

@mattyschell - thanks for the quick response, looking into these issues, and checking in with your team!

In case it helps for some context re: supporting a broad user base: some of my work includes helping build capacity among partners for working with spatial data in various ways (mostly for considering environmental data) - and orgs may not have access to Esri software. Thus, while my own work is typically done in open source software, largely for open science and reproducibility reasons, the training/capacity building I do is in QGIS to help folks use the available data. Of course, data issues do come up from time to time, but if data can be made to "just work" in the various types of software (e.g., Esri and OGC-standard based), it can reduce the barriers for folks working with the data, especially those who are newer to GIS and data analysis more broadly - and this can certainly help the great data that your team and others publish get use from various perspectives.

It's awesome that the REST services seem to work fine in QGIS - thanks for pointing that out! That certainly helps address things from a visualization perspective and for some analysis. Still, having the data work as expected when downloaded would also be invaluable, reducing the steps needed for folks to get up and running with it.

Happy to connect further on this if helpful; chatting here is fine, and you can also email me at michael.treglia@tnc.org.

Thanks again!

mattyschell commented 4 months ago

Thanks for the detailed context @mtreg. We always appreciate hearing about real users and use cases that can guide this work.

We discussed this internally and think that a zipped archive of shapefiles is our best option for supporting QGIS users. NYC Open Data will publish the 2022 planimetrics layers as shapefiles. They have not yet published.

I took a look at SHORELINE.shp and ROADBED.shp in QGIS and both render at all zooms without issue. Interestingly SHORELINE.shp is fully valid using both OGC and ESRI methods. However ROADBED.shp is invalid using OGC and also invalid using ESRI methods!

I don't know enough about QGIS to explain this. Perhaps all bets are off for shapefiles and QGIS does some processing gymnastics to ensure shapefile geometries "just work." Maybe you know?

Let us know what you think about the plan to direct QGIS users to shapefiles on NYC Open Data. I'll file an issue to update the documentation so I don't forget about it.

mtreg commented 4 months ago

Thanks so much for this @mattyschell and sorry for my slow response. I think my ideal would be to have something that follows OGC standards as the single source, as that should work fine in Esri products as well. I think some of the issue that comes up with gdb files is that they sometimes include multiple geometry types in a single layer (e.g., multicurve with multiline), but that might not be the case with shapefiles(?)...
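A quick way to check whether a layer mixes geometry types is to tally the type of each feature. The sketch below assumes the features have already been read into GeoJSON-like dictionaries; the sample features and the function name are hypothetical. Note that GeoJSON has no curve types, so a layer exported to GeoJSON would not show multicurve geometries as such.

```python
from collections import Counter
from typing import Any, Iterable, Mapping


def geometry_type_counts(features: Iterable[Mapping[str, Any]]) -> Counter:
    """Count features per geometry type; features with no geometry count as 'None'."""
    return Counter(
        (feature.get("geometry") or {}).get("type", "None") for feature in features
    )


# Hypothetical sample standing in for a layer read from disk.
sample = [
    {"type": "Feature", "properties": {},
     "geometry": {"type": "MultiLineString", "coordinates": [[[0, 0], [1, 1]]]}},
    {"type": "Feature", "properties": {},
     "geometry": {"type": "LineString", "coordinates": [[0, 0], [1, 0]]}},
    {"type": "Feature", "properties": {}, "geometry": None},
]

counts = geometry_type_counts(sample)
is_mixed = len(counts) > 1
print(dict(counts), is_mixed)
```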

Ultimately, there might be multiple issues of non-conformance to OGC standards, and they all might cause issues for users looking to do data processing, but some might not be an issue for visualization.

I wonder if there's an Esri tool to transform data to use OGC standards, or if the team would be comfortable using QGIS and underlying libraries to fix issues in what's available for all, since that QGIS fix I mentioned above seemed to work.

In my workflows, I know there are sometimes things I need to work through in this realm... I do a lot with PostgreSQL/PostGIS, and I often run st_makevalid to get ahead of issues with topology that might come up... but sometimes the multiple geometry types require some additional pre-processing before really using the data in certain tools. It's a lot of overhead to learn for a lot of folks, so simplifying with one dataset that works everywhere would be awesome if possible.
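As a rough illustration of that PostGIS step, the sketch below only builds the SQL text; it does not connect to a database. The table name `roadbed`, the column name `geom`, and the helper are hypothetical, and the statement assumes a MultiPolygon column, which is why the repaired geometry is passed through `ST_CollectionExtract(..., 3)` and `ST_Multi`.

```python
import re

# Plain identifiers only, so names can be interpolated into the statement safely.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def make_valid_sql(table: str, geom_col: str = "geom") -> str:
    """Return an UPDATE that repairs invalid polygon geometries in place (hypothetical helper)."""
    for name in (table, geom_col):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"UPDATE {table} "
        # Keep only polygonal parts (type 3) of the repaired geometry and
        # promote the result to a multi-geometry to match the assumed column type.
        f"SET {geom_col} = ST_Multi(ST_CollectionExtract(ST_MakeValid({geom_col}), 3)) "
        f"WHERE NOT ST_IsValid({geom_col});"
    )


sql = make_valid_sql("roadbed")
print(sql)
```

Running the statement on a copy of the table first makes it easy to compare feature counts and areas before and after the repair.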

For what it's worth, I know these challenges are pretty pervasive... this is just one I caught and there was an easy way to give feedback :-) For example, I just discovered a zoning dataset from DCP's Bytes of the Big Apple that seems to have some topology issues I hadn't noticed before... it's unfortunately something I've largely gotten used to working through, but maybe some broader changes are possible?

Anyhow, hope that helps, and happy to connect directly if helpful. And apologies for the slow response!