Open antoniocarlon opened 6 years ago
In the boundaries document, @andrewbt pointed out some reasons that he could expand on here, giving links, use cases and so on, so that we have a complete picture of the cons.
Repasting my feature doc comment:
[Users] may want non-shoreline-clipped if they are comparing to other Census data from 3rd party sources and doing analytic or comparison work there and not just beautiful cartography. Census tracts and boundaries do have legal/[official] definitions somewhere, and I assume those definitions do not include the shoreline clipping...what if someone lives in a houseboat in the [NYC] bay and that is their census reported address? I like the idea of a checkbox though - it could even default to clicked.
Want to stress that I'm making some assumptions there, and I also don't understand the full scope of this debate. For my clarification, which (or both) of these is true: 1) You're proposing to make the DO analysis (i.e., the "user interface") only have shoreline-clipped boundaries available? 2) You're proposing to remove non-shoreline-clipped boundaries from the DO entirely so they would not even be accessible via SQL query?
Number 1 sounds more reasonable to me and could make sense if it helps close some dependency issues ("putas dependencias" 😂 ). Even so, if it's fixable I still imagine some users wanting to use the UI to select non-shoreline-clipped options, particularly for comparisons against 3rd-party non-DO-origin datasets. Number 2 though, I would be even more skeptical about (for the same reason/use case).
@CartoDB/research @michellemho ^^
and in the spirit of open source maybe @talos 😄
Other than the rare edge case with houseboats, I can't think of a good reason why anyone would want to use non-shoreline-clipped boundaries for cartographic visualization or geographic analysis purposes.
That being said, I agree with Andrew's "1" option, that we should keep non-shoreline-clipped boundaries accessible via a SQL query because they are the official US Census TIGER shapes.
For good measure, I also emailed the U.S. Census Bureau to ask why they include water areas in TIGER boundaries. Maybe they can give us more insights!
I'd also prefer to leave the non-shoreline-clipped available via SQL as long as there isn't some major issue to overcome by keeping them.
So glad to see this alive and kicking! Great work y'all. :)
I recall two reasons for having non-shoreline clipped boundaries:
Oh, and RE why the census includes water areas, my understanding:
I forgot you changed your GH handle, CC @andy-esch above.
@talos 👋 hey! Thanks for the reasoning! I emailed the US Census Bureau about the inclusion of water areas, but all I got was a link to cartography-ready boundaries and some confusion.
one example is counties in the Great Lakes
Yeah, having the non-shoreline-clipped boundaries is important for determining some adjacency matrices that can be used in certain models. It's important to keep them in the DO for that and for aggregation use cases. That being said, any cartography should be done using the shoreline-clipped versions.
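To make the adjacency point concrete, here is a minimal sketch in plain Python. It is not DO code: the two "counties" and their coordinates are hypothetical toy data, and the `edges` and `adjacency` helpers are made up for illustration. It assumes a simple rook-style rule (two polygons are neighbors if they share an edge) and shows how clipping to the shoreline can drop a neighbor relationship that exists in the unclipped shapes.

```python
from itertools import combinations


def edges(ring):
    """Return the undirected edges of a ring of (x, y) vertices.

    The ring is given without repeating the first vertex at the end.
    """
    n = len(ring)
    return {frozenset((ring[i], ring[(i + 1) % n])) for i in range(n)}


def adjacency(polygons):
    """Build a rook-style adjacency matrix: 1 if two polygons share an edge."""
    names = sorted(polygons)
    matrix = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        if edges(polygons[a]) & edges(polygons[b]):
            matrix[a][b] = matrix[b][a] = 1
    return matrix


# Hypothetical toy data: two counties on opposite shores of a lake.
# Unclipped: the legal boundaries extend over the water and meet at x = 2.
unclipped = {
    "west": [(0, 0), (2, 0), (2, 1), (0, 1)],
    "east": [(2, 0), (4, 0), (4, 1), (2, 1)],
}
# Clipped: the lake (x between 1 and 3) has been cut away from both shapes.
clipped = {
    "west": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "east": [(3, 0), (4, 0), (4, 1), (3, 1)],
}

print(adjacency(unclipped)["west"]["east"])  # neighbors in the unclipped shapes
print(adjacency(clipped)["west"]["east"])    # no shared edge once clipped
```

Real boundaries would need a proper geometry library and tolerance handling; the sketch only illustrates why a model built on neighbor relationships can give different results depending on which version of the geometries it uses.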
Based on the discussion, I'm hearing it would be OK to remove the non-clipped geometries as an option from the Data Observatory analyses and Builder user interfaces, but do NOT remove them from the DO entirely and make sure they are still accessible via the SQL interfaces (for stuart's models and so on).
Does that help with the circular dependency issue @antoniocarlon @ethervoid ? Or is it just as difficult to make both versions available everywhere?
The main goal of this discussion is not to remove the non-clipped geometries in order to solve the circular references problem. We think that we have already found a solution for that problem.
The main goal is to understand why we need the non-clipped geometries from a conceptual point of view so we all know what they are, what they mean and what they can be used for (in my particular case I didn't understand why they are useful and this debate is helping me to grasp some new and interesting concepts).
That said, it seems that you agree that we need them and there are arguments to keep them.
@makella could you elaborate a bit more on the problem that you are showing? Is it something related to the boundaries? This is an example of the state geometries that we currently have in DO (clipped vs non-clipped geometries in that area):
I definitely support keeping non-clipped geoms. Some people need the real thing for analysis and expect that if we're offering tiger geometries, they can use CARTO as a source for the census-provided ones.
To @antoniocarlon and everyone else interested in this conversation, copied below is the comprehensive response I received from the US Census Bureau Geographic Customer Service Team regarding the purpose and existence of non-clipped geometries (emphasis is mine) :
The TIGER/Line Shapefiles are a product that is generated directly from the MAF/TIGER database. MAF/TIGER is a transaction database that stores the full geographic extent and boundaries of both legal and statistical entities as well as a variety of features (roads, linear and areal water, railroads, etc.) needed to support Census Bureau mapping, geographic area delineation, and data collection activities. In addition, the boundary and other geographic area information in MAF/TIGER supports the Census Bureau's role as the federal agency responsible for maintaining the legal boundaries for all geographic entities in the United States. For this reason, we maintain offshore boundaries out to the 3-mile limit rather than clipping to the shoreline within the MAF/TIGER database itself. While this can produce shapes that look odd to those accustomed to viewing the outlines of geographic areas, particularly states, clipped to shorelines, maintenance and depiction of the legal, offshore limits of entities are critical to some users of our spatial data. There is no need, however, for offshore boundaries from the standpoint of tabulating and disseminating statistical data.
Our Cartographic Products and Services Branch (CPSB) staff can certainly relate to your concerns from a data visualization perspective. We encounter decisions about scale and generalization for a lot of our own map products. You are correct that water areas shift relatively frequently, and the desired precision of a water boundary varies greatly depending on zoom level and map purpose. CPSB uses a generalized version of a coastline feature to clip boundaries in our Cartographic Boundary File products which are designed for small scale mapping.
Thank you @michellemho, the purpose of the non-clipped geometries is now clear
Now that we have found some problems with circular dependencies in TIGER (see #403) caused by having both shoreline-clipped and non-clipped geometries, and although they are almost solved, I would like to reopen the debate about whether we need to offer both geometries for DO analysis or only the shoreline-clipped ones.
These are the facts that we need to take into account:
We could add our thoughts and comments to this issue and share any relevant link and piece of documentation here.
cc @hannahblue @ethervoid