CartoDB / observatory-extension

BSD 3-Clause "New" or "Revised" License
6 stars 4 forks source link

Return only clipped/shoreline geometries #293

Closed ethervoid closed 7 years ago

ethervoid commented 7 years ago

Related #289, CartoDB/bigmetadata#175

IMPORTANT we should not break any existing analysis with this change

Right now we're returning both regular TIGER geometries and modified TIGER geometries to follow the shorelines and remove the water part of the geometries.

We should be using only the clipped version of the geometries whether is available. Right now we have it for US (with TIGER) and Whosonfirst.

We have an issue to modify, if possible, the data to include only the clipped version of the geometries when we have it but right now we want to filter in the extension to let the UI use it until we correct the data.

So we want to return the *_clipped, if possible, geometries when someone calls the OBS_GetAvailableGeometries function.

As a side note the clipped geometries are tagged as you can read in the provided link

javitonino commented 7 years ago

I was thinking, given that this function is public. I don't think we should unconditionally remove the geometries unconditionally, since this API is being used by other consumers (e.g: obs explorer) and potentially customers.

IMHO: We should either

P.S: I just realised it's harder than this because there is no direct association between one geometry and the clipped version in the denormalized metadata. It's only in the original tables: select * from observatory.obs_column_to_column where source_id like 'us.%' and reltype='cartography';. And doing the filter by common name prefix sounds wrong.

ethervoid commented 7 years ago

Good point, I agree with you on some points but from my point of view:

But I agree with you that we should try to fix this part in the UI. I didn't want to give more work to the Front guys but now we're making PR to fix this so I'm going to try to improve the query and fix this in the UI.

javitonino commented 7 years ago

Yeah, I was devil advocating a bit. The part that feels really wrong to me is this:

We are fixing a "bug" we have in that function, we shouldn't be returning non-clipped geometries from the beginning

If that is true, why do we even have non-clipped geometries in the database? I mean, I'm guessing there are some usages where that is useful if we have them, so I'd like to have an API to be able to see them.

Otherwise, yes, non-clipped geometries are probably not too useful, so maybe we can hide them altogether (and just keep the non-clipped data for legacy analyses).

ethervoid commented 7 years ago

If that is true, why do we even have non-clipped geometries in the database? I mean, I'm guessing there are some usages where that is useful if we have them, so I'd like to have an API to be able to see them.

Yes, probably this is legacy data until we create the clipped data from it and wasn't removed, from the metadata, due to the defensive reasons you're employing here. That's something we don't know and that is the worst part of this but changing in the UI it's a bit awkward for the user because in one side they see the filtered output but if they try to use it as a SQL function, they're going to get a completely different outcome.

I'm a bit concerned that we keep saving unused data but I agree with you that if we don't know what are the uses of this data we should keep it for now. I've added an issue to deal with the "problem".

ethervoid commented 7 years ago

Fixed in the UI. Closing this issue by now