PermafrostDiscoveryGateway / viz-staging

PDG Visualization staging pipeline
Apache License 2.0

Convert input data CRS to config's `input_crs` only if there is no CRS in input data #26

Open julietcohen opened 11 months ago

julietcohen commented 11 months ago

In the config, there is an option to set input_crs, which was intended for input data that lacks CRS information, as was the case with some early ice wedge polygon data. However, the way set_crs() is currently called in TileStager.py, the CRS of the input data is set to input_crs whenever the config value is not None, even if the data already has a CRS. To be clear, this operation does not transform the data; it only changes the CRS metadata. See the geopandas documentation for set_crs().

We need the CRS of the data to be set to input_crs only when input_crs is not None and the input data does not already have CRS info.
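The condition described above could be sketched roughly as follows. This is only an illustration, not the code in TileStager.py: the helper name `apply_input_crs` is hypothetical, and it assumes `gdf` behaves like a geopandas GeoDataFrame (a `.crs` attribute and a `.set_crs()` method that returns a new object).

```python
def apply_input_crs(gdf, input_crs):
    """Label gdf with input_crs only when gdf has no CRS of its own.

    Hypothetical helper for illustration. `gdf` is assumed to behave
    like a geopandas GeoDataFrame; `input_crs` is the config value.
    """
    if input_crs is not None and gdf.crs is None:
        # set_crs() only attaches CRS metadata; coordinates are untouched.
        return gdf.set_crs(input_crs)
    # Either no input_crs was configured or the data already has a CRS,
    # so the data is returned as-is.
    return gdf
```

With this shape, data that already carries a CRS passes through unchanged regardless of what `input_crs` is set to in the config.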

robyngit commented 11 months ago

I was thinking about this @julietcohen... Do you think there will ever be a scenario where we would want to correct an existing CRS in a dataset? set_crs would already work for that case, but maybe we would want an option like replace_crs (True/False) to indicate whether the given CRS should replace an existing one or should be set only for files missing that info? Or maybe that would be option overload!
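One way the proposed replace_crs option might look is sketched below. The function name `resolve_crs` and the exact semantics of `replace_crs` are assumptions made for illustration; no such option is confirmed to exist in the config. The sketch assumes `gdf` behaves like a geopandas GeoDataFrame, whose `set_crs()` accepts an `allow_override` keyword.

```python
def resolve_crs(gdf, input_crs, replace_crs=False):
    """Decide whether to label gdf with input_crs.

    Hypothetical sketch: `replace_crs` mirrors the option proposed in
    this thread and is not an existing config setting.
    """
    if input_crs is None:
        # Nothing configured, so leave the data alone.
        return gdf
    if gdf.crs is None:
        # Data has no CRS info: attach the configured one.
        return gdf.set_crs(input_crs)
    if replace_crs:
        # Data has a CRS that is believed to be wrong: overwrite the
        # metadata. This relabels only; it does not reproject.
        return gdf.set_crs(input_crs, allow_override=True)
    # Data already has a CRS and replacement was not requested.
    return gdf
```

Note that on a real GeoDataFrame, calling `set_crs()` with a CRS that differs from an existing one raises a ValueError unless `allow_override=True` is passed, which is why the replacement branch sets it explicitly.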

julietcohen commented 9 months ago

@robyngit That's a good idea; it does seem likely that one day we will receive data with incorrect CRS information. I have not encountered this before, but maybe instead of including another option like replace_crs, we could check the actual CRS of the geometry using geodataframe.geometry.crs and compare it to what is returned by geodataframe.crs. Then, if they are not the same, we use set_crs() to correct the geodataframe CRS info.

julietcohen commented 9 months ago

Seems like the outputs of both geodataframe.geometry.crs and geodataframe.crs are changed by set_crs(), even though set_crs() doesn't transform the data, ~so we would need another way to check the actual CRS of the geometries and not just the metadata~ and there's no way to check the CRS of geometries besides the metadata.