Open julietcohen opened 1 year ago
I was thinking about this @julietcohen... Do you think there will ever be a scenario where we would want to correct an existing CRS in a dataset? `set_crs` would already work for that case, but maybe we would want an option like `replace_crs` (True/False) to indicate when the given CRS needs to be corrected and when it needs to be set only for files missing that info? Or maybe that would be option overload!
@robyngit That's a good idea, it does seem likely that one day we will receive data with incorrect CRS information. I have not encountered this before, but maybe instead of including another option like `replace_crs`, we could include a check for the actual CRS of the geometry using `geodataframe.geometry.crs` and compare it to what is returned by `geodataframe.crs`. Then if they are not the same, we use `set_crs()` to correct the geodataframe CRS info.
Seems like the output of both `geodataframe.geometry.crs` and `geodataframe.crs` is changed by `set_crs()` even though `set_crs()` doesn't transform the data, ~so we would need another way to check the actual CRS of the geometries and not just the metadata~ and there's no way to check the CRS of geometries besides the metadata.
To clarify: in the current code, here are the 4 possible scenarios and their outcomes:

| Does input data already have a CRS set? | Does the `input_crs` config option have a value besides None? | Result |
|---|---|---|
| yes | yes | data is set to the CRS defined by the `input_crs` option in the config, then transformed to the CRS of the TMS |
| yes | no | data is transformed to the CRS of the TMS |
| no | yes | data is set to the CRS defined by the `input_crs` option in the config, then transformed to the CRS of the TMS |
| no | no | data is transformed to the CRS of the TMS |
So considering scenarios 1 and 3, if `input_crs` has a value, then the input data is set to that CRS regardless of whether the input data already has a CRS. Perhaps this was the intended purpose of this option, in order to correct an incorrectly set CRS. If not, then the code should be adjusted.
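For reference, the four scenarios can be enumerated in a few lines of Python. This is only a model of the behavior described above, not the actual `TileStager.py` code; the function name `current_steps`, the step labels, and the example CRS string are made up for illustration.

```python
from itertools import product


def current_steps(data_has_crs, input_crs):
    """Return the CRS operations applied to one input file, in order.

    Hypothetical model of the current behavior: data_has_crs is
    accepted but deliberately ignored, which is the point at issue.
    """
    steps = []
    # The only condition checked is whether input_crs has a value.
    if input_crs is not None:
        steps.append("set_crs")
    # The data is always reprojected to the CRS of the TMS afterwards.
    steps.append("to_crs")
    return steps


# Walk through all four combinations from the table.
for has_crs, input_crs in product([True, False], ["EPSG:4326", None]):
    print(has_crs, input_crs, current_steps(has_crs, input_crs))
```

Rows 1 and 3 produce the same steps, which shows that an existing CRS never influences the outcome.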
In the config, there is an option to set `input_crs` (see here) which was intended to be used when the input data lacks CRS information, which was the case with some early ice wedge polygon data. However, the way `set_crs()` is currently configured here in `TileStager.py`, the CRS of input data is set to the `input_crs` whenever its value in the config is not None. To be clear, the way the operation is set at the moment, the data is not transformed. See the documentation for geopandas `set_crs()` here.

We need the data to only be set to the value of `input_crs` when it is not None and the input data does not already have CRS info.
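A minimal sketch of what that condition could look like, assuming the object passed in behaves like a geopandas GeoDataFrame (it exposes `.crs` and `.set_crs()`). The function name `apply_input_crs` is hypothetical, and the `replace_crs` flag is the optional override suggested in the comments, not an existing config option.

```python
def apply_input_crs(gdf, input_crs=None, replace_crs=False):
    """Label gdf with input_crs when appropriate; never reprojects.

    gdf: any object with a .crs attribute and a .set_crs() method,
         such as a geopandas GeoDataFrame.
    input_crs: value of the input_crs config option, or None.
    replace_crs: hypothetical flag; if True, overwrite an existing CRS.
    """
    # No input_crs configured: leave the data untouched.
    if input_crs is None:
        return gdf
    # Data lacks CRS info: this is the case input_crs was meant for.
    if gdf.crs is None:
        return gdf.set_crs(input_crs)
    # Data already has a CRS: only overwrite when explicitly asked to.
    if replace_crs:
        # geopandas needs allow_override=True to relabel a different CRS.
        return gdf.set_crs(input_crs, allow_override=True)
    return gdf
```

With `replace_crs` left at its default, an existing CRS is preserved and the data then only goes through the usual transformation to the CRS of the TMS.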