pytroll / pyresample

Geospatial image resampling in Python
http://pyresample.readthedocs.org
GNU Lesser General Public License v3.0
344 stars 95 forks source link

Handle value-less parameters in `proj4_dict_to_str` #522

Closed ethantkoenig closed 1 year ago

ethantkoenig commented 1 year ago

Fix bug in AreaDefinition.proj_str where value-less parameters are given a value of None:

>>> ad = pyresample.AreaDefinition(
...     area_id='area_id',
...     description='description',
...     proj_id='tasmania_utm',
...     projection='+ellps=WGS84 +no_defs +proj=utm +south +towgs84=0,0,0 +type=crs +units=m +zone=50',
...     area_extent=(701609.0, 5127659.0, 901609.0, 5327659.0),
...     height=10,
...     width=10,
... )
>>> 
>>> ad.proj_str
(path to venv)/lib/python3.11/site-packages/pyproj/crs/crs.py:1286: UserWarning: You will likely lose important projection information when converting to a PROJ string from another format. See: https://proj.org/faq.html#what-is-the-best-format-for-describing-coordinate-reference-systems
  proj = self._crs.to_proj4(version=version)
'+ellps=WGS84 +no_defs +proj=utm +south=None +type=crs +units=m +zone=50'
djhoese commented 1 year ago

Thanks for working on this. In old versions of pyproj we did all the dict to string handling ourselves, but now we use pyproj's CRS object a lot more.

Does this match what the pyproj library does with CRS.to_proj4()? Actually, how do you feel about just using pyproj's CRS here? In future versions of pyresample and with semi-recent updates already, we've been phasing out the direct use of PROJ.4 in pyresample internals. It would be best to depend on pyproj and the pyproj CRS object as much as possible.

Thoughts?

codecov[bot] commented 1 year ago

Codecov Report

Merging #522 (0b2806f) into main (7a434c1) will increase coverage by 0.01%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #522      +/-   ##
==========================================
+ Coverage   94.32%   94.33%   +0.01%     
==========================================
  Files          79       79              
  Lines       12976    12983       +7     
==========================================
+ Hits        12240    12248       +8     
+ Misses        736      735       -1     
Flag Coverage Δ
unittests 94.33% <100.00%> (+0.01%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pyresample/test/test_utils.py 100.00% <100.00%> (ø)
pyresample/utils/proj4.py 100.00% <100.00%> (ø)

... and 1 file with indirect coverage changes

coveralls commented 1 year ago

Coverage Status

coverage: 93.939% (+0.003%) from 93.936% when pulling 0b2806f0528133d9db2260416ff58ef86bd8eb72 on ethantkoenig:main into 7a434c116a256b155c75db2ec99765abdd02035d on pytroll:main.

ethantkoenig commented 1 year ago

Does this match what the pyproj library does with CRS.to_proj4()?

CRS.to_proj4() delegates to proj_as_proj_string. I haven't look at the PROJ implementation of proj_as_proj_string in detail, but I'm reasonably confident it does not append =None :)

Actually, how do you feel about just using pyproj's CRS here?

CRS doesn't seem to support sorted strings, so it can't be used for proj4_dict_to_str calls where sort is true.

djhoese commented 1 year ago

CRS.to_proj4() delegates to proj_as_proj_string. I haven't look at the PROJ implementation of proj_as_proj_string in detail, but I'm reasonably confident it does not append =None :)

Haha. I think you're right, but I think it does do None or "null" when it creates a dict of PROJ.4 parameters. The dict case is I think why pyresample does it.

CRS doesn't seem to support sorted strings, so it can't be used for proj4_dict_to_str calls where sort is true.

Ah good catch. Do you use sort=True in your own work/use case? This used to be (if I remember correctly) to help with hashing/caching and equality checks, but we depend completely on pyproj/PROJ for that now through the CRS objects.

I'd be willing to deprecate the sort option and/or push this functionality upstream if at all possible into pyproj or PROJ. I have a feeling though that those maintainers would prefer not to mess with PROJ.4 strings anymore.

@mraspaud @pnuu what do you think about deprecating the ability to sort PROJ.4 strings alphabetically?

mraspaud commented 1 year ago

I'm guessing the alphabetical sorting what to help comparing proj dicts in python2, so I don't mind removing that possibility

djhoese commented 1 year ago

I think I'd prefer maybe if we right tests that compare the PROJ dict rather than 2 strings that depend on order and could be wrong if pyproj adds additional arguments. What do you two (@ethantkoenig @mraspaud) think? Otherwise, what did we think about removing the sorting functionality and just depending on pyproj to do the conversion?

ethantkoenig commented 1 year ago

I think I'd prefer maybe if we right tests that compare the PROJ dict rather than 2 strings that depend on order and could be wrong if pyproj adds additional arguments. What do you two (@ethantkoenig @mraspaud) think?

Sure, that works for me. Done.

djhoese commented 1 year ago

I've added this to the milestone for the next minor (feature) release of pyresample (I'll do a bug fix release before this release is worked on). I think I'd like this PR to evolve to be as I described above where pyproj is used for the conversion to a string. I think sort functionality should be deprecated and a warning added if it is specified:

warnings.warn("The 'sort' option is no longer supported. Please use the comparison of pyproj 'CRS' objects instead.")

I'd be OK with completely removing the functionality so sort no longer does anything (has no effect) until it is completely removed. I'm 95% sure pyresample internals is the only place where sort used to be used.

@mraspaud @ethantkoenig thoughts?

Edit: I don't see any uses of the function in Satpy or of sort=True. This would be our (the maintainers of pyresample) biggest user of this functionality.

mraspaud commented 1 year ago

just to confirm, I don't remember this arg being used anywhere else.

mraspaud commented 1 year ago

To clarify @djhoese do you want this PR to use pyproj, or can we merge this and make the pyproj version in a following PR?

djhoese commented 1 year ago

To clarify @djhoese do you want this PR to use pyproj, or can we merge this and make the pyproj version in a following PR?

It wasn't clear to me from past comments (or maybe I'm forgetting or missed it) if @ethantkoenig was unable to do something in their own work because of the way value-less parameters are handled. If there is any type of semi-urgent need for this then this PR as-is can be merged. Otherwise, I don't see a reason to not switch to pyproj. With no urgency I see two options:

  1. Remove all sorted behavior and switch completely to pyproj for this conversion. This is the harshest option and will break anyone's code that was dependent on sort. But also I'd want to know who is using it as soon as possible.
  2. Add deprecation warning if sort=True is passed but still use the existing version of the function (with the fixes already suggested by @ethantkoenig in this PR for value-less parameters) but otherwise (sort=False) use pyproj for the conversion.

I suppose there is an inbetween option where sort=True warns but also has no effect.

ethantkoenig commented 1 year ago

It wasn't clear to me from past comments (or maybe I'm forgetting or missed it) if @ethantkoenig was unable to do something in their own work because of the way value-less parameters are handled. If there is any type of semi-urgent need for this then this PR as-is can be merged. Otherwise, I don't see a reason to not switch to pyproj. With no urgency I see two options:

I have a slight preference for submitting the bug fix as-is and pursuing a long-term solution in a separate PR, since there seems to be a not-yet-resolved discussion about exactly how to pursue a long-term solution. But I'm also willing to start the envisioned pyproj refactor in this PR if you feel strongly about marrying the two.

  1. Remove all sorted behavior and switch completely to pyproj for this conversion. This is the harshest option and will break anyone's code that was dependent on sort. But also I'd want to know who is using it as soon as possible.
  2. Add deprecation warning if sort=True is passed but still use the existing version of the function (with the fixes already suggested by @ethantkoenig in this PR for value-less parameters) but otherwise (sort=False) use pyproj for the conversion.

I suppose there is an inbetween option where sort=True warns but also has no effect.

I don't have a strong preference among any of these options (unless auditing whether anyone actually uses sort for option 1 would take time, in which case it'd be nice to have a quicker short-term fix).