astropy / astropy

Astronomy and astrophysics core library
https://www.astropy.org
BSD 3-Clause "New" or "Revised" License
4.37k stars 1.75k forks source link

`Distance` constructor could recognize the physical type of its input #12893

Open eerovaher opened 2 years ago

eerovaher commented 2 years ago

Description

The astropy.coordinates.distances.Distance class can be initialized from a length, parallax, distance modulus or redshift, but only length can be passed in without a corresponding keyword:

>>> from astropy.coordinates import Distance
>>> from astropy import units as u
>>> Distance(5*u.pc)
<Distance 5. pc>
>>> Distance(5*u.mas)
...
UnitTypeError: Distance instances require units equivalent to 'm', so cannot set it to 'mas'.
>>> Distance(parallax=5*u.mas)
<Distance 200. pc>

It could be convenient if the Distance constructor would recognize what its input is from the input's physical type without having to use the parallax, distmod and z keywords. Additionally, the removal of those keywords would make it impossible for the user to pass in multiple different inputs. However, the outlined functionality cannot be (fully) implemented for two reasons.

1) Currently it is possible to pass in a distance modulus without a unit:

>>> Distance(distmod=1*u.mag)
<Distance 15.84893192 pc>
>>> Distance(distmod=1)  # Also works
<Distance 15.84893192 pc>

2) Currently it is possible to pass in the distance value and its unit separately:

>>> Distance(3, unit=u.pc)
<Distance 3. pc>

Because the new API proposed here would always interpret an input without a unit as a redshift, the two cases listed above would have to be deprecated first. Is the proposed functionality worth the API change? If not then a more modest proposal of recognizing angels and magnitudes as parallaxes and distance moduli could be achieved without breaking the current API.

mhvk commented 2 years ago

While redshift is dimensionless, one could decide to explicitly recognize cosmology's redshift unit, and not worry about plain numbers. Indeed, I think a unit-less number is ambiguous: For a Distance, I'd expect the default to be 'm' rather than redshift, but even an interpretation as radians and thus parallax angle is not illogical. So, perhaps here it remains best to "refuse the temptation to guess". Indeed, "explicit is better than implicit" maybe suggests to leave things as they are -- in a way, the keyword arguments are similar to class methods which just allow initialization from a different type of distance measurement, which perhaps is good to request explicitly.

But then, I can also see the convenience, especially if one is writing, say, a function that would like as input something convertible to a Distance. So, I could get behind an infer-from-unit default that would continue not to work for non-quantity input and thus not change the API.

nstarman commented 2 years ago

@eerovaher I like this idea!

I don't think the unit argument has to be removed:

def __init__(self, value, unit=None):
    value = value if unit is None else value << unit
    # now do unit inference and rest of initialization
    ...

This way any input can be unit-less and still work with the proposed physical-type inference.

Or maybe the reverse.

def __init__(self, value, unit=None):
    unit = getattr(value, "unit", unit)
    if unit is None:
        raise ValueError(...)
    # do rest of initialization, using unit for physical-type inference
    ...

this latter isn't quite correct, as value.unit needs to be checked for compatibility with unit

mhvk commented 2 years ago

Second option seems better!

eerovaher commented 2 years ago

It looks like the ability to recognize the physical type of the input in the cases where it can be done unambiguously is considered a good idea as long as it doesn't break backwards compatibility. To be explicit: 1) An angle should be interpreted as parallax. 2) A magnitude should be interpreted as distance modulus. 3) An input with the cosmological redshift unit should be interpreted as redshift. 4) A plain float input together with a length unit passed in as unit should be interpreted like in the current API, not as a redshift.

But a couple of things should still be clarified. 1) Should an input without a unit be interpreted as a redshift if a cosmology is passed to the Distance constructor?

>>> Distance(3, unit=u.kpc)  # Just like in the current API
<Distance 3. kpc>
>>> Distance(3, unit=u.kpc, cosmology=WMAP1)  # Input is redshift because of `cosmology`
<Distance 25846533.95652958 kpc>

2) Should the parallax and distmod keywords be deprecated? Deprecating distmod would make it impossible to pass in a magnitude without an appropriate unit. Deprecating z would not be a good idea because of how common using plain floats to represent redshifts is.

mhvk commented 2 years ago

Hmm, while I guess it is reasonably unambiguous that no unit should be redshift if a cosmology is present, I'm not sure I like it too much. Definitely it feels weird to also have a unit in that case. Rather, one then should either do Distance(z=3, cosmology=WMAP1, unit=u.kpc) or Distance(3, cosmology=WMAP1).to(u.kpc).

I should add that now I look at the present code, I'm no longer sure about either of @nstarman's suggestions. Normally, for Quantity the goal of unit generally is not to convert but just to assign the unit for unit-less input, so naive interpretation of Distance(3, unit=u.Gpc, cosmology=WMAP1) would be a distance of 3 Gpc with a cosmology stored on the class that might be used for later conversion (I know the class doesn't store the cosmology, but it just feels confusing nevertheless). But for Distance, somewhat inconsistently, it seems the goal is to set the final unit. I can see this working for distmod, z, and parallax, but it really feels weird without an input unit.

Overall, I think I'd rather keep the restriction that any value without unit needs an explicit unit of length passed for Distance to succeed.

I don't see a real need to deprecate parallax and distmod - it would surely break code for little benefit.

eerovaher commented 2 years ago

I don't see a real need to deprecate parallax and distmod - it would surely break code for little benefit.

Removing the additional keyword arguments for the input values would make it impossible to pass in more than one input, so the Distance constructor could be simplified. But removing parallax and distmod while keeping z would not allow the code to be simplified very much (if at all), so it might indeed be better to avoid breaking backwards compatibility.

If we are keeping distmod then should the possibility of passing in a distmod without a unit be considered a bug?

mhvk commented 2 years ago

I'd keep the behaviour the same as it presently is for the case of passing in anything via the keyword arguments (i.e., only add the introspection of the unit of value).