scikit-hep / vector

Vector classes and utilities
https://vector.readthedocs.io
BSD 3-Clause "New" or "Revised" License

vector.obj does not recognize "rho" "phi" "eta" "tau" as Momentum4D Object #482

Closed Superharz closed 2 months ago

Superharz commented 2 months ago

Vector Version

1.4.1

Python Version

3.11.9

OS / Environment

Windows 11 64 bit Conda-Forge environment

Describe the bug

Observed behavior:

>>> vector.obj(**{"rho":0, "phi":0, "eta":0, "m":0})
MomentumObject4D(pt=0, phi=0, eta=0, mass=0)

>>> vector.obj(**{"rho":0, "phi":0, "eta":0, "tau":0})
VectorObject4D(rho=0, phi=0, eta=0, tau=0)

The same happens when switching between rho and pt: if pt or m (or both) is used as a keyword, the result is a MomentumObject.

Expected behavior:

Both calls should return a MomentumObject4D.


jpivarski commented 2 months ago

The observed behavior is intended: the vector is a momentum if and only if any of the component names are momentum synonyms, such as "pt" instead of "rho" or "mass" instead of "tau". "phi" and "eta" are both angles, purely geometric.

(One could argue that "eta" should be momentum-like because it's pseudorapidity, a property that's only associated with particle momenta. But pseudorapidity can be defined for any 3-vector (unlike rapidity), and in Vector's system, all of the momentum names are synonyms of some pure geometry name, and there's no other word for "pseudorapidity.")

jpivarski commented 2 months ago

I think this is not a bug, and I'll be closing this issue. (Issues can be reopened.)

Superharz commented 2 months ago

Thank you for your explanation. I misread the description to mean that m and tau are "real" synonyms that could be used interchangeably without making any difference.

However, I then don't understand why the branches of an Awkward Momentum4D array are named rho, phi, eta, tau, since this naming scheme would normally create a Vector4D array, which causes problems, as observed in https://github.com/scikit-hep/vector/issues/283#issuecomment-2188821272.

jpivarski commented 2 months ago

In hindsight, we probably shouldn't have created a distinction between "momentum vectors" and "geometric vectors", because people haven't understood it without explanation. We wanted the distinction because we couldn't exclude names like pt, and it felt wrong to talk about the $p_T$ of a displacement vector.

I'm a little unclear about the semantics of vector.awk and therefore generally avoid it. When I present Vector in tutorials, I do

import vector
vector.register_awkward()

and then use ak.with_name or the with_name argument of ak.Array, ak.zip, etc. to ensure that the record name is "Momentum4D" (or "Vector4D", "Momentum3D", etc.). None of Vector's constructors are involved, just Awkward Arrays, and after vector.register_awkward(), any Awkward records with the appropriate names and fields are interpreted as exactly the type of vector you want.

This works through option-types (because with_name knows that names apply to record nodes, not option nodes).

>>> import awkward as ak
>>> import vector
>>> vector.register_awkward()
>>> a = ak.Array([{"rho": 0, "eta": 0, "phi": 0, "tau": 1}, None], with_name="Momentum4D")
>>> a.show(type=True)
type: 2 * ?Momentum4D[
    rho: int64,
    eta: int64,
    phi: int64,
    tau: int64
]
[{rho: 0, eta: 0, phi: 0, tau: 1},
 None]
>>> a.rapidity
<Array [0, None] type='2 * ?float64'>