SWIFTSIM / swiftsimio

Python library for reading SWIFT data. Uses unyt and h5py.
GNU Lesser General Public License v3.0
15 stars 12 forks source link

User input using custom yt units can trigger silent incorrect unit conversions (e.g. in spatial masking) #173

Open kyleaoman opened 8 months ago

kyleaoman commented 8 months ago

I've run into a conflict with custom yt units that are aware of comoving/physical conversions. This came up in the context of caesar, here's a minimal example (you need a caesar output at z>0 to conceptually reproduce this, the last line assumes z=1).

import unyt as u
from swiftsimio import *
from yt.units.yt_array import YTArray, UnitRegistry

caesar_file = 's12.5n128_0012.hdf5'  # just to get an example unit registry, note that this is a z=1.0 output
with h5py.File(caesar_file, 'r') as f:
    registry = UnitRegistry.from_json(f.attrs["unit_registry_json"])

cosmo_array([0.8], u.Mpc, comoving=True, cosmo_factor=cosmo_factor(a**1, scale_factor=0.5)) > YTArray([1], 'Mpccm', registry)  # True (!!), but issues a warning about missing cosmo_factors

This is acceptable because there's a warning (I added that in at some point when working on the cosmo_array source code). However things can get worse, for instance if I try to define a mask region from something in this system of units, I get nonsense because internally there's a comparison to a unyt_quantity or unyt_array rather than a cosmo_array, which triggers a conversion of the YTAarray into physical units.

import unyt as u
from swiftsimio import *
from yt.units.yt_array import YTArray, UnitRegistry

s = 'simba_s12.5n128_0012.hdf5'
caesar_file = 's12.5n128_0012.hdf5'
m = mask(s)
b = m.metadata.boxsize[0].to_value(u.Mpc)  # raw float
with h5py.File(caesar_file, 'r') as f:
    registry = UnitRegistry.from_json(f.attrs["unit_registry_json"])

# what we expect:
region = u.unyt_array([[0, 0.49*b], [0, 0.49*b], [0, 0.49*b]], u.Mpc)
m.constrain_spatial(region)
print(m.cell_mask.sum(), m.cell_mask.size)  # 512 4096, i.e. an octant

# what we get:
yt_region = YTArray([[0, 0.49*b], [0, 0.49*b], [0, 0.49*b]], 'Mpccm', registry)
m.constrain_spatial(yt_region)
print(m.cell_mask.sum(), m.cell_mask.size)  # 64 4096, i.e. shrunk by a factor of (1+z)**3

I'm pretty sure that this is pretty much a symptom of #128, so we should really fix that at some point - might be enough to close this to since then at least people would get a warning. In the meantime thought I'd put this up as a signpost for anyone running into similar issues.