Closed by neutrinoceros, 1 week ago
I'm puzzled. I don't understand how one of the tests can still be failing after bumping.
Is it possible there's a NaN? I'm not sure how NaNs would impact the typical equality/hash comparisons.
Good thinking! Indeed, hashing an array isn't a stable operation if it contains a NaN:
In [8]: a = np.arange(10, dtype="float")
In [9]: a
Out[9]: array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
In [10]: a[0] = np.nan
In [11]: hash(tuple(a))
Out[11]: -5529612596851953840
In [12]: hash(tuple(a))
Out[12]: -5371981722103082231
In [13]: hash(tuple(a))
Out[13]: -7440010721638526525
In [14]: hash(tuple(a))
Out[14]: -4675353377044086749
In [15]: hash(tuple(a))
Out[15]: 1634418349654970603
In [16]: hash(tuple(a))
Out[16]: 5494725379304409846
In [17]: a = np.arange(10, dtype="float")
In [18]: hash(tuple(a))
Out[18]: -2040549277248155741
In [19]: hash(tuple(a))
Out[19]: -2040549277248155741
In [20]: hash(tuple(a))
Out[20]: -2040549277248155741
In [21]: hash(tuple(a))
Out[21]: -2040549277248155741
In [22]: hash(tuple(a))
Out[22]: -2040549277248155741
In [23]: hash(tuple(a))
Out[23]: -2040549277248155741
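One likely mechanism (a CPython detail, not specific to yt): since Python 3.10, hash() of a NaN falls back to the object's identity, and every call to tuple(a) above builds fresh NumPy scalar objects, so each hash(tuple(a)) sees brand-new NaN objects. A minimal sketch:

```python
import math
import sys

x = math.nan
# hashing the *same* NaN object is reproducible
assert hash(x) == hash(x)

# since Python 3.10, distinct NaN objects hash by object identity,
# so two NaNs that are alive at the same time hash differently
if sys.version_info >= (3, 10):
    y, z = float("nan"), float("nan")
    assert hash(y) != hash(z)
```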
(I should point out that this little experiment was conducted with numpy 1.26.) So maybe what's happening is that we have some NaN in an output field that wasn't present when running with numpy 1.x. Could be a bug on our side or in NumPy (or both?). @chrishavlin, any chance you still have whatever scripts you used to decipher this situation a couple days ago?
I really need to learn more about NaNs. I had no idea there was, officially, a 'payload': https://en.wikipedia.org/wiki/NaN
Oh interesting, it's not even the same tests that are failing -- still enzo, but the velocity_divergence field rather than velocity_magnitude. Likely a very similar issue though, I'll take a closer look this morning.
Awesome, thank you very much. Feel free to push to this branch if you need to!
Well, I'm thoroughly confused. Manually comparing values didn't show any difference in grid values for ('gas', 'velocity_divergence') between np 2 and np 1.26.4. Then I spent a while getting nose working with py 3.10 to match what Jenkins is doing... and I reproduced the error with just np2:
in an environment with np2, store the answer tests for enzo:
nosetests --answer-store --local --with-answer-testing --local-dir /Users/chavlin/data/yt_answer_store/enzo_test yt/frontends/enzo/tests/test_outputs.py --answer-name=local-ezno yt.frontends.enzo --answer-big-data --verbose
and then immediately run a comparison in the same np2 environment
nosetests --local --with-answer-testing --local-dir /Users/chavlin/data/yt_answer_store/enzo_test yt/frontends/enzo/tests/test_outputs.py --answer-name=local-ezno yt.frontends.enzo --answer-big-data --verbose
and that test fails for me:
======================================================================
FAIL: GridValues_galaxy0030_all_('gas', 'velocity_divergence')
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/chavlin/.pyenv/versions/3.10.12/envs/yt_np2_delete/lib/python3.10/site-packages/nose/case.py", line 198, in runTest
self.test(*self.arg)
File "/Users/chavlin/src/yt_/yt_dev/yt/yt/utilities/answer_testing/framework.py", line 418, in __call__
self.compare(nv, ov)
File "/Users/chavlin/src/yt_/yt_dev/yt/yt/utilities/answer_testing/framework.py", line 700, in compare
assert_equal(new_result[k], old_result[k])
File "/Users/chavlin/.pyenv/versions/3.10.12/envs/yt_np2_delete/lib/python3.10/site-packages/numpy/testing/_private/utils.py", line 414, in assert_equal
raise AssertionError(msg)
AssertionError:
Items are not equal:
ACTUAL: 'ec255050b61b671af5425c0e4ae7a17d'
DESIRED: 'f16afe8932b49e9b2acadf304ae3a168'
But this is only happening within nose -- I made a little manual answer test and couldn't reproduce the failure (script is here). Not even sure where to look now...
EDIT: on my desktop (Ubuntu), I don't have the above issue -- storing the enzo tests then immediately re-running passes as expected.
Interesting that you get
ACTUAL: 'ec255050b61b671af5425c0e4ae7a17d'
DESIRED: 'f16afe8932b49e9b2acadf304ae3a168'
while the latest attempt on Jenkins has
ACTUAL: 'f16afe8932b49e9b2acadf304ae3a168'
DESIRED: 'ec255050b61b671af5425c0e4ae7a17d'
So the hash isn't stable but it's not completely random either.
On my desktop (Ubuntu), I don't have the above issue -- storing the enzo tests then immediately re-running passes as expected.
Just curious, how many attempts did you make? Was it one (un)lucky try?
Maybe there's an unstable ordering issue -- is it possible argsort could be involved?
Wait, sorting can be unstable??? In what sense?
So I'm speculating. :) What I'm wondering is if argsort could potentially have an instability. I suspect that I'm way off in the wild on this, though. My concern was what if:
indices = np.argsort(some_array_with_duplicates)
and then sorting by those indices could potentially result in different orderings. But I'm kind of grasping at straws here and I don't think it's the source of the error.
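For what it's worth, the concern can be made concrete (a sketch, not tied to yt's code): with the default kind (an unstable introsort), NumPy makes no promise about the relative order of equal elements, so two argsort calls are in principle allowed to return different index arrays for an array with duplicates, while kind="stable" pins the order down:

```python
import numpy as np

a = np.array([1, 0, 1, 0, 1])

# the sorted *values* are always the same...
assert np.array_equal(np.sort(a), [0, 0, 1, 1, 1])

# ...but with the default (unstable) sort, the indices of equal
# elements may come back in any relative order.  kind="stable"
# guarantees equal elements keep their original order:
idx = np.argsort(a, kind="stable")
assert list(idx) == [1, 3, 0, 2, 4]
```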
I will note that we've struggled with divergence in the past -- it has been the source of differences (but reliable, not stochastic) as a result of order-of-operations, subtraction of tiny numbers from single precision, etc.
Oh I missed that you were talking about *arg*sort, which indeed may be unstable, if the fact that it accepts an optional keyword argument kind="stable" is any indication.
Oh, and stable is new in 2.0.0!
Ah, but while I thought we used argsort in testing, turns out we kinda don't.
$ grep -lr argsort --include="*.py"
yt/visualization/volume_rendering/create_spline.py
yt/visualization/plot_modifications.py
yt/fields/tests/test_fields.py
yt/frontends/tipsy/io.py
yt/frontends/stream/data_structures.py
yt/frontends/stream/definitions.py
yt/frontends/art/io.py
yt/frontends/gadget/io.py
yt/frontends/flash/data_structures.py
yt/data_objects/selection_objects/ray.py
yt/data_objects/level_sets/tests/test_clump_finding.py
yt/data_objects/level_sets/clump_tools.py
yt/data_objects/particle_trajectories.py
yt/geometry/coordinates/cartesian_coordinates.py
yt/utilities/particle_generator.py
yt/utilities/lib/tests/test_geometry_utils.py
yt/utilities/lib/cykdtree/tests/test_utils.py
yt/utilities/flagging_methods.py
oh, and stable is new in 2.0.0!
Is it? The docs say:
Changed in version 1.15.0: The "stable" option was added.
Right -- kind="stable" was in 1.15.0, but stable: bool was added in 2.0.0. (The two are not unrelated though :) )
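For reference, the two spellings look like this (the stable=True form only exists on NumPy >= 2.0, hence commented out here):

```python
import numpy as np

a = np.array([3, 1, 2, 1])

# accepted since NumPy 1.15:
idx = np.argsort(a, kind="stable")
print(idx)  # [1 3 2 0]

# new boolean parameter in NumPy 2.0 (raises TypeError on older releases):
# idx = np.argsort(a, stable=True)
```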
Ooooow. To think I actually worked on this a couple months back and apparently obliterated it from my memory...
Well, I don't have an answer but have found some suspicious behavior. I'm going to summarize it all here.
Working from a fresh py3.10 environment:
pip install -e .[enzo,test]
For running nose with python 3.10, you need to edit nose a bit. The jenkins build does the following:
find .venv/lib/python3.10/site-packages/nose -name '*.py' -exec sed -i -e s/collections.Callable/collections.abc.Callable/g '{}' ';'
On a mac and with a pyenv-virtualenv called yt_np2_test_again, I used:
find /Users/chavlin/.pyenv/versions/yt_np2_test_again/lib/python3.10/site-packages/nose -name "*.py" -exec sed -i '' s/collections.Callable/collections.abc.Callable/g {} +
In an effort to match more closely what Jenkins is doing, I hacked away at tests/nose_runner.py and tests/tests.yaml so that I could call python tests/nose_runner.py and have it just run the suspect enzo test (code on this branch). My initial comment above (this comment) actually has a mistake in how nosetests is called: it included both a path to the enzo test file and the module specification, so it was running tests twice (turns out I think it was exposing the same or a very similar error though -- see below).
Here's the output of running python tests/nose_runner.py (the hacked version that will just call the suspect enzo test):
(yt_np2_test_again) chavlin@lism01-chavlin yt % python tests/nose_runner.py
NoseWorker-1: WILL DO self.name = py310_local_enzo_010
GridHierarchy_galaxy0030_all ... ok
ParentageRelationships_galaxy0030_all ... ok
GridValues_galaxy0030_all_('gas', 'velocity_divergence') ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_0_None ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_0_('gas', 'density') ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_0_None ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_0_('gas', 'density') ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_1_None ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_1_('gas', 'density') ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_1_None ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_1_('gas', 'density') ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_2_None ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_2_('gas', 'density') ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_2_None ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_2_('gas', 'density') ... ok
----------------------------------------------------------------------
XML: /Users/chavlin/src/yt_/yt_dev/np2_testing/yt/nosetests.xml
----------------------------------------------------------------------
Ran 15 tests in 58.996s
OK
Updating and precise version information requires
gitpython to be installed.
Try: python -m pip install gitpython
GridHierarchy_galaxy0030_all ... ok
ParentageRelationships_galaxy0030_all ... ok
GridValues_galaxy0030_all_('gas', 'velocity_divergence') ... FAIL
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_0_None ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_0_('gas', 'density') ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_0_None ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_0_('gas', 'density') ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_1_None ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_1_('gas', 'density') ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_1_None ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_1_('gas', 'density') ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_2_None ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_2_('gas', 'density') ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_2_None ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_2_('gas', 'density') ... ok
======================================================================
FAIL: GridValues_galaxy0030_all_('gas', 'velocity_divergence')
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/chavlin/.pyenv/versions/yt_np2_test_again/lib/python3.10/site-packages/nose/case.py", line 198, in runTest
self.test(*self.arg)
File "/Users/chavlin/src/yt_/yt_dev/np2_testing/yt/yt/utilities/answer_testing/framework.py", line 418, in __call__
self.compare(nv, ov)
File "/Users/chavlin/src/yt_/yt_dev/np2_testing/yt/yt/utilities/answer_testing/framework.py", line 700, in compare
assert_equal(new_result[k], old_result[k])
File "/Users/chavlin/.pyenv/versions/yt_np2_test_again/lib/python3.10/site-packages/numpy/testing/_private/utils.py", line 414, in assert_equal
raise AssertionError(msg)
AssertionError:
Items are not equal:
ACTUAL: 'f16afe8932b49e9b2acadf304ae3a168'
DESIRED: 'ec255050b61b671af5425c0e4ae7a17d'
----------------------------------------------------------------------
XML: /Users/chavlin/src/yt_/yt_dev/np2_testing/yt/nosetests.xml
----------------------------------------------------------------------
Ran 15 tests in 60.008s
FAILED (failures=1)
NoseWorker-1: Exiting
Initially the answer store does not exist -- it runs the tests and stores them. Then it immediately runs the tests again and fails, with the exact same ACTUAL/DESIRED hashes as in the failing test here.
If I immediately run tests again, with the answer store already existing, tests pass:
(yt_np2_test_again) chavlin@lism01-chavlin yt % python tests/nose_runner.py
NoseWorker-1: WILL DO self.name = py310_local_enzo_010
Updating and precise version information requires
gitpython to be installed.
Try: python -m pip install gitpython
GridHierarchy_galaxy0030_all ... ok
ParentageRelationships_galaxy0030_all ... ok
GridValues_galaxy0030_all_('gas', 'velocity_divergence') ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_0_None ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_0_('gas', 'density') ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_0_None ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_0_('gas', 'density') ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_1_None ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_1_('gas', 'density') ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_1_None ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_1_('gas', 'density') ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_2_None ... ok
PixelizedProjectionValues_galaxy0030_all_('gas', 'velocity_divergence')_2_('gas', 'density') ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_2_None ... ok
PixelizedProjectionValues_galaxy0030_sphere_('max', (0_1, 'unitary'))_('gas', 'velocity_divergence')_2_('gas', 'density') ... ok
----------------------------------------------------------------------
XML: /Users/chavlin/src/yt_/yt_dev/np2_testing/yt/nosetests.xml
----------------------------------------------------------------------
Ran 15 tests in 58.911s
OK
In trying to find the actual cause of the error, I came across a couple of ways to get the test to pass that are likely fixing a symptom and not the cause...
First, test_galaxy0030 instantiates a ds that gets passed to the big_patch_amr function:
If I edit that to instead pass the filename of the dataset, then tests pass initially.
Second, the actual calculation of the divergence is very sensitive to small changes in the value of the grid spacing. Here's the relevant code:
If I change those ds lines to convert to cm (e.g., ds = div_fac * just_one(data["index", "dx"]).to('cm')) then tests pass initially.
No update on a fix, but it looks like the failure isn't np 2 related. I think this behavior has been around a while and is likely only being exposed now because of bumping the answers; here's a test script:
import yt
import numpy as np

yt.set_log_level(50)

def _get_vals():
    ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")
    g1 = ds.index.grids[0]
    vdiv_1 = g1['gas', 'velocity_divergence']
    return vdiv_1

if __name__ == "__main__":
    print(np.version.version)
    v1 = _get_vals()
    v2 = _get_vals()
    print(np.max(np.abs(v1 - v2)))  # 4.930380657631324e-32 1/s
    print(np.all(v1 == v2))  # False
    v3 = _get_vals()
    print(np.all(v1 == v3))  # False
    print(np.all(v2 == v3))  # True
The script above behaves the same for np>2 and np<2 (including when building yt with np<2 -- I went back as far as yt-4.1.4).
So, yt 4.1.4 was in January 2023, but #3856[^1] was first released in yt 4.1.0. Maybe that's actually a regression that happened somewhere in between?
[^1]: last time these answers were bumped successfully
Ok -- kinda figured out why this is happening, but still not certain of the root cause:
So the issue is related to unit registry state. Within the _divergence calculation:
Those ds = div_fac * just_one(data["index", "dx"]) lines end up with the unit registry of the previous dataset.
Here's a standalone script that illustrates the bug:
import yt
yt.set_log_level(50)
print("load ds 1\n")
ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")
print(f"Initial registry id on load 1: {id(ds.unit_registry)}") # 139759425784224
dx = 1. * ds.index.grids[0]['index', 'dx'][0,0,0]
print(f"dx registry: {id(dx.units.registry)}") # 139759425784224
print("\nload ds 2\n")
ds2 = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")
print(f"Initial registry id on load 2: {id(ds2.unit_registry)}") # 140563377278064
dx0 = ds2.index.grids[0]['index', 'dx'][0,0,0]
print(f"dx0 registry: {id(dx0.units.registry)}") # 140563377278064
dx = 1. * dx0
print(f"dx registry: {id(dx.units.registry)}") # 139759425784224
for which you get
load ds 1
Initial registry id on load 1: 140180192729504
Parsing Hierarchy : 100%|████████████████████████████████████████| 173/173 [00:00<00:00, 23574.99it/s]
dx registry: 140180192729504
load ds 2
Initial registry id on load 2: 140180163540928
Parsing Hierarchy : 100%|████████████████████████████████████████| 173/173 [00:00<00:00, 25036.73it/s]
dx0 registry: 140180163540928
dx registry: 140180192729504
Note that the unit registry of the final dx variable points to the registry of the previous dataset.
If I comment out all the @lru_cache decorators in unyt/array.py then that final dx points to its proper registry, and the subsequent hashes in the failing nose tests will match. So for some reason unyt is picking up the registry from the prior dataset. But it's not immediately clear why that would matter: in stepping through the code, I've found that the following spot results in a change in hash despite being a conversion from '1/s' to '1/s':
Adding the following will print out the conversion factor that ends up being used:
if "velocity_divergence" in str(field):
    print("_generate_fields")
    print(id(fd.units.registry))
    from unyt.array import _sanitize_units_convert

    new_units = _sanitize_units_convert(fi.units, fd.units.registry)
    (conv_factor, offset) = fd.units.get_conversion_factor(
        new_units, fd.dtype
    )
    print(f"conversion factor: {conv_factor}")
when you access
import yt
ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")
g1 = ds.index.grids[0]['gas', 'velocity_divergence']
it hits this line twice; the first time through, the conversion factor is not exactly 1:
_generate_fields
140347495891200
conversion factor: 1.0000000000000002
the second time through it is exactly one:
_generate_fields
140347495891200
conversion factor: 1.0
and with the unit registry state issue, when you open the dataset again, the initial conversion factor is identically 1.0, so the final hash differs between the two cases.
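To see why a conversion factor one step away from 1 matters for the answer tests: 1.0000000000000002 is 1 + 2**-52, one ULP above 1, and multiplying any finite nonzero float by it shifts the value by at least one ULP, which is enough to flip an md5-style content hash like the one the GridValues test computes. A sketch with made-up values (not yt's actual data):

```python
import hashlib

import numpy as np

values = np.array([0.1, 2.5, -3.7])
conv_factor = 1.0000000000000002   # i.e. 1 + 2**-52, one ULP above 1

converted = values * conv_factor

# every finite nonzero value moves by at least one ULP...
assert not np.array_equal(values, converted)

# ...so a content hash over the raw bytes changes too
h_before = hashlib.md5(values.tobytes()).hexdigest()
h_after = hashlib.md5(converted.tobytes()).hexdigest()
assert h_before != h_after
```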
Turns out that if you change this line:
to pass in the str repr of the unit:
new_field = data.ds.arr(np.zeros(data[xn].shape, dtype=np.float64), str(f.units))
the unit-registry state issue is fixed (because it forces unyt to pull in the correct registry associated with data.ds rather than f?), so maybe we should just do that and be done?
I've found that the following spot results in a change in hash despite being a conversion from '1/s' to '1/s'
If the str '1/s' is (or more specifically, its id) involved in the hash computation, I would expect a change: for short strings that only contain alphanumeric (and _) characters, Python only stores one copy of the string, but / isn't one of these characters, so two instances of '1/s' will have different ids.
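A quick way to see the id difference (a CPython detail; note that string literals appearing in the same module are typically merged by the compiler, so this sketch builds the strings at runtime):

```python
import sys

# strings built at runtime are not automatically interned
s1 = "".join(["1", "/", "s"])
s2 = "".join(["1", "/", "s"])
assert s1 == s2          # equal values...
assert s1 is not s2      # ...but distinct objects, hence distinct ids

# explicit interning collapses them to a single object
assert sys.intern(s1) is sys.intern(s2)
```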
Not sure that's relevant though.
But your fix seems safe anyway, so let's go with it! Thanks again for this very impressive inquiry!
If the str '1/s' is (or more specifically, its id) involved in the hash computation, I would expect a change: for short strings that only contain alphanumeric (and _) characters, Python only stores one copy of the string, but / isn't one of these characters, so two instances of '1/s' will have different ids.
The hashes were just based on the array values though -- there are actual differences in the arrays. Here's a simpler example that illustrates the problem:
import yt
yt.set_log_level(50)
ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")
dx = ds.index.grids[0]['index', 'dx'][0,0,0]
x1 = ds.index.grids[0]['gas', 'relative_velocity_x'][0,0,0]
x2 = x1 / dx
x2c = x2.to(str(x2.units))
print(x2c.units, x2.units, x2c.units == x2.units)
print(x2c - x2)
shows
1/s 1/s True
-5.877471754111438e-39 1/s
In any case, the quick fix is probably the way to go -- I have to hop off right now though, may not get to it before next week.
there are actual differences in the arrays
ooow, I completely missed that. Yes, I think your fix is the way to go, so I'll push it now (with you as the commit author).
just pushed it now. I wasn't sure how to phrase the commit message so feel free to amend it. I'll come back in about an hour to bump the answers again if needed, otherwise I'll just undraft the PR.
Ok, Jenkins is back to green, but just to be sure, let's double check that it's really stable: @yt-fido test this please
PR Summary
NumPy 2.0 just dropped on PyPI! This change is a pretext to make sure everything still runs smoothly. From the conversation on #4874, I expect I'll need to bump a couple answer tests but everything else I expect to stay green.
Closes #4874