asdf-format / asdf

ASDF (Advanced Scientific Data Format) is a next generation interchange format for scientific data
http://asdf.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
523 stars 57 forks source link

fix issue where roundtripping a masked array strips the mask if no values are masked #1803

Closed zacharyburnett closed 2 months ago

zacharyburnett commented 2 months ago

Description

resolves #1768

Checklist:

braingram commented 2 months ago

Would you add a changelog for this that documents the fixed bug?

Also, would you either move the test near: https://github.com/asdf-format/asdf/blob/main/asdf/_tests/tags/core/tests/test_ndarray.py#L381 or update that test to the nicely parametrized one added in this PR?

braingram commented 2 months ago

There's one more issue and I'm a little scared now (not from this PR but from what follows):

import numpy as np
arr = np.arange(5)
m0 = np.ma.masked_array(arr, True)
m1 = np.ma.masked_array(arr, False)
np.testing.assert_array_equal(m0, m11)

The last assert does not fail :-/ so apparently assert_array_equal does not consider the mask. The same is true for assert_equal. Additionally (and this part is asdf's fault), assert_tree_match adds an extra oddity: https://github.com/asdf-format/asdf/blob/33e7095df0c58705043de8f75497dfd64d04bbfb/asdf/_tests/_helpers.py#L55 which drops the mask.

Would you update the test to not use assert_tree_match and apparently (maybe any) of the numpy testing routines. I'll open an issue to check our test suite for other places where the mask is being ignored.

braingram commented 2 months ago

To add to the confusion, the test is failing because:

>> (m0 == m1).all()
masked

when m0 and m1 are both fully masked and matching values. So I think we need to compare m0.mask and m0.data separately.