Closed tompollard closed 1 month ago
Thanks for working on this!
filebytes = filebytes.astype(np.int64)
Here, the original filebytes
is an array of uint8. You're copying and converting them into an array of int64, which means multiplying memory usage by a factor of 9.
So I think this would probably work, but I think (since we're looping over the bytes in Python anyway) it'd be better to do the conversion one element at a time as needed.
That is to say, changing something like filebytes[bpi, 1] >> 2
to either int(filebytes[bpi, 1]) >> 2
if what we really want is a python integer, or np.int64(filebytes[bpi, 1]) >> 2
if what we want is a numpy integer.
Thanks! Like this? I don't see any particular reason to use int64 on the while loop, so went for int.
Thanks. Unless I missed something, though, it looks like there's still an issue in line 2239.
In line 2241, I think we want np.int64(filebytes[bpi, 1]) & 3
instead of np.int64(filebytes[bpi, 1] & 3)
.
In addition to the ones you fixed, there's one more warning in line 2327 (aux_notelen & 1
).
I'm not sure why not all of the warnings were showing up before.
In line 2241, I think we want np.int64(filebytes[bpi, 1]) & 3 instead of np.int64(filebytes[bpi, 1] & 3).
Sorry, good catch.
In addition to the ones you fixed, there's one more warning in line 2327 (aux_notelen & 1).
Odd, I see this now too:
tests/test_annotation.py: 10 warnings
tests/test_plot.py: 2 warnings
/Users/tompollard/projects/wfdb-python/wfdb/io/annotation.py:2327: UserWarning: result dtype changed due to the removal of value-based promotion from NumPy. Changed from int64 to uint8.
if aux_notelen & 1:
I have cast to int64
In line 2241, I think we want np.int64(filebytes[bpi, 1]) & 3 instead of np.int64(filebytes[bpi, 1] & 3).
Adding just np.int64(filebytes[bpi, 1]) & 3
causes the multiplication to precede the bitwise & operator, so I added an extra set of parentheses to maintain the order.
I have cast to int64
Sorry, this (e7290629) does not fix it. "aux_notelen" is causing the warning. "aux_notebytes" is not a problem and there is no reason to change it.
There is also something strange happening involving pandas in rdann
$ NPY_PROMOTION_STATE=legacy ./ve.np2/bin/python -c 'import wfdb, numpy, pandas; print(numpy.__version__, pandas.__version__); wfdb.rdann("sample-data/1003", "atr")'
2.1.2 2.2.3
$ NPY_PROMOTION_STATE=weak ./ve.np2/bin/python -c 'import wfdb, numpy, pandas; print(numpy.__version__, pandas.__version__); wfdb.rdann("sample-data/1003", "atr")'
2.1.2 2.2.3
$ NPY_PROMOTION_STATE=legacy ./ve.np1/bin/python -c 'import wfdb, numpy, pandas; print(numpy.__version__, pandas.__version__); wfdb.rdann("sample-data/1003", "atr")'
1.26.4 2.2.3
$ NPY_PROMOTION_STATE=weak ./ve.np1/bin/python -c 'import wfdb, numpy, pandas; print(numpy.__version__, pandas.__version__); wfdb.rdann("sample-data/1003", "atr")'
1.26.4 2.2.3
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/wfdb-python/wfdb/io/annotation.py", line 2019, in rdann
annotation.set_label_elements(return_label_elements)
File "/tmp/wfdb-python/wfdb/io/annotation.py", line 1405, in set_label_elements
self.convert_label_attribute(contained_elements[0], e)
File "/tmp/wfdb-python/wfdb/io/annotation.py", line 1468, in convert_label_attribute
label_map = self.create_label_map(inplace=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/wfdb-python/wfdb/io/annotation.py", line 908, in create_label_map
label_map.loc[i] = self.custom_labels.loc[i]
~~~~~~~~~~~~~^^^
File "/tmp/wfdb-python/ve.np1/lib/python3.11/site-packages/pandas/core/indexing.py", line 911, in __setitem__
iloc._setitem_with_indexer(indexer, value, self.name)
File "/tmp/wfdb-python/ve.np1/lib/python3.11/site-packages/pandas/core/indexing.py", line 1932, in _setitem_with_indexer
self._setitem_with_indexer_missing(indexer, value)
File "/tmp/wfdb-python/ve.np1/lib/python3.11/site-packages/pandas/core/indexing.py", line 2328, in _setitem_with_indexer_missing
self.obj._mgr = self.obj._append(value)._mgr
^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/wfdb-python/ve.np1/lib/python3.11/site-packages/pandas/core/frame.py", line 10554, in _append
other = row_df.infer_objects(copy=False).rename_axis(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/wfdb-python/ve.np1/lib/python3.11/site-packages/pandas/core/generic.py", line 6888, in infer_objects
new_mgr = self._mgr.convert(copy=copy)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/wfdb-python/ve.np1/lib/python3.11/site-packages/pandas/core/internals/managers.py", line 447, in convert
return self.apply("convert", copy=copy, using_cow=using_copy_on_write())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/wfdb-python/ve.np1/lib/python3.11/site-packages/pandas/core/internals/managers.py", line 363, in apply
applied = getattr(b, f)(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/wfdb-python/ve.np1/lib/python3.11/site-packages/pandas/core/internals/blocks.py", line 639, in convert
blocks = self.split_and_operate(
^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/wfdb-python/ve.np1/lib/python3.11/site-packages/pandas/core/internals/blocks.py", line 471, in split_and_operate
rbs = func(nb, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/wfdb-python/ve.np1/lib/python3.11/site-packages/pandas/core/internals/blocks.py", line 655, in convert
res_values = lib.maybe_convert_objects(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "lib.pyx", line 2602, in pandas._libs.lib.maybe_convert_objects
OverflowError: Python int too large to convert to C long
Looks like a straight up pandas bug:
$ NPY_PROMOTION_STATE=weak ./ve.np1/bin/python -c 'import numpy,pandas; print(numpy.__version__, pandas.__version__); print(pandas.DataFrame({"x":[1]}))'
1.26.4 2.2.3
...
OverflowError: Python int too large to convert to C long
Thanks Benjamin. Interested in seeing what happens with the Pandas issue!
As discussed in https://github.com/MIT-LCP/wfdb-python/issues/493, numpy v2.0 introduced changes to type promotion rules: https://numpy.org/devdocs/numpy_2_0_migration_guide.html#changes-to-numpy-data-type-promotion
Running pytest with
numpy==2.0.2
andNPY_PROMOTION_STATE=weak_and_warn
raises the following warnings for wfdb/io/annotation.py:The changes in this pull request address these issues by explicitly casting the type. I plan to follow up with several additional fixes to other modules.