pysat / pysatNASA

pysat support for NASA Instruments
BSD 3-Clause "New" or "Revised" License

BUG: loading cdf data with numpy 1.24 #142

Closed jklenzing closed 1 year ago

jklenzing commented 1 year ago

Describe the bug

When data is loaded with the latest numpy (1.24), some instruments (de2_lang, iss_fpmu) encounter new errors.

Update: see root cause in comments below.

To Reproduce

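import datetime as dt
import pysat

# Instantiate the instrument first so that `lang` exists before it is used.
lang = pysat.Instrument('de2', 'lang', use_cdflib=True)

# Download the test day if it is not already available locally.
if dt.datetime(1982, 1, 1) not in lang.files.files.index:
    lang.download(dt.datetime(1982, 1, 1))

# Load year 1982, day of year 1.
lang.load(1982, 1)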

Under numpy 1.23, this runs fine. With 1.24, I get

File ~/code/core/pysat/pysat/_instrument.py:3178, in Instrument.load(self, yr, doy, end_yr, end_doy, date, end_date, fname, stop_fname, verifyPad, use_header, **kwargs)
   3175     message = ' '.join((message, 'Loaded data is not',
   3176                        'monotonically increasing. '))
   3177 if self.strict_time_flag:
-> 3178     raise ValueError(' '.join((message, 'To continue to use data,'
   3179                                'set inst.strict_time_flag=False',
   3180                                'before loading data')))
   3181 else:
   3182     warnings.warn(message, stacklevel=2)

ValueError:  Loaded data is not unique. To continue to use data,set inst.strict_time_flag=False before loading data
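For readers unfamiliar with the check behind this message, the following is a minimal sketch of what "not unique" and "not monotonically increasing" mean for a list of timestamps. It uses only the standard library; the helper name `describe_time_index` and the sample timestamps are hypothetical, and this is not pysat's actual implementation.

```python
import datetime as dt


def describe_time_index(times):
    """Return (is_unique, is_monotonic_increasing) for a sequence of datetimes."""
    times = list(times)
    # Unique: no timestamp appears more than once.
    is_unique = len(set(times)) == len(times)
    # Monotonically increasing (non-strict): each time is >= the one before it.
    is_monotonic = all(a <= b for a, b in zip(times, times[1:]))
    return is_unique, is_monotonic


start = dt.datetime(1982, 1, 1)
# Hypothetical sample: the timestamp at +1 s is repeated.
times = [
    start,
    start + dt.timedelta(seconds=1),
    start + dt.timedelta(seconds=1),
    start + dt.timedelta(seconds=2),
]
is_unique, is_monotonic = describe_time_index(times)
print(is_unique, is_monotonic)  # prints: False True
```

A repeated timestamp fails the uniqueness test even though the sequence never decreases, which matches the "Loaded data is not unique" wording of the error.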

Expected behavior

Data should load consistently.

Screenshots

n/a

Desktop (please complete the following information):

Also seen for all Ubuntu configurations in GitHub Actions.

Additional context

Note that there are additional issues with the pysatCDF implementation. These are documented in https://github.com/pysat/pysatCDF/issues/46

jklenzing commented 1 year ago

Noting that failure also occurs for numpy 1.24.1

jklenzing commented 1 year ago

On reviewing the code, this is not directly caused by the numpy version. Neither dataset has unique data, so this will break when testing against older numpy versions as well. The primary source of failure is in pysatCDF for numpy<1.24.

Ordinarily in the unit tests, if the loading of a data set fails due to non-unique data, the code will try again with the strict time flag set to False. The error messages above appear as "additional errors occurred" when pysatCDF fails on the first go due to chained indices. Needs fix at https://github.com/pysat/pysatCDF/pull/47
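The retry behavior described above can be sketched roughly as follows. This is an illustration only, not the pysat test code: `FakeInstrument` and `load_with_fallback` are hypothetical names, and the stand-in class merely mimics an instrument that raises `ValueError` on a strict load of non-unique data.

```python
class FakeInstrument:
    """Hypothetical stand-in for an instrument whose data has repeated times."""

    def __init__(self):
        self.strict_time_flag = True
        self.loaded = False

    def load(self, yr, doy):
        # Mimic the strict check: refuse non-unique data unless relaxed.
        if self.strict_time_flag:
            raise ValueError('Loaded data is not unique.')
        self.loaded = True


def load_with_fallback(inst, yr, doy):
    """Try a strict load; on ValueError, retry once with the flag relaxed.

    Returns True if the strict load succeeded, False if the fallback was used.
    """
    try:
        inst.load(yr, doy)
        return True
    except ValueError:
        # Relax the time check and try again, as the error message suggests.
        inst.strict_time_flag = False
        inst.load(yr, doy)
        return False


inst = FakeInstrument()
strict_ok = load_with_fallback(inst, 1982, 1)
print(strict_ok, inst.loaded)  # prints: False True
```

In this sketch only `ValueError` triggers the retry, so any other exception raised by the first load, or any exception raised by the second load, propagates to the caller.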

jklenzing commented 1 year ago

Plan forward:

jklenzing commented 1 year ago

Bumping version. Will revisit for the 0.1.0 release.

jklenzing commented 1 year ago

Since the pysatCDF issue is captured at that package, closing this issue.