pysat / pysatCDAAC

pysat support for CDAAC instruments
BSD 3-Clause "New" or "Revised" License

BUG: Potential File Listing issue #24

Closed rstoneback closed 3 years ago

rstoneback commented 3 years ago

@jklenzing reported that files downloaded for 2018 show up in the file list under 2019. https://github.com/pysat/pysatCDAAC/pull/18#issuecomment-860693537

Testing by @rstoneback was not able to reproduce the issue.

Python 3.8.2 | packaged by conda-forge | (default, Apr 24 2020, 07:56:27) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.24.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import pysat

In [2]: gps = pysat.Instrument('cosmic', 'gps', 'ionprf', update_files=True)

In [3]: gps.files.files
Out[3]: Series([], dtype: object)

In [4]: import datetime as dt

In [5]: gps.download(dt.datetime(2018, 12, 30), dt.datetime(2019, 1, 4))

In [6]: gps.files.files
Out[6]: 
2018-12-30 00:04:00.062799872    2018/364/ionPrf_C006.2018.364.00.04.G28_2016.1...
2018-12-30 00:14:00.061700096    2018/364/ionPrf_C006.2018.364.00.14.G17_2016.1...
2018-12-30 00:20:00.062200064    2018/364/ionPrf_C006.2018.364.00.20.G22_2016.1...
2018-12-30 00:22:00.060100096    2018/364/ionPrf_C006.2018.364.00.22.G01_2016.1...
2018-12-30 00:33:00.061400064    2018/364/ionPrf_C006.2018.364.00.33.G14_2016.1...
                                                       ...                        
2019-01-04 23:05:00.062200064    2019/004/ionPrf_C006.2019.004.23.05.G22_2016.1...
2019-01-04 23:23:00.063200000    2019/004/ionPrf_C006.2019.004.23.23.G32_2016.1...
2019-01-04 23:33:00.061000192    2019/004/ionPrf_C006.2019.004.23.33.G10_2016.1...
2019-01-04 23:35:00.063100160    2019/004/ionPrf_C006.2019.004.23.35.G31_2016.1...
2019-01-04 23:41:00.062000128    2019/004/ionPrf_C006.2019.004.23.41.G20_2016.1...
Length: 891, dtype: object

Investigate what is going on with this bug.
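One way to look for the reported symptom is to compare each timestamp in the file list against the date encoded in the file name itself. The sketch below is only an illustration: `time_from_name` and `mismatched_years` are hypothetical helpers, not part of pysat or pysatCDAAC, and the field layout (year, day of year, hour, minute after the `ionPrf_C006.` prefix) is an assumption read off the listing above.

```python
import datetime as dt
import re

# Assumed layout, inferred from the listing: ionPrf_<id>.<year>.<doy>.<hour>.<minute>.
_NAME_RE = re.compile(r"ionPrf_\w+\.(\d{4})\.(\d{3})\.(\d{2})\.(\d{2})\.")


def time_from_name(fname):
    """Return the datetime encoded in a file name (hypothetical helper)."""
    match = _NAME_RE.search(fname)
    if match is None:
        raise ValueError("unrecognized file name: {}".format(fname))
    year, doy, hour, minute = (int(group) for group in match.groups())
    # Day of year 1 is January 1, so offset by doy - 1 days.
    return dt.datetime(year, 1, 1) + dt.timedelta(days=doy - 1, hours=hour,
                                                  minutes=minute)


def mismatched_years(index_times, fnames):
    """List (time, name) pairs whose index year differs from the name's year."""
    return [(time, fname) for time, fname in zip(index_times, fnames)
            if time.year != time_from_name(fname).year]
```

Assuming `gps.files.files` is the pandas Series shown above, passing its index and its values to `mismatched_years` should return an empty list when the listing is healthy.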

jklenzing commented 3 years ago

EDIT: The error seems to stem from pysat.utils.time.create_datetime_index. When I step through the code manually, the index comes out corrupted after this call, with one year added to some of the timestamps.

jklenzing commented 3 years ago

Following up, I've manually stepped through the code at: https://github.com/pysat/pysatCDAAC/blob/2625b77f8d5cdf5db022b746cc1d28dd3ff0cbe4/pysatCDAAC/instruments/cosmic_gps.py#L240-L263

All variables look as expected after that block. Running the following line: https://github.com/pysat/pysatCDAAC/blob/2625b77f8d5cdf5db022b746cc1d28dd3ff0cbe4/pysatCDAAC/instruments/cosmic_gps.py#L265-L267

I get bad index values, with some data from 2009 interpreted as 2014, 2019 data as 2020, etc. This run covers around 36k files, with dates in 2008, 2009, 2014, 2018, and 2019. Passing only the last few values through as

index = pysat.utils.time.create_datetime_index(year=year[-10:],
                                               day=day[-10:],
                                               uts=uts[-10:])

yields correct index values. The same points all come out wrong when all 36k values are passed through. I'm thinking this may be an issue in pysat when too many values are passed to this function.

jklenzing commented 3 years ago

Further testing report: The problem is not the number of points, but whether the files are in sequential order. Test sequence:

import numpy as np
import pysat

# Years deliberately out of chronological order
year = np.array([2014, 2014, 2008])
day = np.array([1, 2, 365])
uts = np.array([0, 3600, 2700])
pysat.utils.time.create_datetime_index(year=year, day=day, uts=uts)

yields

DatetimeIndex(['2014-01-01 00:00:00', '2014-01-02 01:00:00',
               '2014-12-31 00:45:00'],
              dtype='datetime64[ns]', freq=None)

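The third entry should be 2008-12-30 00:45:00 (2008 is a leap year, so day 365 is December 30), but it is assigned to 2014 instead. Since my files were downloaded out of order, this tripped up the code.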
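For cross-checking, the same three inputs can be converted one element at a time with only the standard library, so that the result cannot depend on input order. This is a minimal sketch and not pysat's implementation; `reference_times` is a hypothetical name introduced here for illustration.

```python
import datetime as dt


def reference_times(year, day, uts):
    """Convert parallel year, day-of-year, and seconds-of-day values.

    Hypothetical reference helper; each element is converted independently,
    so the ordering of the inputs does not matter.
    """
    return [dt.datetime(int(y), 1, 1)
            + dt.timedelta(days=int(d) - 1, seconds=float(s))
            for y, d, s in zip(year, day, uts)]


times = reference_times([2014, 2014, 2008], [1, 2, 365], [0, 3600, 2700])
for time in times:
    print(time)
```

Comparing such a list against the DatetimeIndex returned by create_datetime_index is one way to flag entries that were assigned to the wrong year.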

jklenzing commented 3 years ago

Deleting and redownloading data in sequential order (a few days from 2008, 2009, 2014), I still see this issue. I'm not certain why the files are being retrieved in a seemingly random order, but this is what is breaking everything downstream. It does not correspond to file creation date or download order.

jklenzing commented 3 years ago

Further testing against pysat/pysat#907 indicates that the bug fixes there solve this problem in both develop and xarray_support.

jklenzing commented 3 years ago

Closing with merge of pysat/pysat#907