Closed jameswilburlewis closed 1 month ago
There is Akebono PWS data available at https://data.darts.isas.jaxa.jp/pub/akebono/pws , but no orbit or RDM data yet that I can see.
The DARTS admins pointed me at a search interface here: https://www.darts.isas.jaxa.jp/stp/akebono/data.html , but I don't see a way to access individual files, only zip archives of data meeting the search criteria. I guess we can work with it if we have to, but hopefully there are still URLs to the individual files.
I think this might be where the original data products have moved to:
https://darts.isas.jaxa.jp/app/stp/data/exosd/
RDM data at this location appears to be working as before. Orbit data is producing errors:
  File "/Users/jwl/PycharmProjects/pyspedas/venv/lib/python3.9/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 93, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "parsers.pyx", line 579, in pandas._libs.parsers.TextReader.__cinit__
  File "parsers.pyx", line 668, in pandas._libs.parsers.TextReader._get_header
  File "parsers.pyx", line 879, in pandas._libs.parsers.TextReader._tokenize_rows
  File "parsers.pyx", line 890, in pandas._libs.parsers.TextReader._check_tokenize_status
  File "parsers.pyx", line 2050, in pandas._libs.parsers.raise_parser_error
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
The issue seems to be that the server is delivering text files with a .txt suffix, but they are gzip-encoded and need a gunzip step before they can be loaded. (In a browser, this seems to happen automatically. I wonder if there's an option we can pass to the requests library via spd_download() to take care of it?)
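The traceback fits that diagnosis: a gzip stream always begins with the two bytes 0x1f 0x8b. The first is a valid ASCII control character, so a UTF-8 decoder gets past position 0 and fails on 0x8b at position 1. Here is a minimal stdlib-only sketch that reproduces the same error; the helper name first_bad_utf8_byte and the sample CSV content are made up for illustration and are not part of pyspedas or the Akebono data.

```python
import gzip


def first_bad_utf8_byte(data: bytes):
    """Return (position, byte value) of the first byte that is not valid
    UTF-8, or None if the whole buffer decodes cleanly.

    Hypothetical helper for illustration only.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, data[err.start]
    return None


# Stand-in for what the server delivers under a .txt name
# (invented content, not real orbit data).
plain = b"col_a,col_b\n1,2\n"
compressed = gzip.compress(plain)

print(compressed[:2].hex())             # gzip magic number
print(first_bad_utf8_byte(compressed))  # where UTF-8 decoding gives up
print(first_bad_utf8_byte(plain))       # plain text decodes without error
```

So any file that fails with "byte 0x8b in position 1" is very likely gzip data under a misleading suffix.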
Kludgey solution for orbit and rdm processing:
Add a gz flag to load_csv_file; if True, specify compression='gzip' when reading the CSV with pandas.
In orb_postprocessing and rdm_postprocessing:
    try:
        data = load_csv_file(files, cols=cols)
    except UnicodeDecodeError:
        data = load_csv_file(files, cols=cols, gz=True)
I suppose this could fail if there were a mixture of old (uncompressed) and new (gzip compressed) files in the data directory.
Non-kludgey solutions might be to detect whether each file is gzip-compressed before passing it to pandas, to keep track of the encoding/compression in spd_download and uncompress if necessary, or to negotiate with the server to deliver uncompressed data...?
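The first of those options (detect before parsing) could look roughly like the sketch below. It checks each file's first two bytes against the gzip magic number, so a directory with a mix of old uncompressed and new compressed files would be handled file by file. The names is_gzipped and open_text_maybe_gzip are hypothetical and do not exist in pyspedas; this is one possible approach under those assumptions, not a tested fix.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic number."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def open_text_maybe_gzip(path, encoding="utf-8"):
    """Open `path` as text, decompressing on the fly if it is gzipped.

    Hypothetical helper: the caller is responsible for closing the handle.
    """
    if is_gzipped(path):
        return gzip.open(path, "rt", encoding=encoding)
    return open(path, "r", encoding=encoding)
```

pandas.read_csv accepts an open file-like object, so the returned handle could be passed to it directly. Alternatively, the result of is_gzipped(path) could decide between compression='gzip' and compression=None per file. The default compression='infer' goes by the file extension, so it would not catch gzip data under a .txt name.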
Our akebono tests are failing because they can no longer download data. We were getting it from http://darts.isas.jaxa.jp/stp/data/exosd/ , but that URL now attempts (unsuccessfully!) to redirect to the home page.
I can see some akebono data here:
https://data.darts.isas.jaxa.jp/pub/akebono/
but it only appears to have data for the pws instrument, and not rdm or orb which we previously had access to.
There is a notice on their front page https://darts.isas.jaxa.jp/ :
August 2024
Due to changes to the website configuration and maintenance of the data publishing path, paths will change and some apps will be unavailable. We apologize for the inconvenience. [Maintenance] [Period] 2024-08-20 12:00 -- 2024-08-23 12:00 (JST)
So perhaps things are still being moved. The akebono tests are disabled for now.
If this situation persists, it would be good to get in touch and see if rdm and orb data will still be available.