SciQLop / speasy

Space Physics made EASY! A simple Python package to deal with main Space Physics WebServices (CDA,SSC,AMDA,..)

duplicated date with CDA provider not AMDA #130

Open nicolasaunai opened 2 months ago

nicolasaunai commented 2 months ago

Description

The last timestamp of the MMS1 FPI FAST data on Sept. 23, 2018 seems erroneous, but only when the data come from the CDA provider; with AMDA it is fine.

What I Did

from datetime import datetime
import speasy as spz

print("CDA")
t1, t2 = datetime(2018, 9, 23, 23, 59, 37), datetime(2018, 9, 24, 0, 0, 13)
NF_cda = spz.get_data(spz.inventories.data_tree.cda.MMS.MMS1.DIS.MMS1_FPI_FAST_L2_DIS_MOMS.mms1_dis_numberdensity_fast, t1, t2)
V_cda = spz.get_data(spz.inventories.data_tree.cda.MMS.MMS1.DIS.MMS1_FPI_FAST_L2_DIS_MOMS.mms1_dis_bulkv_gse_fast, t1, t2)
T_cda = spz.get_data(spz.inventories.data_tree.cda.MMS.MMS1.DIS.MMS1_FPI_FAST_L2_DIS_MOMS.mms1_dis_tempperp_fast, t1, t2)
print("N")
print(NF_cda.time)
print("V")
print(V_cda.time)
print("T")
print(T_cda.time)

print("AMDA")
NF_amda = spz.get_data(spz.inventories.data_tree.amda.Parameters.MMS.MMS1.FPI.fast_mode.mms1_fpi_dismoms.mms1_dis_ni, t1, t2)
V_amda = spz.get_data(spz.inventories.data_tree.amda.Parameters.MMS.MMS1.FPI.fast_mode.mms1_fpi_dismoms.mms1_dis_vgse, t1, t2)
T_amda = spz.get_data(spz.inventories.data_tree.amda.Parameters.MMS.MMS1.FPI.fast_mode.mms1_fpi_dismoms.mms1_dis_tpara, t1, t2)
print("N")
print(NF_amda.time)
print("V")
print(V_amda.time)
print("T")
print(T_amda.time)

this gives:

CDA
N
['2018-09-23T23:59:37.643764000' '2018-09-23T23:59:42.143794000'
 '2018-09-23T23:59:46.643819000' '2018-09-23T23:59:51.143848000'
 '2018-09-23T23:59:55.643872000' '2018-09-23T22:04:57.103826000'
 '2018-09-24T00:00:00.143903000' '2018-09-24T00:00:04.643927000'
 '2018-09-24T00:00:09.143958000']
V
['2018-09-23T23:59:37.643764000' '2018-09-23T23:59:42.143794000'
 '2018-09-23T23:59:46.643819000' '2018-09-23T23:59:51.143848000'
 '2018-09-23T23:59:55.643872000' '2018-09-23T22:04:57.103826000'
 '2018-09-24T00:00:00.143903000' '2018-09-24T00:00:04.643927000'
 '2018-09-24T00:00:09.143958000']
T
['2018-09-23T23:59:37.643764000' '2018-09-23T23:59:42.143794000'
 '2018-09-23T23:59:46.643819000' '2018-09-23T23:59:51.143848000'
 '2018-09-23T23:59:55.643872000' '2018-09-23T22:04:57.103826000'
 '2018-09-24T00:00:00.143903000' '2018-09-24T00:00:04.643927000'
 '2018-09-24T00:00:09.143958000']
AMDA
N
['2018-09-23T23:59:37.643000000' '2018-09-23T23:59:42.143000000'
 '2018-09-23T23:59:46.643000000' '2018-09-23T23:59:51.143000000'
 '2018-09-23T23:59:55.643000000' '2018-09-24T00:00:00.143000000'
 '2018-09-24T00:00:04.643000000' '2018-09-24T00:00:09.143000000']
V
['2018-09-23T23:59:37.643000000' '2018-09-23T23:59:42.143000000'
 '2018-09-23T23:59:46.643000000' '2018-09-23T23:59:51.143000000'
 '2018-09-23T23:59:55.643000000' '2018-09-24T00:00:00.143000000'
 '2018-09-24T00:00:04.643000000' '2018-09-24T00:00:09.143000000']
T
['2018-09-23T23:59:37.643000000' '2018-09-23T23:59:42.143000000'
 '2018-09-23T23:59:46.643000000' '2018-09-23T23:59:51.143000000'
 '2018-09-23T23:59:55.643000000' '2018-09-24T00:00:00.143000000'
 '2018-09-24T00:00:04.643000000' '2018-09-24T00:00:09.143000000']

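So, on CDA, the last date of Sept. 23 is '2018-09-23T22:04:57.103826000', which is clearly wrong: it is almost two hours earlier than the entry before it. On AMDA the last date of Sept. 23 is '2018-09-23T23:59:55.643000000', which matches the entry just before the spurious one in the CDA output.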
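For anyone who wants to spot this kind of glitch automatically, here is a minimal sketch that flags every timestamp earlier than its predecessor. It uses only the standard library and the CDA timestamps printed above, copied as strings; `parse_ns` and `out_of_order` are illustrative helper names, not part of speasy.

```python
from datetime import datetime

def parse_ns(stamp):
    """Parse an ISO 8601 timestamp, dropping the digits beyond microseconds."""
    return datetime.strptime(stamp[:26], "%Y-%m-%dT%H:%M:%S.%f")

def out_of_order(stamps):
    """Return (index, timestamp) for every entry earlier than its predecessor."""
    times = [parse_ns(s) for s in stamps]
    return [(i, stamps[i]) for i in range(1, len(times)) if times[i] < times[i - 1]]

# Time axis returned through the CDA provider, copied from the output above.
cda_times = [
    "2018-09-23T23:59:37.643764000", "2018-09-23T23:59:42.143794000",
    "2018-09-23T23:59:46.643819000", "2018-09-23T23:59:51.143848000",
    "2018-09-23T23:59:55.643872000", "2018-09-23T22:04:57.103826000",
    "2018-09-24T00:00:00.143903000", "2018-09-24T00:00:04.643927000",
    "2018-09-24T00:00:09.143958000",
]

print(out_of_order(cda_times))
```

On this sample only index 5 is flagged, i.e. the 22:04:57 entry.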

We think it is somehow related to speasy, because we read the CDF files directly from CDA and could not find the erroneous time in them.

We downloaded three CDF files, the ones loaded in the test below.

The idea was to test whether the spurious date is the last date of the first file, the first date of the second file, or appears only in the merged version. It turns out the spurious date is in none of the files.

here is the test:

import pycdfpp
cdf_23 = pycdfpp.load("mms1_fpi_fast_l2_dis-moms_20180923220000_v3.3.0.cdf")
cdf_24 = pycdfpp.load("mms1_fpi_fast_l2_dis-moms_20180924000000_v3.3.0.cdf")
cdf_manuel = pycdfpp.load("mms1_fpis_fast_l2_dis-moms_20180923235901_20180924000009.cdf")
print("Last time on Sept. 23")
print(pycdfpp.to_datetime64(cdf_23["Epoch"].values[-1]))
print("First time on Sept. 24")
print(pycdfpp.to_datetime64(cdf_24["Epoch"].values[0]))
print("Times between Sept. 23 23:59 and 00:10 in the CDF merged by the CDA web service")
print(pycdfpp.to_datetime64(cdf_manuel["Epoch"].values))
Last time on Sept. 23
2018-09-23T23:59:55.643872000
First time on Sept. 24
2018-09-24T00:00:00.143903000
Times between Sept. 23 23:59 and 00:10 in the CDF merged by the CDA web service
['2018-09-23T23:59:01.643546000' '2018-09-23T23:59:06.143576000'
 '2018-09-23T23:59:10.643601000' '2018-09-23T23:59:15.143631000'
 '2018-09-23T23:59:19.643656000' '2018-09-23T23:59:24.143686000'
 '2018-09-23T23:59:28.643709000' '2018-09-23T23:59:33.143740000'
 '2018-09-23T23:59:37.643764000' '2018-09-23T23:59:42.143794000'
 '2018-09-23T23:59:46.643819000' '2018-09-23T23:59:51.143848000'
 '2018-09-23T23:59:55.643872000' '2018-09-24T00:00:00.143903000'
 '2018-09-24T00:00:04.643927000' '2018-09-24T00:00:09.143958000']
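
To make the comparison explicit, here is a small sketch that lists the timestamps returned through speasy's CDA provider that are absent from the merged CDF. Both lists are copied by hand from the outputs above (no file is read here), and the variable names are only illustrative.

```python
# Epoch values of the CDF merged by the CDA web service (copied from above).
merged_file_times = [
    "2018-09-23T23:59:01.643546000", "2018-09-23T23:59:06.143576000",
    "2018-09-23T23:59:10.643601000", "2018-09-23T23:59:15.143631000",
    "2018-09-23T23:59:19.643656000", "2018-09-23T23:59:24.143686000",
    "2018-09-23T23:59:28.643709000", "2018-09-23T23:59:33.143740000",
    "2018-09-23T23:59:37.643764000", "2018-09-23T23:59:42.143794000",
    "2018-09-23T23:59:46.643819000", "2018-09-23T23:59:51.143848000",
    "2018-09-23T23:59:55.643872000", "2018-09-24T00:00:00.143903000",
    "2018-09-24T00:00:04.643927000", "2018-09-24T00:00:09.143958000",
]

# Time axis obtained with spz.get_data through the CDA provider (copied from above).
speasy_cda_times = [
    "2018-09-23T23:59:37.643764000", "2018-09-23T23:59:42.143794000",
    "2018-09-23T23:59:46.643819000", "2018-09-23T23:59:51.143848000",
    "2018-09-23T23:59:55.643872000", "2018-09-23T22:04:57.103826000",
    "2018-09-24T00:00:00.143903000", "2018-09-24T00:00:04.643927000",
    "2018-09-24T00:00:09.143958000",
]

# Keep the speasy timestamps that the merged file does not contain.
known = set(merged_file_times)
only_in_speasy = [t for t in speasy_cda_times if t not in known]
print(only_in_speasy)
```

The only entry left over is the 22:04:57 one, so every other timestamp agrees with the file.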

Bonus question (maybe more for @brenard-irap?): is there any reason why the dates from AMDA are rounded differently than those from CDA?
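
As a hint on that bonus question, the quick check below suggests that, in this sample at least, the AMDA timestamps equal the CDA ones truncated (not rounded) to milliseconds. Rounding the first value, 37.643764 s, to the nearest millisecond would give .644, whereas AMDA reports .643. The lists are copied from the outputs above with the spurious entry left out, and `truncate_to_ms` is a hypothetical helper written for this check; it says nothing about why AMDA stores the times this way.

```python
def truncate_to_ms(stamp):
    """Keep the first three fractional digits and zero-pad back to nanoseconds."""
    head, frac = stamp.split(".")
    return f"{head}.{frac[:3]}000000"

# CDA timestamps from the output above, without the spurious 22:04:57 entry.
cda_times = [
    "2018-09-23T23:59:37.643764000", "2018-09-23T23:59:42.143794000",
    "2018-09-23T23:59:46.643819000", "2018-09-23T23:59:51.143848000",
    "2018-09-23T23:59:55.643872000", "2018-09-24T00:00:00.143903000",
    "2018-09-24T00:00:04.643927000", "2018-09-24T00:00:09.143958000",
]

# AMDA timestamps from the output above.
amda_times = [
    "2018-09-23T23:59:37.643000000", "2018-09-23T23:59:42.143000000",
    "2018-09-23T23:59:46.643000000", "2018-09-23T23:59:51.143000000",
    "2018-09-23T23:59:55.643000000", "2018-09-24T00:00:00.143000000",
    "2018-09-24T00:00:04.643000000", "2018-09-24T00:00:09.143000000",
]

truncated = [truncate_to_ms(t) for t in cda_times]
print(truncated == amda_times)
```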

jeandet commented 2 months ago

Can you do the same with disable_cache=True, disable_proxy=True to check that the bug is not on our proxy?

Martin-ORIA commented 2 months ago

Hello, I am an intern working with Nicolas. I tried with disable_cache=True, disable_proxy=True, but I get the following error message: MaxRetryError: HTTPSConnectionPool(host='cdaweb.gsfc.nasa.gov', port=443): Max retries exceeded with url: /WS/cdasr/1/dataviews/sp_phys/datasets/MMS1_FPI_BRST_L2_DIS-MOMS/data/20151114T025653Z,20151114T030200Z/mms1_dis_bulkv_gse_brst?format=cdf (Caused by ResponseError('too many 503 error responses'))

However, when importing the speasy library I also get this warning (it could be related): Can't get data from proxy server `http://sciqlop.lpp.polytechnique.fr/cache`