pysat / pysatNASA

pysat support for NASA Instruments
BSD 3-Clause "New" or "Revised" License

BUG: meta not transferred for ACE data #186

Closed jklenzing closed 1 year ago

jklenzing commented 1 year ago

Describe the bug: When loading ACE data, a number of metadata parameters are dropped. For swepam_l2, the following warnings are raised:

/Users/jklenzin/miniconda3/lib/python3.10/site-packages/pysat/_meta.py:464: UserWarning: Metadata with type <class 'float'> does not match expected type <class 'numpy.int32'>. Dropping input for 'Time_PB5' with key 'fill'
  warnings.warn(''.join((
/Users/jklenzin/miniconda3/lib/python3.10/site-packages/pysat/_meta.py:464: UserWarning: Metadata with type <class 'float'> does not match expected type <class 'numpy.int32'>. Dropping input for 'format_time' with key 'fill'
  warnings.warn(''.join((
/Users/jklenzin/miniconda3/lib/python3.10/site-packages/pysat/_meta.py:464: UserWarning: Metadata with type <class 'float'> does not match expected type <class 'numpy.int32'>. Dropping input for 'Np' with key 'fill'
  warnings.warn(''.join((
/Users/jklenzin/miniconda3/lib/python3.10/site-packages/pysat/_meta.py:464: UserWarning: Metadata with type <class 'float'> does not match expected type <class 'numpy.int32'>. Dropping input for 'Vp' with key 'fill'
  warnings.warn(''.join((
/Users/jklenzin/miniconda3/lib/python3.10/site-packages/pysat/_meta.py:464: UserWarning: Metadata with type <class 'float'> does not match expected type <class 'numpy.int32'>. Dropping input for 'He_ratio' with key 'fill'
  warnings.warn(''.join((
/Users/jklenzin/miniconda3/lib/python3.10/site-packages/pysat/_meta.py:464: UserWarning: Metadata with type <class 'float'> does not match expected type <class 'numpy.int32'>. Dropping input for 'Tpr' with key 'fill'
  warnings.warn(''.join((
/Users/jklenzin/miniconda3/lib/python3.10/site-packages/pysat/_meta.py:464: UserWarning: Metadata with type <class 'float'> does not match expected type <class 'numpy.int32'>. Dropping input for 'Weight' with key 'fill'
  warnings.warn(''.join((
/Users/jklenzin/miniconda3/lib/python3.10/site-packages/pysat/_meta.py:464: UserWarning: Metadata with type <class 'float'> does not match expected type <class 'numpy.int32'>. Dropping input for 'unit_time' with key 'fill'
  warnings.warn(''.join((
/Users/jklenzin/miniconda3/lib/python3.10/site-packages/pysat/_meta.py:464: UserWarning: Metadata with type <class 'float'> does not match expected type <class 'numpy.int32'>. Dropping input for 'label_time' with key 'fill'
  warnings.warn(''.join((

Similar warnings are thrown for all ACE instruments here, but not other xarray cdf instruments.
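The warning text points to a type check: the fill value arrives as a Python float while the expected type is numpy.int32. A minimal sketch of that kind of check, using only NumPy. The function name `accept_meta_value` is hypothetical, and this is not pysat's actual implementation in `_meta.py`:

```python
import warnings

import numpy as np


def accept_meta_value(value, expected_type, var_name, key):
    """Return value if it matches expected_type, else warn and return None.

    Hypothetical helper for illustration; not part of pysat.
    """
    if isinstance(value, expected_type):
        return value
    warnings.warn(
        f"Metadata with type {type(value)} does not match expected type "
        f"{expected_type}. Dropping input for '{var_name}' with key '{key}'"
    )
    return None


# A float fill value checked against an int32 expectation is dropped.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    dropped = accept_meta_value(-1.0e31, np.int32, "Np", "fill")

# The same value checked against a float expectation is kept.
kept = accept_meta_value(-1.0e31, float, "Np", "fill")
```

Under this reading, the data are fine and the mismatch lies in which type the metadata slot expects.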

To Reproduce: Compare the values of

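from pathlib import Path
from cdflib.xarray import cdf_to_xarray

# Build the full path instead of relying on the IPython `cd` magic
data_dir = Path.home() / "data/pysat/ace/swepam_l2/key/1hr"
fname = data_dir / "ac_k1_swe_20200101_v02.cdf"
a = cdf_to_xarray(str(fname))
a['Np'].attrs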

and

import pysat
swepam = pysat.Instrument('ace', 'swepam_l2', inst_id='1hr', tag='key')
swepam.load(2020, 1)
swepam.meta['Np']

cdflib produces:

{'FIELDNAM': 'Proton No. density',
 'VALIDMIN': array([0.], dtype=float32),
 'VALIDMAX': array([1000.], dtype=float32),
 'SCALEMIN': array([0.], dtype=float32),
 'SCALEMAX': array([200.], dtype=float32),
 'LABLAXIS': 'SW H Num Density',
 'UNITS': '#/cc',
 'VAR_TYPE': 'data',
 'FORMAT': 'F6.2',
 'FILLVAL': array([-1.e+31], dtype=float32),
 'DEPEND_0': 'Epoch',
 'DICT_KEY': 'density>',
 'CATDESC': 'Solar Wind Proton Number Density, scalar',
 'AVG_TYPE': ' ',
 'DISPLAY_TYPE': 'time_series',
 'VAR_NOTES': 'Np is the proton number density in units of cm-3, as calculated by integrating the ion distribution function. ',
 'standard_name': 'Proton No. density',
 'long_name': 'SW H Num Density',
 'units': '#/cc'}

pysat produces

Var_Notes                                                         
CatDesc                                                        NaN
ValidMin                                                        -1
ValidMax                                                        -1
FillVal                                                         -1
FIELDNAM                                        Proton No. density
DICT_KEY                                                  density>
desc                      Solar Wind Proton Number Density, scalar
standard_name                                   Proton No. density
value_min0                                                    -1.0
value_min1                                                    -1.0
value_min2                                                    -1.0
value_max0                                                    -1.0
value_max1                                                    -1.0
value_max2                                                    -1.0
SCALEMIN0                                                     -1.0
SCALEMIN1                                                     -1.0
SCALEMIN2                                                     -1.0
SCALEMAX0                                                     -1.0
SCALEMAX1                                                     -1.0
SCALEMAX2                                                     -1.0
FORM_PTR                                                          
LABL_PTR_1                                                        
UNIT_PTR                                                          
fill                                                 -2147483648.0
SCALETYP                                                          
value_min                                                      0.0
value_max                                                   1000.0
SCALEMIN                                                       0.0
SCALEMAX                                                     200.0
plot_label                                        SW H Num Density
AVG_TYPE                                                          
notes            Np is the proton number density in units of cm...
units                                                         #/cc
long_name                                         SW H Num Density
children                                                      None

Expected behavior: Metadata should be transferred, not dropped.

Additional context: value_min1, etc. are part of another issue (though potentially linked), to be added to pysat. Some variables here have multiple additional value_min, value_max, scaletyp, etc. entries. Because pysat meta is stored as a DataFrame, these are automatically initialized to default values for all variables.
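The DataFrame behavior described here can be reproduced with pandas alone: when one row needs an extra column, every other row receives a placeholder in that column. A minimal sketch with made-up numbers (pandas leaves NaN in those slots, whereas the pysat output above shows defaults such as -1.0):

```python
import pandas as pd

# Toy metadata: one entry per variable. All numbers are invented for
# illustration and only 'Time_PB5' carries the extra 'value_min1' label.
per_variable = {
    "Np": {"value_min": 0.0, "value_max": 1000.0},
    "Vp": {"value_min": 0.0, "value_max": 2000.0},
    "Time_PB5": {"value_min": 0.0, "value_max": 10.0, "value_min1": 1.0},
}

# One row per variable, one column per label; columns are the union of labels.
meta = pd.DataFrame.from_dict(per_variable, orient="index")

# 'Np' and 'Vp' now have a 'value_min1' column too, filled with NaN.
print(meta)
```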

jklenzing commented 1 year ago

Note that the fix we implemented for similar issues elsewhere is already in the load_xarray routine:

https://github.com/pysat/pysatNASA/blob/9e0930c9d79caf0f7a0ec91b36cdd8ff91e645dc/pysatNASA/instruments/methods/cdaweb.py#L277-L281

jklenzing commented 1 year ago

@JonathonMSmith, if this is a straightforward fix, let's try to get this into 0.0.5. I'm worried that this may require fixes at pysat though, in which case we'll delay and transfer the issue if appropriate.

JonathonMSmith commented 1 year ago

If I run this code:

import pysat
swepam = pysat.Instrument('ace', 'swepam_l2', inst_id='1hr', tag='key')
swepam.load(2020, 1)

The variables "Time_PB5" and "Weight" are empty. This happens if I use the cdflib xarray interface, or vanilla cdflib.

These values are not in cdaweb, so it's fine that there's no data, but it's strange to me that they're even populated to begin with.

JonathonMSmith commented 1 year ago

I'd also like to note that ace won't load at all if "pandas_format" is true. It fails with an error from cdflib: ValueError: No records found for variable Time_PB5

I don't know if all of this is directly related, but I'm finding these quirks and want to note them.

JonathonMSmith commented 1 year ago

I don't know how to fix this, because as soon as you add a variable with more meta than the others, they all get this extra metadata added on. I see a couple of options.

  1. Drop any variable like 'Time_PB5' that contains no data (dim_empty)
  2. Add a new kwarg to the load routine called "drop_vars" similar to "drop_meta_labels" and use that to drop offending variables on a dataset-by-dataset basis
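Both options amount to removing variables from the loaded dataset before the metadata is built. A rough sketch on a toy xarray Dataset. The function `drop_unwanted_vars` and its `drop_vars` argument are hypothetical stand-ins for the proposed kwarg, not existing pysatNASA API:

```python
import numpy as np
import xarray as xr


def drop_unwanted_vars(data, drop_vars=None, drop_empty=True):
    """Remove named variables and, optionally, variables holding no data.

    Hypothetical helper sketching options 1 and 2; not part of pysatNASA.
    """
    to_drop = set(drop_vars or [])
    if drop_empty:
        # Option 1: a variable with a zero-length dimension has size 0.
        to_drop.update(name for name, var in data.data_vars.items()
                       if var.size == 0)
    # Option 2: drop whatever the caller listed for this dataset.
    return data.drop_vars([name for name in to_drop if name in data.data_vars])


# Toy dataset with invented values: 'Np' and 'Weight' have data,
# 'Time_PB5' sits on an empty dimension.
toy = xr.Dataset({
    "Np": ("Epoch", np.array([5.1, 4.8, 6.3])),
    "Weight": ("Epoch", np.array([1.0, 1.0, 1.0])),
    "Time_PB5": (("dim_empty", "dim3"), np.empty((0, 3))),
})

cleaned = drop_unwanted_vars(toy, drop_vars=["Weight"])
print(list(cleaned.data_vars))
```

Option 1 needs no per-dataset configuration, while option 2 leaves the decision to whoever defines the instrument.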

jklenzing commented 1 year ago

I think the extra meta issue will have to be solved by pysat. Looking over the max / min / fill, value_min and value_max are set correctly, but not fill. I think if we can sort that out we're good on this package.

JonathonMSmith commented 1 year ago

Ohh, I think I completely misunderstood this issue

jklenzing commented 1 year ago

If I change 'fill_val' to 'fill' on Line 281 above, this fixes the problem. I've found an even deeper issue: the multi-dimensions are for storing time values (year, day, something?). Probably need to drop these and convert to pandas.
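Why a one-word label change matters can be illustrated with a plain dictionary: if the value is stored under a label the metadata object does not use, the default stays in place. This is a toy sketch only. The dictionaries are stand-ins, not pysat's Meta class or the code at the linked lines:

```python
import numpy as np

# FILLVAL as reported by cdflib for 'Np' in the issue description.
cdf_attrs = {"FILLVAL": np.array([-1.0e31], dtype=np.float32)}


def translate(attrs, fill_label):
    """Copy FILLVAL into a toy metadata dict under the given label."""
    # Stand-in default; the pysat output in the issue shows the int32
    # minimum (-2147483648.0) in the 'fill' slot.
    meta = {"fill": float(np.iinfo(np.int32).min)}
    meta[fill_label] = float(attrs["FILLVAL"][0])
    return meta


wrong = translate(cdf_attrs, "fill_val")  # 'fill' keeps its default
right = translate(cdf_attrs, "fill")      # 'fill' receives the CDF value
```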

jklenzing commented 1 year ago

I've got a fix set up. Running the full suite of tests now.

jklenzing commented 1 year ago

@JonathonMSmith wrote: "Ohh, I think I completely misunderstood this issue"

That's probably expected the way I wrote it up. There are 2 issues buried up there.