Closed ErikPGJ closed 1 month ago
Possibly the CDF update should have been a separate issue, but it is related.
@ilona-irf Note that requirements can come from both SOC and ROC. They can be different but must be consistent and I assume that SOC overrides ROC by default.
Just adding a short comment here: The CDF lib included in irfu-matlab (latest devel/MMSdevel branches) is the latest CDF patch for Matlab (v3.9.0) released by NASA SPDF; the master branch of irfu-matlab still uses v3.8.1. (I know that NASA has a v3.9.1 in the works, but it is not yet officially released, so I have not included it in irfu-matlab.)
Note that there is a GitLab account for BICAS specifically at ROC too, in addition to irfu-matlab. It contains issues which should be related to this one (possibly subsets of this one): https://gitlab.obspm.fr/ROC/RCS/BICAS/-/issues/47 https://gitlab.obspm.fr/ROC/RCS/BICAS/-/issues/84 https://gitlab.obspm.fr/ROC/RCS/BICAS/-/issues/85
Issue 47 is from Feb 2021 but is still open. Unclear if it can be closed.
@ilona-irf Also note that the BICAS metadata in descriptor.json contains a reference to the RCS ICD version which BICAS officially supports. Not sure if the current value is correct as it is, or how much it matters, but then this variable should be updated too. I doubt that there is anything to do on the actual BICAS interface itself (2) but one never knows (I have not heard anything for a long time, and ROC has not complained while running recent BICAS versions).
The RCS ICD version is set in bicas.const,
MAP('SWD.identification.icd_version') = '1.4';
which is automatically passed on to descriptor.json by generating it using bicas.main('--swdescriptor').
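A quick way to confirm that the generated descriptor.json carries the intended ICD version could look like the sketch below. It is only an illustration: the JSON layout (an "identification" object containing "icd_version") is an assumption inferred from the key 'SWD.identification.icd_version' above and has not been checked against an actual BICAS descriptor, and the function names are hypothetical.

```python
import json


def read_icd_version(descriptor_text):
    """Return the ICD version string found in descriptor JSON text, or None.

    ASSUMPTION: the descriptor looks like
    {"identification": {"icd_version": "..."}}, inferred from the key
    'SWD.identification.icd_version'. Not verified against BICAS output.
    """
    descriptor = json.loads(descriptor_text)
    return descriptor.get("identification", {}).get("icd_version")


def icd_version_matches(descriptor_text, expected_version):
    """Return True if the descriptor declares exactly the expected version."""
    return read_icd_version(descriptor_text) == expected_version


# Hypothetical, minimal descriptor content used only for illustration.
EXAMPLE_DESCRIPTOR = '{"identification": {"icd_version": "1.4"}}'

if __name__ == "__main__":
    print(read_icd_version(EXAMPLE_DESCRIPTOR))
    print(icd_version_matches(EXAMPLE_DESCRIPTOR, "1.7"))
```

In practice one would pass the text of the descriptor.json generated by bicas.main('--swdescriptor') instead of the example string.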
Note: The RCS ICD covers both (1) the official BICAS interface (the "BICAS API") and (2) some dataset metadata (consortium-specific conventions?).
@ilona-irf Also, after updating the dataset skeletons proper, you might need to update the MODS global attribute with information on what was updated for the datasets. MODS is not set in the skeletons but in BICAS, via the data structure built in bicas.const.init_GA_MODS_DB(). It builds a data structure using objects. There is a system in place for how to use functions and pre-defined constants to avoid hardcoded duplication, even if the data structure contains duplicated data (same partial update for multiple datasets).
MODS cannot be set in the skeletons since MODS also contains (mostly) information on updates to the processing, which can be updated independently of the skeletons. Not sure how much pure dumb skeleton information should be mentioned there, but some might belong there, at least for "big" changes like removing/adding/renaming zVariables.
ROC now wants us to use CDF compression (i.e. compression as part of the CDF format itself). Xavier Bonnin mentions this in the two LESIA GitLab issues mentioned above:
NOTE:
CDF variable compression is now possible and encouraged (use VAR_COMPRESSION: GZIP.6 in CDF skeleton)
CDF 3.9.0 will be used to generate RPW science data files
MODS global attribute shall be set as defined in the RCS ICD 1.6
I have not found an explicit mention of CDF compression being allowed or disallowed in the documents where I would expect it (have only looked quickly though):
This should be implemented by adding/setting the relevant flag for the CDF-writing library when it is called by BICAS. Note that CDF compression can be enabled/disabled separately for every zVariable, as well as (I think) for the CDF as a whole.
Skeleton files (.skt) say things like below,
! VAR_COMPRESSION: None
! (Valid compression: None, GZIP.1-9, RLE.0, HUFF.0, AHUFF.0)
but it seems that the valid values should be interpreted as GZIP.1 to GZIP.9 etc., depending on the degree of desired compression. It seems "GZIP.1-9" is not a valid value, though skt2cdf.sh will not give an error for it.
Note that Xavier Bonnin specifically mentions "GZIP.6".
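Since skt2cdf.sh apparently accepts "GZIP.1-9" silently, a small pre-check on the skeleton text could catch it before conversion. The sketch below is only an illustration under assumptions: it treats the set of valid values as exactly the ones listed in the skeleton comment (None, GZIP.1 to GZIP.9, RLE.0, HUFF.0, AHUFF.0), matches them case-insensitively, and assumes each setting sits on its own "! VAR_COMPRESSION: ..." line as in the excerpt above. The function name is hypothetical.

```python
import re

# Valid values as interpreted above: "GZIP.1-9" means GZIP.1 ... GZIP.9.
# ASSUMPTION: matching is case-insensitive; the CDF tools may be stricter.
VALID_VALUE = re.compile(r"^(None|GZIP\.[1-9]|RLE\.0|HUFF\.0|AHUFF\.0)$",
                         re.IGNORECASE)

# Matches lines such as "! VAR_COMPRESSION: GZIP.6" and captures the value.
SETTING_LINE = re.compile(r"^\s*!\s*VAR_COMPRESSION:\s*(\S+)\s*$")


def find_invalid_compression(skeleton_text):
    """Return (line_number, value) pairs for unrecognized VAR_COMPRESSION values."""
    invalid = []
    for line_number, line in enumerate(skeleton_text.splitlines(), start=1):
        match = SETTING_LINE.match(line)
        if match and not VALID_VALUE.match(match.group(1)):
            invalid.append((line_number, match.group(1)))
    return invalid


# Hypothetical skeleton excerpt used only for illustration.
EXAMPLE_SKT = """\
! VAR_COMPRESSION: None
! (Valid compression: None, GZIP.1-9, RLE.0, HUFF.0, AHUFF.0)
! VAR_COMPRESSION: GZIP.1-9
! VAR_COMPRESSION: GZIP.6
"""

if __name__ == "__main__":
    for line_number, value in find_invalid_compression(EXAMPLE_SKT):
        print(f"line {line_number}: invalid VAR_COMPRESSION value {value!r}")
```

The "(Valid compression: ...)" comment line is deliberately ignored, since it does not start with the VAR_COMPRESSION keyword.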
FYI, I have implemented support for zVariable compression (not "entire-file" compression, the other CDF compression feature) in CDFs in BICAS. The information (compress/not compress) comes from the skeleton/master file as described above. I have tested it on one skeleton.
FYI, that update is on SOdevel. BICAS development is always on SOdevel.
32e461134 Erik P G Johansson (2024-07-02 16:13:05 +0200) (HEAD -> SOdevel, origin/SOdevel) irf.cdf.write_dataobj(): Support variable compression
Footnote: I use lists for different categories of dataset IDs, which I can then use for automating (bash etc.) tasks relating to datasets, e.g. skeletons.
RODP (=inflight) dataset IDs: RODP_BICAS_dataset_IDs.zip
Note that this includes:
@ilona-irf If you are interested in scripts, then interactive_replace (bash script for interactive string replacement) and so_find_* (bash functions defined in init_aliases_functions; for "globbing" using lists of SolO dataset IDs) are relevant.
I have a copy of my bash scripts (and a small number of python scripts) at brain: /home/erjo/bin/global/.
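For anyone without access to those bash functions, the idea of "globbing" by a list of dataset IDs can be sketched roughly as below. This is not how so_find_* is implemented (I have not reproduced it here); it is an illustration that assumes filenames begin with the dataset ID, compared case-insensitively, followed by an underscore. The function name and the example filenames are hypothetical.

```python
def select_by_dataset_id(filenames, dataset_ids):
    """Return the filenames that belong to any of the given dataset IDs.

    ASSUMPTION: a file belongs to a dataset if its name starts with the
    dataset ID (case-insensitive) immediately followed by an underscore.
    """
    wanted_prefixes = tuple(dataset_id.lower() + "_" for dataset_id in dataset_ids)
    if not wanted_prefixes:
        return []
    return [name for name in filenames if name.lower().startswith(wanted_prefixes)]


# Hypothetical filenames used only for illustration.
EXAMPLE_FILES = [
    "solo_L2_rpw-lfr-surv-swf-e_20240101_V01.cdf",
    "solo_L2_rpw-lfr-surv-cwf-e_20240101_V01.cdf",
    "notes.txt",
]

if __name__ == "__main__":
    print(select_by_dataset_id(EXAMPLE_FILES, ["SOLO_L2_RPW-LFR-SURV-SWF-E"]))
```

Requiring the trailing underscore keeps an ID from also matching longer IDs that merely share the same prefix.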
7d5fedbbf Erik P G Johansson (2024-07-04 13:23:45 +0200) (HEAD -> SOdevel, origin/SOdevel) Compliance-fix: GAs TIME_MIN, TIME_MAX: Change Julian date-->"ISO"
fixes the TIME_MIN/TIME_MAX issue.
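For reference, the kind of conversion involved (Julian date to an ISO-style UTC string) can be sketched as below. This is not the BICAS implementation, which is in MATLAB; the exact output format required for TIME_MIN/TIME_MAX is an assumption here (second resolution with a trailing "Z"), and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Julian date of the Unix epoch, 1970-01-01T00:00:00 UTC.
UNIX_EPOCH_JD = 2440587.5


def julian_date_to_iso(julian_date):
    """Convert a Julian date (days, UTC assumed) to 'YYYY-MM-DDTHH:MM:SSZ'.

    ASSUMPTION: second resolution and a trailing 'Z' are acceptable; the
    format actually required for the GAs may differ. Fractions of a second
    are truncated, not rounded.
    """
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    moment = epoch + timedelta(days=julian_date - UNIX_EPOCH_JD)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    # JD 2451545.0 is the J2000.0 reference epoch (noon on 2000-01-01).
    print(julian_date_to_iso(2451545.0))
```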
Change "Spaceraft" --> "Spacecraft" ...
Thanks a lot.
[SOdevel 49ff754fd] Update Software_name GA to use identificatior.identifier field from the descriptor (ICD 1.7)
Will add one entry to MODS which should refer to all updates made which are related to this issue, similar to the entry I added in SKELETON_MODS (not committed yet):
13: CDF_CHAR { "V15: Jul 2024 : Update to make compliant with SOL-SGS-TN-0009 i2.6 and ROC-PRO-PIP-ICD-00037-LES i1.7, removal of some unused optional attributes. - I.Benko (IRF)" } .
I've started working on Skeleton version 15 in DataPool (for all our datasets on branch bia_tmp). Finished fixing all GAs for dataset SOLO_L2_RPW-LFR-SURV-SWF-E https://gitlab.obspm.fr/ROC/RCS/BICAS/-/issues/84
Will convert to CDF, validate, test processing and then commit once I've finished fixing all the zVars, then repeat for the remaining 5 L2 datasets. Let me know if it's needed for any L3 ones.
(Not noting all the minor details here; will ping once the commit of the final skeleton/master CDF is in ROC's DataPool. Trying to minimize the number of commits in DataPool.)
Bigger things about GA directly related to BICAS:
(There were many other updates made in GAs, but) the things I'm considering pointing out to Xavier in the original GitLab issue are mistakes which were not detected by the validator used to generate the report from SOAR (attached by Xavier in the original GitLab issue):
In the past couple of days, other people have made submissions compatible with CDF 3.9, so it seems like we should too(?). The note in the original GitLab issue also says "CDF 3.9.0 will be used to generate RPW science data files", so it seems they are (kind of) asking for it formally.
@ilona-irf I have updated BICAS in such a way that
bicas.proc.L1L2.cal.rct.RctData
).CAL_*
and Data_version
.
[]
) is used for GAs when not found in the RCT. See the constructor in bicas.proc.L1L2.cal.rct.RctData
.derive_output_dataset_GAs.
Using this, it should be easy to modify derive_output_dataset_GAs() such that science datasets contain the corresponding GAs from RCTs via OutputDataset.RctdCa.
dbd01e1c2 Erik P G Johansson (2024-07-10 17:51:22 +0200) (HEAD -> SOdevel, origin/SOdevel) RctData: Read CAL_* GAs when present (no error on fail)
This issue should be mostly resolved for L2 by BICAS 8.2.0.
Previously mentioned issues 84 & 85 at LESIA have been closed as of 2024-10-01. Issue 47 (at LESIA; L3) is still open:
https://gitlab.obspm.fr/ROC/RCS/BICAS/-/issues/47 https://gitlab.obspm.fr/ROC/RCS/BICAS/-/issues/84 https://gitlab.obspm.fr/ROC/RCS/BICAS/-/issues/85
Issue 47 is from Feb 2021 but is still open. Unclear if it can be closed.
BICAS v8.3.0 further updates L2 and L3 skeletons w.r.t.
For L2, this issue should be resolved, in the sense that all requested changes have been implemented and delivered to ROC, and we have not received any complaints back. The only thing to possibly wait for AFAIK is the SOC review of RPW L2 datasets, which I assume ROC will handle.
AFAIK, there should be a later SOC review of RPW L3 datasets (which I assume ROC will similarly handle), which has thus already been at least partly addressed.
Closing issue. Suggesting the creation of a dedicated issue for future L3 skeleton/metadata updates if/when needed.
ROC & SOC want to update datasets generated by BICAS w.r.t. metadata standards.
Exactly what this entails is a bit uncertain to me at the time of writing, but I assume it should at least include the items below:
e-mail, Xavier Bonnin, 2024-04-24:
Update to use CDF version 3.9.
Note: Xavier Bonnin/ROC offers a validation script which one can use: "check_rpw_cdf.py". See e-mail, Xavier Bonnin, 2024-06-12: "[roc.rcs] New release of check_rpw_cdf.py script"