Open juzen2003 opened 1 year ago
Modifications made in https://github.com/SETI/rms-hst-pipeline/pull/76:
- Rename hst:moving_target_keyword to hst:moving_target_keywords
- Remove nilReason in hst:gain_setting tag (attribute is not allowed)
- Remove unit in hst:plate_scale tag (attribute is not allowed)
- Remove nilReason in hst:spectral_resolution tag (attribute is not allowed)
- Remove nilReason in hst:center_filter_wavelength tag (attribute is not allowed)
- Remove nilReason in hst:bandwidth tag (attribute is not allowed)
- Add unit="byte" to record_length, field_location, and field_location tags
- Change hst:observation_type from SPECTROSCOPIC back to SPECTROGRAPHIC
- Change Array_1D to Array
- Change File_Area to File_Area_Ancillary tag if processing level is "Ancillary"
- Put logical_identifier tag and its content on one line to avoid the validator complaining about unexpected carriage returns
Issues that need to be reviewed and discussed:
- hst:gain_setting, hst:spectral_resolution, hst:center_filter_wavelength, and hst:bandwidth values need to match this pattern: r'(\+|-)?([0-9]+(\.[0-9]*)?|\.[0-9]+)([Ee](\+|-)?[0-9]+)?|[^aFIN,]* '. What should we put if a value is not applicable? Remove the tag? (An empty string is not a valid value.)
- hst:visit_id & hst:Processing_Parameters
- hst:observation_type must exist, and the validator requires its value to be one of 'IMAGING', 'SPECTROGRAPHIC', 'TIME-SERIES', so 'UNK' will fail the validator.
- The validator expects one of: {Array, Array_2D, Array_2D_Image, Array_2D_Map, Array_2D_Spectrum, Array_3D, Array_3D_Image, Array_3D_Movie, Array_3D_Spectrum, Checksum_Manifest, Encoded_Header, Encoded_Image, Header, Stream_Text, Table_Binary, Table_Character, Table_Delimited}. Should we modify DATA_CLASS_TO_NOUN in hdu_data_descriptions.py? What value should we put for Array_1D_Spectrum?
Current changes: (updates after 1/19/24 meeting)
PRODUCT_LABEL.xml:
- Rename hst:moving_target_keyword to hst:moving_target_keywords
- Add unit="byte" to record_length, field_location, and field_location tags
hst_dictionary_support.py:
- Change hst:observation_type value SPECTROSCOPIC to SPECTROGRAPHIC
- Change File_Area to File_Area_Ancillary tag if processing level is "Ancillary"
- Put logical_identifier tag and its content on one line to avoid the validator complaining about unexpected carriage returns
Pending items:
Update on 5/23/24:
Open issues related to label errors raised by the validator with the new dictionary PDS4_HST_1H00_1000:
1) unit attribute (value mrad/pixel) for hst:plate_scale. (Currently the code works around this by removing the unit attribute, as discussed in the 5/4/24 HST Slack chat; we will put it back later with the updated dictionary.)
2) Use the Array_1D tag for the Array_1D_Spectrum tag, or update the dictionary to allow the Array_1D_Spectrum tag?
Dropbox/Shared-pdart/bundles_from_hst_pipeline/hst_05167/hst_05167-deliverable/miscellaneous_ghrs_shf/visit_01/z2no0101t_shf.xml
<Array_1D_Spectrum>
  <name>Primary FITS data object</name>
  <local_identifier>fits_data_object_0</local_identifier>
  <offset unit="byte">23040</offset>
  <axes>1</axes>
  <axis_index_order>Last Index Fastest</axis_index_order>
  <description>
    Primary FITS data object: Standard header packet for this GHRS/D2 observation.
  </description>
  <Element_Array>
    <data_type>SignedMSB2</data_type>
  </Element_Array>
  <Axis_Array>
    <axis_name>Sample</axis_name>
    <elements>965</elements>
    <sequence_number>1</sequence_number>
  </Axis_Array>
</Array_1D_Spectrum>
ERROR [error.label.schema] line 226, 24: cvc-complex-type.2.4.a: Invalid content was found starting with element '{"http://pds.nasa.gov/pds4/pds/v1":Array_1D_Spectrum}'. One of '{"http://pds.nasa.gov/pds4/pds/v1":Array, "http://pds.nasa.gov/pds4/pds/v1":Array_1D, "http://pds.nasa.gov/pds4/pds/v1":Array_2D, "http://pds.nasa.gov/pds4/pds/v1":Array_2D_Image, "http://pds.nasa.gov/pds4/pds/v1":Array_2D_Map, "http://pds.nasa.gov/pds4/pds/v1":Array_2D_Spectrum, "http://pds.nasa.gov/pds4/pds/v1":Array_3D, "http://pds.nasa.gov/pds4/pds/v1":Array_3D_Image, "http://pds.nasa.gov/pds4/pds/v1":Array_3D_Movie, "http://pds.nasa.gov/pds4/pds/v1":Array_3D_Spectrum, "http://pds.nasa.gov/pds4/pds/v1":Checksum_Manifest, "http://pds.nasa.gov/pds4/pds/v1":Encoded_Audio, "http://pds.nasa.gov/pds4/pds/v1":Encoded_Header, "http://pds.nasa.gov/pds4/pds/v1":Encoded_Image, "http://pds.nasa.gov/pds4/pds/v1":Header, "http://pds.nasa.gov/pds4/pds/v1":Stream_Text, "http://pds.nasa.gov/pds4/pds/v1":Table_Binary, "http://pds.nasa.gov/pds4/pds/v1":Table_Character, "http://pds.nasa.gov/pds4/pds/v1":Table_Delimited}' is expected.
3) Input required: Some values of the title tag under the Identification_Area tag are too long. We need to either modify PRODUCT_LABEL.xml or increase the max length of the value for the title tag.
Reference label: Dropbox/Shared-pdart/bundles_from_hst_pipeline/hst_05167/hst_05167-deliverable/miscellaneous_ghrs_shf/visit_01/z2no0101t_shf.xml
<Identification_Area>
  <logical_identifier>urn:nasa:pds:hst_5167:miscellaneous_ghrs_shf:z2no0101t</logical_identifier>
  <version_id>1.0</version_id>
  <title>
    z2no0101t_shf.fits: Standard header packet file, containing observation parameters,
    for this GHRS/D2 observation from HST Program 5167.
    Note that observation "z2no0101t" did not obtain science data. Only ancillary data
    files documenting this activity are available.
  </title>
  ...
ERROR [error.label.schema] line 23, 12: cvc-maxLength-valid: Value 'z2no0101t_shf.fits: Standard header packet file, containing observation parameters, for this GHRS/D2 observation from HST Program 5167. Note that observation "z2no0101t" did not obtain science data. Only ancillary data files documenting this activity are available.' with length = '265' is not facet-valid with respect to maxLength '255' for type 'title'.
ERROR [error.label.schema] line 23, 12: cvc-type.3.1.3: The value 'z2no0101t_shf.fits: Standard header packet file, containing observation parameters, for this GHRS/D2 observation from HST Program 5167. Note that observation "z2no0101t" did not obtain science data. Only ancillary data files documenting this activity are available.' of element 'title' is not valid.
4) In the Observing_System tag, should we wrap the name tag with Observing_System_Component? (But if it is wrapped by Observing_System_Component, the validator will expect more tags, such as type, to be added as well, which would be weird in this case.)
Dropbox/Shared-pdart/bundles_from_hst_pipeline/hst_05167/hst_05167-deliverable/bundle.xml
<Observing_System>
  <name>Hubble Space Telescope Goddard High Resolution Spectrograph</name>
  <Observing_System_Component>
    <name>Hubble Space Telescope</name>
    <type>Host</type>
    <Internal_Reference>
      <lid_reference>urn:nasa:pds:context:instrument_host:spacecraft.hst</lid_reference>
      <reference_type>is_instrument_host</reference_type>
    </Internal_Reference>
  </Observing_System_Component>
  <Observing_System_Component>
    <name>Goddard High Resolution Spectrograph</name>
    <type>Instrument</type>
    <Internal_Reference>
      <lid_reference>urn:nasa:pds:context:instrument:hst.ghrs</lid_reference>
      <reference_type>is_instrument</reference_type>
    </Internal_Reference>
  </Observing_System_Component>
  <name>Hubble Space Telescope Wide Field and Planetary Camera 2</name>
  <Observing_System_Component>
    <name>Hubble Space Telescope</name>
    <type>Host</type>
    <Internal_Reference>
      <lid_reference>urn:nasa:pds:context:instrument_host:spacecraft.hst</lid_reference>
      <reference_type>is_instrument_host</reference_type>
    </Internal_Reference>
  </Observing_System_Component>
  ...
ERROR [error.label.schema] line 78, 19: cvc-complex-type.2.4.a: Invalid content was found starting with element '{"http://pds.nasa.gov/pds4/pds/v1":name}'. One of '{"http://pds.nasa.gov/pds4/pds/v1":Observing_System_Component}' is expected.
5) hst:moving_target_description is too long. Do we need to update the dictionary to increase the max length of the value for the hst:moving_target_description tag?
Dropbox/Shared-pdart/bundles_from_hst_pipeline/hst_16310/hst_16310-deliverable/miscellaneous_wfc3_jit/visit_01/ieab01fwj_jit.xml
<hst:Pointing_Parameters>
  <hst:hst_target_name>2I-BORISOV</hst:hst_target_name>
  <hst:moving_target_flag>true</hst:moving_target_flag>
  <hst:moving_target_keyword>COMET</hst:moving_target_keyword>
  <hst:moving_target_keyword>interstellar comet</hst:moving_target_keyword>
  <hst:moving_target_description>
    TYPE=COMET, Q=2.006581893840375, E=3.356215101434632, I=44.05257068647377,
    O=308.1487262895379, W=209.12367864468, T=08-DEC-2019:13:04:54,
    TTimeScale=TDB, EQUINOX=J2000, EPOCH=01-AUG-2020:00:00:00, EpochTimeScale=TDB,
    R0=2.808, DT=87.2916, A1=7.093444347382E-8, A2=-1.443811535835E-8,
    A3=6.534734368324E-10, ALN=0.1112620426, NM=2.15, NN=
  </hst:moving_target_description>
  ...
ERROR [error.label.schema] line 140, 42: cvc-maxLength-valid: Value 'TYPE=COMET, Q=2.006581893840375, E=3.356215101434632, I=44.05257068647377,O=308.1487262895379, W=209.12367864468, T=08-DEC-2019:13:04:54, TTimeScale=TDB, EQUINOX=J2000, EPOCH=01-AUG-2020:00:00:00, EpochTimeScale=TDB, R0=2.808, DT=87.2916, A1=7.093444347382E-8, A2=-1.443811535835E-8, A3=6.534734368324E-10, ALN=0.1112620426, NM=2.15, NN=' with length = '337' is not facet-valid with respect to maxLength '255' for type 'ASCII_Short_String_Collapsed'.
ERROR [error.label.schema] line 140, 42: cvc-complex-type.2.2: Element 'hst:moving_target_description' must have no element [children], and the value must be valid.
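The two maxLength errors above could be caught before running the validator. The following is a minimal Python sketch, assuming the validator measures length after trimming the value and collapsing each run of whitespace to a single space, as the values quoted in the error messages suggest. The helper names `collapsed` and `exceeds_limit` are hypothetical and are not part of the pipeline.

```python
MAX_LEN = 255  # maxLength quoted in the validator errors above


def collapsed(text):
    """Trim the text and collapse each whitespace run to a single space."""
    return " ".join(text.split())


def exceeds_limit(text, limit=MAX_LEN):
    """Return True if the collapsed text is longer than the limit."""
    return len(collapsed(text)) > limit


# Title copied from the z2no0101t_shf.xml label quoted above.
title = """
z2no0101t_shf.fits: Standard header packet file, containing observation parameters,
for this GHRS/D2 observation from HST Program 5167.
Note that observation "z2no0101t" did not obtain science data. Only ancillary data
files documenting this activity are available.
"""

print(len(collapsed(title)), exceeds_limit(title))  # 265 True
```

The collapsed title comes out at 265 characters, which matches the length the validator reported.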
1) I just submitted an SCR (change request) to the DDWG asking for mrad/pixel to be added to the dictionary. In the meantime, I agree with the workaround articulated by @juzen2003.
2) The Slack thread on 5/10/24 ended with me asking @markshowalter if there is any reason why we shouldn’t simply use Array_1D. He has not responded. I’ll put it on the agenda for our meeting this coming Tuesday.
3) Identification_Area is part of the core model, which we cannot modify unless we file a change request with the DDWG. Furthermore, it seems quite reasonable that a title should be limited to 255 characters. @markshowalter: Can some of the information in this title be moved to a comment?
4) This label snippet indeed seems faulty, and I think it's good that Validator has caught it. Any attribute name must be inside the class (in this case, Observing_System_Component) of which it is giving the name! In this case, I notice that:
a) <name>Hubble Space Telescope Goddard High Resolution Spectrograph</name> is outside the instance of Observing_System_Component that contains the LID for that instrument. It should be inside the class, but there is already a duplicate name inside that class, so decide which one to use and discard the other one.
b) <name>Hubble Space Telescope Wide Field and Planetary Camera 2</name> is also outside any instance of Observing_System_Component, but in fact there is no instance that includes the LID for WFPC2. There should be one if that instrument is relevant to this data product, or perhaps it was included extraneously.
c) There are two apparently duplicate instances of Observing_System_Component that include the LID for the entire spacecraft HST. There should be only one, unless there is a reason that I'm missing.
We need input from @markshowalter on how to address this, so I'll bring it up at our meeting on Tuesday.
5) Hmm, I don't see anything in the dictionary that explicitly limits the length to 255 characters, unless it is the value_data_type of ASCII_Short_String_Collapsed. Anyway, this is part of the HST dictionary, which is under our control, so we have freedom to adjust as seems good. Let's consult Tuesday on this one also.
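Two of the problems described under 4) can be checked mechanically. The sketch below is only an illustration, assuming the label uses the PDS4 namespace shown in the validator errors. It reports a name element that appears directly under Observing_System after a component has already started (the position the validator error points at), and any LID that is referenced more than once. The function name `check_observing_system` and the sample label are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace quoted in the validator error messages.
PDS = "{http://pds.nasa.gov/pds4/pds/v1}"


def check_observing_system(obs_sys):
    """Return a list of problem descriptions for one Observing_System element."""
    problems = []
    seen_component = False
    for child in obs_sys:  # direct children only
        if child.tag == PDS + "Observing_System_Component":
            seen_component = True
        elif child.tag == PDS + "name" and seen_component:
            problems.append(f"stray name after a component: {child.text!r}")
    # Count every lid_reference anywhere below Observing_System.
    lids = Counter(
        ref.text.strip() for ref in obs_sys.iter(PDS + "lid_reference") if ref.text
    )
    for lid, count in lids.items():
        if count > 1:
            problems.append(f"LID referenced {count} times: {lid}")
    return problems


# Reduced, made-up label with the same two defects as the bundle.xml snippet.
SAMPLE = """
<Observing_System xmlns="http://pds.nasa.gov/pds4/pds/v1">
  <name>System name</name>
  <Observing_System_Component>
    <name>Hubble Space Telescope</name>
    <Internal_Reference>
      <lid_reference>urn:nasa:pds:context:instrument_host:spacecraft.hst</lid_reference>
    </Internal_Reference>
  </Observing_System_Component>
  <name>Second system name</name>
  <Observing_System_Component>
    <name>Hubble Space Telescope</name>
    <Internal_Reference>
      <lid_reference>urn:nasa:pds:context:instrument_host:spacecraft.hst</lid_reference>
    </Internal_Reference>
  </Observing_System_Component>
</Observing_System>
"""

problems = check_observing_system(ET.fromstring(SAMPLE.strip()))
for problem in problems:
    print(problem)
```

A missing component for an instrument, as in b), is not detected here, because that requires knowing which instruments the product is supposed to reference.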
Following up:
1) No change from previous answer. In the best case it will be six months before this is fixed. Current text will eventually be okay, but in the meantime Dave to remove unit attribute.
2) Mark gave a good reason, and I have started sounding out the DDWG about creating Array_1D_Spectrum. The change may or may not go through, and the best case is that it will be six months before it's fixed. In the meantime, let's change it to Array_1D and see whether it validates. Current text may or may not eventually be okay, but in the meantime Dave to update.
3) Mark says that, when there is a "note" that is part of the title, that text should go into a comment instead of being part of the title. This should keep our titles below 255 characters, which is a limit we cannot change. @mace-space suggests that <Context_Area>.<comment> is the right place for this text. Dave to fix.
4) The label here is garbled, and needs a software solution. I thought it was weird that there are two instruments (GHRS and WFPC2) mentioned in one label, but I've just realized that the example is a bundle.xml file, so it might be that data from both instruments are in the bundle so that both instruments really do belong. In any case, there should be only one instance of Observing_System_Component for the spacecraft itself (I see two) and there should be one instance of Observing_System_Component for each instrument. All instances of name should be inside the Observing_System_Component class. Dave to fix.
5) @mit3ch confirms that this is a fix that I can make. I just need to change the value_data_type for hst:moving_target_description from ASCII_Short_String_Collapsed to ASCII_String. I've now done that, so this problem should disappear. Matt has fixed.
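For follow-up 3), one way to separate the note from the title is sketched below in Python. It assumes the note always starts with the phrase "Note that", which is taken from the single example label in this thread and may not hold for every title. The function name `split_title_note` is hypothetical, and this is not the pipeline's actual implementation.

```python
def split_title_note(title, marker="Note that"):
    """Split a title into (short_title, note) at the first marker.

    The note is an empty string when the marker is absent.
    """
    text = " ".join(title.split())  # collapse whitespace first
    index = text.find(marker)
    if index == -1:
        return text, ""
    return text[:index].rstrip(), text[index:]


# Title copied from the z2no0101t_shf.xml label quoted earlier in this thread.
title = """
z2no0101t_shf.fits: Standard header packet file, containing observation parameters,
for this GHRS/D2 observation from HST Program 5167.
Note that observation "z2no0101t" did not obtain science data. Only ancillary data
files documenting this activity are available.
"""

short_title, note = split_title_note(title)
print(len(short_title))  # 135
print(note)
```

With the note removed, the example title drops from 265 to 135 characters, and the note text is available to be written to <Context_Area>.<comment>.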
Updates implemented in https://github.com/SETI/rms-hst-pipeline/pull/76 based on previous comments (06/17/24):
1) Remove the unit attribute from hst:plate_scale.
2) Use Array_1D for hdu data class Array_1D_Spectrum for now; will change it back to Array_1D_Spectrum once the DDWG has an update.
3) Move the note in the <title> tag content under the <Identification_Area> tag to <Context_Area>.<comment>.
4) Update the <Observing_System_Component> tag under <Observing_System> in the bundle, product collection, and product label templates. Now there is one instance of Observing_System_Component for the spacecraft itself (host) and one instance of Observing_System_Component for each instrument.
browse_nicmos_raw/visit_01/n4wl01xqq_raw.xml in 7885 as an example: