Cannot read OMM from Celestrak

astrojuanlu commented 3 years ago

I tried downloading https://raw.githubusercontent.com/egemenimre/ccsds-ndm/main/ccsds_ndm/tests/data/ndmxml-1.0-omm-2.0.xml and reading it, successfully:

>>> import ccsds_ndm
>>> ccsds_ndm.__version__
'1.1'
>>> from ccsds_ndm.ndm_io import NdmIo
>>> from pathlib import Path
>>> NdmIo().from_path(Path("ndmxml-1.0-omm-2.0.xml"))
Omm(header=NdmHeader(comment=['THIS EXAMPLE CONFORMS TO FIGURE 4-2 IN 502.0-B-2'], creation_date='2007-065T16:00:00', originator='NOAA/USA'), body=OmmBody(segment=OmmSegment(metadata=OmmMetadata(comment=[], object_name='GOES-9', object_id='1995-025A', center_name='EARTH', ref_frame='TEME', ref_frame_epoch=None, time_system='UTC', mean_element_theory='TLE'), data=OmmData(comment=['USAF SGP4 IS THE ONLY PROPAGATOR THAT SHOULD BE USED FOR THIS DATA'], mean_elements=MeanElementsType(comment=[], epoch='2007-064T10:34:41.4264', semi_major_axis=None, mean_motion=RevType(value=Decimal('1.00273272'), units=None), eccentricity=Decimal('0.0005013'), inclination=InclinationType(value=Decimal('3.0539'), units=None), ra_of_asc_node=AngleType(value=Decimal('81.7939'), units=None), arg_of_pericenter=AngleType(value=Decimal('249.2363'), units=None), mean_anomaly=AngleType(value=Decimal('150.1602'), units=None), gm=GmType(value=Decimal('398600.8'), units=None)), spacecraft_parameters=None, tle_parameters=TleParametersType(comment=[], ephemeris_type=None, classification_type=None, norad_cat_id=23581, element_set_no=925, rev_at_epoch=4316, bstar=BStarType(value=Decimal('0.0001'), units=None), mean_motion_dot=DRevType(value=Decimal('-0.00000113'), units=None), mean_motion_ddot=DdRevType(value=Decimal('0.0'), units=None)), covariance_matrix=None, user_defined_parameters=UserDefinedType(comment=[], user_defined=[UserDefinedParameterType(value='xyz', parameter='ABC0'), UserDefinedParameterType(value='9', parameter='ABC1'), UserDefinedParameterType(value='xyz', parameter='ABC2'), UserDefinedParameterType(value='9', parameter='ABC3'), UserDefinedParameterType(value='xyz', parameter='ABC4'), UserDefinedParameterType(value='9', parameter='ABC5'), UserDefinedParameterType(value='xyz', parameter='ABC6'), UserDefinedParameterType(value='9', parameter='ABC7'), UserDefinedParameterType(value='xyz', parameter='ABC8'), UserDefinedParameterType(value='9', parameter='ABC9')])))), id='CCSDS_OMM_VERS', 
version='2.0')

However, it doesn't seem to work for https://celestrak.com/NORAD/elements/gp.php?CATNR=45018&FORMAT=XML:

$ wget "https://celestrak.com/NORAD/elements/gp.php?CATNR=45018&FORMAT=XML" -O celestrak_omm.xml -q
$ python -q
>>> from ccsds_ndm.ndm_io import NdmIo
>>> from pathlib import Path
>>> NdmIo().from_path(Path("celestrak_omm.xml"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/juanlu/.pyenv/versions/poliastro38/lib/python3.8/site-packages/ccsds_ndm/ndm_io.py", line 96, in from_path
    data_type = _NdmDataType.find_element(root.attrib.get("id")).clazz
AttributeError: 'NoneType' object has no attribute 'clazz'

egemenimre commented 3 years ago

This is interesting. The xml file cannot be opened by Python itself:

import xml.etree.ElementTree as ElementTree
from pathlib import Path

# Parse the downloaded file with the standard-library XML reader
xml_read_file_path = Path("celestrak_omm.xml")
root = ElementTree.parse(xml_read_file_path).getroot()

This will throw xml.etree.ElementTree.ParseError: unbound prefix: line 1, column 0.

Digging further: when you open the XML in PyCharm, you see a "Namespace xsi not bound" error.

I am probably the least qualified XML expert around, but it looks like the Celestrak OMM file wraps the omm tag in ndm tags. This does not conform to the standard. You can check the sample standard files here in MOIMS. Furthermore, the CCSDS XML specs say plainly that you should start with the omm tag (see the examples at the end) and also: 4.11.1 An OMM instantiation shall be delimited with the <omm></omm> root element tags using the standard attributes documented in 4.3.

I did check the Space-Track CDM XML example and it conforms to the standard (see here, may require login).

To summarise, weird though it is, the Celestrak files seem to be non-compliant with the standard. I think I have to contact Dr Kelso.

egemenimre commented 3 years ago

OK, emailed him, put you in CC. Let's see what happens. :)

egemenimre commented 3 years ago

I think I'm getting closer to the answer. The XML standard has one obscure use case where the ndm root tag is used to contain multiple NDM files (e.g. you can wrap 2 CDMs, 3 AEMs and an OMM, because, why not). This is called a Combined Instantiation Message. Celestrak wraps its OMMs in an NDM, even though there is a single OMM in the package. This is what confused me in the first place.
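
To make the wrapping concrete, here is a minimal sketch using only the standard library. The two XML strings are trimmed-down stand-ins written for this illustration (assumed shapes, not real Celestrak or CCSDS files). It shows why a lookup keyed on the root element's id attribute comes back empty for the wrapped file: the id sits on the inner omm element, not on ndm.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-ins for illustration only (assumed shapes, not real files).
PLAIN_OMM = '<omm id="CCSDS_OMM_VERS" version="2.0"><header/><body/></omm>'
WRAPPED_OMM = (
    '<ndm><omm id="CCSDS_OMM_VERS" version="2.0"><header/><body/></omm></ndm>'
)

plain_root = ET.fromstring(PLAIN_OMM)
wrapped_root = ET.fromstring(WRAPPED_OMM)

# A standalone OMM carries the id on its root element.
print(plain_root.tag, plain_root.attrib.get("id"))

# The wrapped file's root is <ndm>, which has no id attribute,
# so root.attrib.get("id") yields None.
print(wrapped_root.tag, wrapped_root.attrib.get("id"))

# The id is still present, one level down, on the wrapped message.
inner = wrapped_root.find("omm")
print(inner.tag, inner.attrib.get("id"))
```

This is consistent with the AttributeError in the original report, where the type lookup received None instead of an id.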

The native Python XML reader still hates the Celestrak OMM file, but I was able to get xsdata to read it. The bad news is that it breaks the API, as NdmIo().readfile() will have to return a list, even though in my experience 99% of cases will be a single item (OEM, OMM etc.). I am wide open to ideas on how to handle this, as I am no Python wizard.

egemenimre commented 3 years ago

OK, so I can now safely read (even with the "auto-detect file type" functionality) the "NDM Combo" files. The downside is that earlier on you knew that you were reading a single file with a single set of data (aem, omm, whatever). Now the data file could be an NDM file and you have to dig for the data you need. I can make a shortcut: check whether there is a single object tree in the NDM file and, if so, give the user the content directly. In other words, as the user, you won't know the difference between the Celestrak OMM files and other OMM files. What do you think? Shall I go and strip the outer ndm tag?

If you receive an NDM file with multiple object trees, you still have to look for the exact data that you need though.
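
The shortcut described above could look roughly like the sketch below. It works on plain ElementTree elements rather than the ccsds-ndm objects, and unwrap_single_ndm is a hypothetical helper invented for this illustration, not part of the library. It also assumes the ndm element contains only message elements as children.

```python
import xml.etree.ElementTree as ET


def unwrap_single_ndm(root):
    """Return the lone wrapped message if root is an <ndm> with one child.

    Hypothetical helper for illustration; not part of ccsds-ndm.
    Assumes every child of <ndm> is a message element.
    """
    children = list(root)
    if root.tag == "ndm" and len(children) == 1:
        return children[0]
    # Standalone message or a true multi-message NDM: leave it untouched.
    return root


single = ET.fromstring('<ndm><omm id="CCSDS_OMM_VERS"/></ndm>')
multi = ET.fromstring(
    '<ndm><omm id="CCSDS_OMM_VERS"/><omm id="CCSDS_OMM_VERS"/></ndm>'
)
standalone = ET.fromstring('<omm id="CCSDS_OMM_VERS"/>')

print(unwrap_single_ndm(single).tag)      # omm
print(unwrap_single_ndm(multi).tag)       # ndm
print(unwrap_single_ndm(standalone).tag)  # omm
```

The trade-off is the one noted above: the caller cannot tell a stripped single-message NDM from a standalone OMM, and still has to handle the multi-message case separately.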

egemenimre commented 3 years ago

Closing this with the version 1.2 update, though I'm still open to ideas on how to handle Combined Instantiation files.

astrojuanlu commented 3 years ago

Hmm I would use a different class for it. I am not very inspired for names now, but something like MultiNdmIo().readfile() that can be iterated and return several Ndm objects.
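
As a rough sketch of the "can be iterated" idea, again on plain ElementTree elements: iter_messages and MESSAGE_TAGS are hypothetical names made up for this example, and the tag set only lists the message types mentioned in this thread. A real MultiNdmIo would return library objects instead of XML elements.

```python
import xml.etree.ElementTree as ET

# Assumed for this sketch: only the message types mentioned in this thread.
MESSAGE_TAGS = {"aem", "cdm", "oem", "omm"}


def iter_messages(root):
    """Yield each message element, whether or not it is wrapped in <ndm>.

    Hypothetical helper for illustration; not part of ccsds-ndm.
    """
    if root.tag == "ndm":
        for child in root:
            if child.tag in MESSAGE_TAGS:
                yield child
    else:
        yield root


combo = ET.fromstring("<ndm><cdm/><aem/><omm/></ndm>")
standalone = ET.fromstring("<omm/>")

print([msg.tag for msg in iter_messages(combo)])       # ['cdm', 'aem', 'omm']
print([msg.tag for msg in iter_messages(standalone)])  # ['omm']
```

With this shape the caller always loops, so single and combined files go through the same code path.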

In any case, the fact that the native Python XML library can't read the input straight away is worrying (not within our control though). I wonder if lxml suffers the same?

egemenimre commented 3 years ago

The question is whether the user knows in advance what sort of file is actually being input. My strategy so far has been:

1. Make sure the file can be opened (do not force the user to specify the file type in advance).
2. The user then checks the tag to make sure she really has read a CDM (as opposed to an AEM).
3. If the file type is correct, the user probably maps the data into her object.

Continuing this line of thought, my current idea is to read the multi-NDM and let the user make the data type check (if it is multi-NDM, then you have to query whether you have any CDMs in there). The Celestrak OMMs are "single data NDMs" (as they don't seem to pack multiple OMM info). For this sort of "fake multi-NDM" I can strip out the data inside and give it to the user as if it were a normal OMM. This saves the hassle of accessing the ndm.omm[0] element, when you can directly access omm. What do you think?

For the other issue, I did try lxml and failed, but I will check it again when I can find the time. For me the first alarm bell is that PyCharm throws an error when you open the XML file in the editor. Perhaps what is surprising then is how xsdata manages to read it, as it uses lxml or some other XML reader (switchable).

astrojuanlu commented 3 years ago

The Celestrak OMMs are "single data NDMs" (as they don't seem to pack multiple OMM info)

Beware:

https://celestrak.com/NORAD/elements/gp.php?NAME=NUSAT&FORMAT=XML

egemenimre commented 3 years ago

Thanks for the heads up. :)

In my proposal this would still be interpreted as a multi-ndm, so you would need to access the elements via a ndm.omm[1].body...


astrojuanlu commented 3 years ago

Got it! Yeah, my bigger point was that all Celestrak OMMs seem to be multi-OMM, even though multi=1 in some cases.

egemenimre commented 3 years ago

... hence my proposal to strip the "fake-multi" ones into a single OMM but not touch the "true-multi" ones. I know this is a bit of an ugly interface (essentially the onus is on the user to know / check for multi/single) but I can't find any other way.

It's a bit like the astropy coordinate objects, where you don't know whether there is a single coordinate or a list and the user must check; otherwise they get the first element.


astrojuanlu commented 3 years ago

This is interesting. The xml file cannot be opened by Python itself:

Gave this another go after reading Dr. Kelso's email response, and I can't reproduce this locally:

juanlu@ephyra:/tmp$ wget "https://celestrak.com/NORAD/elements/gp.php?CATNR=45018&FORMAT=XML" -O celestrak_omm.xml -q
juanlu@ephyra:/tmp$ head celestrak_omm.xml 
<?xml version="1.0" encoding="UTF-8"?>
<ndm xmlns:xsi="https://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="https://sanaregistry.org/r/ndmxml/ndmxml-1.0-master.xsd">
<omm id="CCSDS_OMM_VERS" version="2.0">
<header><CREATION_DATE/><ORIGINATOR/></header><body><segment><metadata><OBJECT_NAME>NUSAT-8 (MARIE)</OBJECT_NAME><OBJECT_ID>2020-003C</OBJECT_ID><CENTER_NAME>EARTH</CENTER_NAME><REF_FRAME>TEME</REF_FRAME><TIME_SYSTEM>UTC</TIME_SYSTEM><MEAN_ELEMENT_THEORY>SGP4</MEAN_ELEMENT_THEORY></metadata><data><meanElements><EPOCH>2020-12-06T12:15:46.690848</EPOCH><MEAN_MOTION>15.27889198</MEAN_MOTION><ECCENTRICITY>.0011577</ECCENTRICITY><INCLINATION>97.2994</INCLINATION><RA_OF_ASC_NODE>44.2356</RA_OF_ASC_NODE><ARG_OF_PERICENTER>185.4752</ARG_OF_PERICENTER><MEAN_ANOMALY>288.3378</MEAN_ANOMALY></meanElements><tleParameters><EPHEMERIS_TYPE>0</EPHEMERIS_TYPE><CLASSIFICATION_TYPE>U</CLASSIFICATION_TYPE><NORAD_CAT_ID>45018</NORAD_CAT_ID><ELEMENT_SET_NO>999</ELEMENT_SET_NO><REV_AT_EPOCH>4981</REV_AT_EPOCH><BSTAR>.11839E-3</BSTAR><MEAN_MOTION_DOT>3.155E-5</MEAN_MOTION_DOT><MEAN_MOTION_DDOT>0</MEAN_MOTION_DDOT></tleParameters></data></segment></body></omm>
</ndm>
juanlu@ephyra:/tmp$ python3
Python 3.8.5 (default, Jul 28 2020, 12:59:40) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import xml.etree.ElementTree as ElementTree
>>> from pathlib import Path
>>> xml_read_file_path = Path("celestrak_omm.xml")
>>> root = ElementTree.parse(xml_read_file_path).getroot()
>>> root
<Element 'ndm' at 0x7fb26e0bd0e0>

egemenimre commented 3 years ago

I used this file from your link but the file I get starts differently?!?! The namespace definition is different.

The file you have attached inline works, but the link you gave doesn't (xml.etree.ElementTree.ParseError: unbound prefix: line 1, column 0).

Here's the file in the link for me (showing only the beginning):

<ndm xsi:noNamespaceSchemaLocation="https://sanaregistry.org/r/ndmxml/ndmxml-1.0-master.xsd">
<omm id="CCSDS_OMM_VERS" version="2.0">
<header>
<CREATION_DATE/>
<ORIGINATOR/>
</header>
<body>
<segment>
<metadata>
<OBJECT_NAME>NUSAT-8 (MARIE)</OBJECT_NAME>
<OBJECT_ID>2020-003C</OBJECT_ID>
<CENTER_NAME>EARTH</CENTER_NAME>
<REF_FRAME>TEME</REF_FRAME>
<TIME_SYSTEM>UTC</TIME_SYSTEM>
<MEAN_ELEMENT_THEORY>SGP4</MEAN_ELEMENT_THEORY>
</metadata>
<data>
<meanElements>
<EPOCH>2020-12-06T12:15:46.690848</EPOCH>
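
The difference between the two copies shown in this thread can be reproduced with a minimal sketch. The strings below are shortened stand-ins for the two files (the schema location "x.xsd" is a placeholder), differing only in whether the xsi prefix is declared on the root element.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the downloaded file quoted in this thread.
XSI = "https://www.w3.org/2001/XMLSchema-instance"

# Shortened stand-ins: the first omits the xmlns:xsi declaration,
# the second includes it. "x.xsd" is a placeholder schema location.
without_decl = '<ndm xsi:noNamespaceSchemaLocation="x.xsd"><omm/></ndm>'
with_decl = (
    f'<ndm xmlns:xsi="{XSI}" xsi:noNamespaceSchemaLocation="x.xsd"><omm/></ndm>'
)

try:
    ET.fromstring(without_decl)
    error = None
except ET.ParseError as exc:
    # The xsi: prefix is used but never bound to a namespace.
    error = str(exc)
print(error)

# With the declaration in place the same document parses normally.
root = ET.fromstring(with_decl)
print(root.tag)
```

So a copy of the file that has lost its xmlns:xsi attribute is enough to trigger the "unbound prefix" error, independent of the ndm wrapper.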

egemenimre commented 3 years ago

Let's just keep this open until we get to the bottom of this. :/

astrojuanlu commented 3 years ago

Hmm how are you opening the file? In Firefox I get this:

[Screenshot from 2020-12-06 21-12-52]

but then I "View Page Source" or Ctrl+U (notice the URL change):

[Screenshot from 2020-12-06 21-13-40]

egemenimre commented 3 years ago

Holy crap! Yes, this is exactly what is happening! Would you like to reply to the email thread (as the one who solved this riddle) to save them from the confusion?

I still have no idea why Firefox is doing that. But it's clear that David is doing the same thing, with the same results.

egemenimre commented 3 years ago

Thanks for that, you really helped a lot. Now this issue is finally closed.

I will put an explanation in the docs as "Known Issues". Other people will fall for that, I'm sure.

astrojuanlu commented 3 years ago

My pleasure! :blush: Thanks to you for following up so quickly!

egemenimre / ccsds-ndm

Cannot read OMM from Celestrak #10