Closed astrojuanlu closed 3 years ago
This is interesting. The xml file cannot be opened by Python itself:
from pathlib import Path
import xml.etree.ElementTree as ElementTree
xml_read_file_path = Path("celestrak_omm.xml")
root = ElementTree.parse(xml_read_file_path).getroot()
This will throw xml.etree.ElementTree.ParseError: unbound prefix: line 1, column 0.
Digging further, when you open the XML file in PyCharm, you will see a "Namespace xsi not bound" error.
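The error itself can be reproduced without the Celestrak file. The following is a minimal sketch with made-up inline documents (not the actual file contents): an attribute uses the xsi prefix, and the parser rejects the document unless the root element also declares xmlns:xsi. The helper name try_parse is only for illustration.

```python
import xml.etree.ElementTree as ElementTree

# Hypothetical, shortened documents for illustration only.
UNBOUND = '<ndm xsi:noNamespaceSchemaLocation="ndmxml-1.0-master.xsd"><omm/></ndm>'
BOUND = (
    '<ndm xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:noNamespaceSchemaLocation="ndmxml-1.0-master.xsd"><omm/></ndm>'
)


def try_parse(text):
    """Return the root tag, or the parser's error message on failure."""
    try:
        return ElementTree.fromstring(text).tag
    except ElementTree.ParseError as err:
        return f"ParseError: {err}"


print(try_parse(UNBOUND))  # the xsi prefix is never declared, so parsing fails
print(try_parse(BOUND))    # same document with xmlns:xsi declared on the root
```

If a file triggers this error, checking whether its root element carries an xmlns:xsi declaration is a quick first diagnostic.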
I am probably the least qualified XML expert around, but it looks like the Celestrak OMM file has ndm tags wrapped around the omm tag. This does not conform to the standard. You can check the sample standard files here in MOIMS. Furthermore, the CCSDS XML specs say plainly that you should start with the omm tag (see the examples at the end), and also:
4.11.1 An OMM instantiation shall be delimited with the <omm></omm> root element tags using the standard attributes documented in 4.3.
I did check the Space-Track CDM XML example, and it conforms to the standard (see here, may require login).
To summarise, weird though it is, the Celestrak files seem to be non-compliant with the standard. I think I have to contact Dr Kelso.
OK, emailed him, put you in CC. Let's see what happens. :)
I think I'm getting closer to the answer. The XML standard has one obscure use case where the ndm root tag is used to contain multiple NDM files (e.g. you can wrap 2 CDMs, 3 AEMs and an OMM, because, why not). This is called a Combined Instantiation Message. Celestrak wraps its OMMs in an NDM, even though there is a single OMM in the package. This is what confused me in the first place.
The native Python XML reader still hates the Celestrak OMM file, but I was able to get xsdata to read it. The bad news is that it breaks the API, as NdmIo().readfile() will have to return a list, even though in my experience 99% of cases will be a single item (OEM, OMM etc.). I am wide open to ideas on how to handle this, as I am no Python wizard.
OK, so I can now safely read (even with the "auto-detect file type" functionality) the "NDM Combo" files. The downside is that earlier on you knew that you were reading a single file with a single set of data (aem, omm, whatever). Now the data file could be an NDM file and you have to dig for the data you need. I can make a shortcut: check whether there is a single object tree in the NDM file and, if so, give the user the content directly. In other words, as the user, you won't know the difference between the Celestrak OMM files and other OMM files. What do you think? Shall I go and strip the outer ndm tag?
If you receive an NDM file with multiple object trees, you still have to look for the exact data that you need though.
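To picture the shortcut, here is a rough sketch using the standard library rather than the xsdata bindings the package relies on. If the ndm wrapper holds exactly one message, the message is handed back directly; otherwise the whole wrapper is returned. The function name unwrap_single and the inline samples are hypothetical, and a real combined file may carry other children (comments, for example) that this sketch does not account for.

```python
import xml.etree.ElementTree as ElementTree


def unwrap_single(root):
    """Return the only child of an <ndm> wrapper, else the element unchanged."""
    if root.tag != "ndm":
        return root  # already a plain OMM, OEM, etc.
    children = list(root)
    if len(children) == 1:
        return children[0]  # single-message wrapper: strip the outer tag
    return root  # several messages: the caller has to dig for the data


single = ElementTree.fromstring("<ndm><omm/></ndm>")
multi = ElementTree.fromstring("<ndm><omm/><omm/></ndm>")
print(unwrap_single(single).tag)  # omm
print(unwrap_single(multi).tag)   # ndm
```

The trade-off is the one raised above: the return type depends on the file contents, so the caller still has to check what came back.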
Closing this with version 1.2 update, though I'm still open to ideas as to how to handle Combined Instantiation files.
Hmm I would use a different class for it. I am not very inspired for names now, but something like MultiNdmIo().readfile() that can be iterated and return several Ndm objects.
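As a sketch of what such a separate class could look like (purely illustrative, not part of ccsds-ndm; the name MultiNdmReader and its behaviour are assumptions), a reader might always expose its contents as an iterable, treating a bare message as a combined file with one entry:

```python
import xml.etree.ElementTree as ElementTree


class MultiNdmReader:
    """Hypothetical reader that always yields messages, even for one message."""

    def __init__(self, text):
        root = ElementTree.fromstring(text)
        # A bare message is treated as a combined file with a single entry.
        self._messages = list(root) if root.tag == "ndm" else [root]

    def __iter__(self):
        return iter(self._messages)

    def __len__(self):
        return len(self._messages)


for message in MultiNdmReader("<ndm><omm/><cdm/></ndm>"):
    print(message.tag)
```

With this shape the return type never changes, at the cost of an extra loop or index for the common single-message case.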
In any case, the fact that the native Python XML library can't read the input straight away is worrying (not within our control though). I wonder if lxml suffers the same?
The question is whether the user knows in advance what sort of file is actually being input. My strategy so far has been:
Continuing this line of thought, my current idea is to read the multi-NDM and the user makes the data type check (if it is multi-NDM, then you have to query whether you have any CDMs in there). The Celestrak OMMs are "single data NDMs" (as they don't seem to pack multiple OMM info). For this sort of "fake multi-NDMs" I can strip the data inside and give it to the user as if that is a normal OMM. This saves the hassle of accessing the ndm.omm[0] element, when you can directly access omm. What do you think?
For the other issue, I did try lxml and failed but I will check it again when I can find the time. For me the first alarm bell is that Pycharm throws an error when you open the XML file in the editor. Perhaps what is surprising then is how xsdata manages to read it, as it uses lxml or some other XML reader (switchable).
The Celestrak OMMs are "single data NDMs" (as they don't seem to pack multiple OMM info)
Beware:
https://celestrak.com/NORAD/elements/gp.php?NAME=NUSAT&FORMAT=XML
Thanks for the heads up. :)
In my proposal this would still be interpreted as a multi-ndm, so you would need to access the elements via a ndm.omm[1].body...
Got it! Yeah, my bigger point was that all Celestrak OMMs seem to be multi-OMM, even though multi=1 in some cases.
... hence my proposal to strip the "fake-multi" ones into single OMM but not touch the "true-multi" ones. I know this is a bit of an ugly interface (essentially the onus is on the user to know / check for multi/single) but I can't find any other way.
It's a bit like the astropy coordinate objects where you don't know whether there is a single coordinate or a list and the user must check - otherwise gets the first element.
This is interesting. The xml file cannot be opened by Python itself:
Gave this another go after reading Dr. Kelso's email response, and I can't reproduce this locally:
juanlu@ephyra:/tmp$ wget "https://celestrak.com/NORAD/elements/gp.php?CATNR=45018&FORMAT=XML" -O celestrak_omm.xml -q
juanlu@ephyra:/tmp$ head celestrak_omm.xml
<?xml version="1.0" encoding="UTF-8"?>
<ndm xmlns:xsi="https://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="https://sanaregistry.org/r/ndmxml/ndmxml-1.0-master.xsd">
<omm id="CCSDS_OMM_VERS" version="2.0">
<header><CREATION_DATE/><ORIGINATOR/></header><body><segment><metadata><OBJECT_NAME>NUSAT-8 (MARIE)</OBJECT_NAME><OBJECT_ID>2020-003C</OBJECT_ID><CENTER_NAME>EARTH</CENTER_NAME><REF_FRAME>TEME</REF_FRAME><TIME_SYSTEM>UTC</TIME_SYSTEM><MEAN_ELEMENT_THEORY>SGP4</MEAN_ELEMENT_THEORY></metadata><data><meanElements><EPOCH>2020-12-06T12:15:46.690848</EPOCH><MEAN_MOTION>15.27889198</MEAN_MOTION><ECCENTRICITY>.0011577</ECCENTRICITY><INCLINATION>97.2994</INCLINATION><RA_OF_ASC_NODE>44.2356</RA_OF_ASC_NODE><ARG_OF_PERICENTER>185.4752</ARG_OF_PERICENTER><MEAN_ANOMALY>288.3378</MEAN_ANOMALY></meanElements><tleParameters><EPHEMERIS_TYPE>0</EPHEMERIS_TYPE><CLASSIFICATION_TYPE>U</CLASSIFICATION_TYPE><NORAD_CAT_ID>45018</NORAD_CAT_ID><ELEMENT_SET_NO>999</ELEMENT_SET_NO><REV_AT_EPOCH>4981</REV_AT_EPOCH><BSTAR>.11839E-3</BSTAR><MEAN_MOTION_DOT>3.155E-5</MEAN_MOTION_DOT><MEAN_MOTION_DDOT>0</MEAN_MOTION_DDOT></tleParameters></data></segment></body></omm>
</ndm>
juanlu@ephyra:/tmp$ python3
Python 3.8.5 (default, Jul 28 2020, 12:59:40)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import xml.etree.ElementTree as ElementTree
>>> from pathlib import Path
>>> xml_read_file_path = Path("celestrak_omm.xml")
>>> root = ElementTree.parse(xml_read_file_path).getroot()
>>> root
<Element 'ndm' at 0x7fb26e0bd0e0>
I used this file from your link but the file I get starts differently?!?! The namespace definition is different.
The file you have attached inline works, but the link you gave doesn't (xml.etree.ElementTree.ParseError: unbound prefix: line 1, column 0).
Here's the file in the link for me (showing only the beginning):
<ndm xsi:noNamespaceSchemaLocation="https://sanaregistry.org/r/ndmxml/ndmxml-1.0-master.xsd">
<omm id="CCSDS_OMM_VERS" version="2.0">
<header>
<CREATION_DATE/>
<ORIGINATOR/>
</header>
<body>
<segment>
<metadata>
<OBJECT_NAME>NUSAT-8 (MARIE)</OBJECT_NAME>
<OBJECT_ID>2020-003C</OBJECT_ID>
<CENTER_NAME>EARTH</CENTER_NAME>
<REF_FRAME>TEME</REF_FRAME>
<TIME_SYSTEM>UTC</TIME_SYSTEM>
<MEAN_ELEMENT_THEORY>SGP4</MEAN_ELEMENT_THEORY>
</metadata>
<data>
<meanElements>
<EPOCH>2020-12-06T12:15:46.690848</EPOCH>
Let's just keep this open until we get to the bottom of this. :/
Hmm how are you opening the file? In Firefox I get this:
but then I "View Page Source" or Ctrl+U (notice the URL change):
Holy crap! Yes, this is exactly what is happening! Would you like to reply to the email (as the one who solved this riddle) to save them from the confusion?
I still have no idea why Firefox is doing that. But it's clear that David is doing the same thing, with the same results.
Thanks for that, you really helped a lot. Now this issue is finally closed.
I will put an explanation in the docs as "Known Issues". Other people will fall for that, I'm sure.
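For anyone hitting the same thing, one way to sidestep the browser is to fetch the response bytes directly, much like the wget session earlier in this thread. This is a minimal sketch using only the standard library; the helper names fetch_raw and root_tag are made up for illustration, and the URL is the one quoted above, which may of course change or stop responding.

```python
import urllib.request
import xml.etree.ElementTree as ElementTree

# URL taken from this thread; availability is not guaranteed.
URL = "https://celestrak.com/NORAD/elements/gp.php?CATNR=45018&FORMAT=XML"


def fetch_raw(url, timeout=30):
    """Download the response body as bytes, exactly as the server sent it."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def root_tag(data):
    """Parse raw XML bytes and return the root element's tag."""
    return ElementTree.fromstring(data).tag
```

Calling root_tag(fetch_raw(URL)) should then see the same bytes as wget, including the xmlns:xsi declaration on the root element, rather than whatever the browser's rendered view shows.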
My pleasure! :blush: Thanks to you for following up so quickly!
I tried downloading https://raw.githubusercontent.com/egemenimre/ccsds-ndm/main/ccsds_ndm/tests/data/ndmxml-1.0-omm-2.0.xml and reading it, successfully:
However, it doesn't seem to work for https://celestrak.com/NORAD/elements/gp.php?CATNR=45018&FORMAT=XML: