CCSDSPy / ccsdspy

I/O interface and utilities for CCSDS binary spacecraft data in Python. Library used in flight missions at NASA, NOAA, and SWRI
https://ccsdspy.org
BSD 3-Clause "New" or "Revised" License
78 stars 19 forks source link

Fail Safe on Corrupted Packets, Generate Error Report #71

Open nischayn99 opened 1 year ago

nischayn99 commented 1 year ago

Test Data + Code.zip

nischayn99 commented 1 year ago

Hi @ddasilva,

I've enclosed the test file and the code we were using for parsing the packets.

ddasilva commented 1 year ago

Hi @nischayn99 and @tloubrieu-jpl, can you review a bit of background on this for the myself and the ticket? When I split this file, it has 47 different APIDs in it, but you're trying to parse every APID with the same definition. Do all these APIDs really have the same definition?

A warning is issued that says the file is corrupt, which you mentioned was expected. I think what makes sense to do here is add a keyword argument split_by_apid(mixed_file, truncate_corrupt=True) (it can actually be True by default). When truncate_corrupt=True, it will throw out the extra incomplete packet at the end. This won't save you if data in the middle of your file is corrupt, but this is probably what people want 99% of the time, and if they don't, they can pass truncate_corrupt=False.

As far as error reports, you can convert warnings to exceptions using the warnings module (see how here https://stackoverflow.com/a/30368735). It might help if we throw specific subclasses of CCSDSPyWarning instead of just UserWarning all the time, so you can filter on specific warnings you want to be exceptions. Alternatively, you could route CCSDSPy warnings to a separate log file, and we could work to make sure they are informative.

In general though, the philosophy I think makes sense is to automatically recover with a warning when possible, and throw an exception if not. But we do want to make it easy to collect warnings in a centralize way, so that things are smooth operationally. Let me know any of your thoughts.

nischayn99 commented 1 year ago

Hi @ddasilva,

We tried to apply generic definitions of default packets we found definition in ccsdspy (pkt = VariableLength( [ PacketArray( name="unused", data_type="uint", bit_length=BITS_PER_BYTE, array_shape="expand" ) ] )).

Thought it would work for all packets, and defined specific detailed packet definition for one of the APID as a first development step.

Regarding the corrupted packet in the middle of the file, can we find a way to retrieve the next packet from a marker?

We like the idea of CCSDSPy throwing exceptions and let the user decide if they want to catch the error or not.

ddasilva commented 1 year ago

It's generating an exception because it is sometimes using the housekeeping packet definition for a byte stream that doesn't contain packets that match that. When I replace the code to always use the flexible definition, it doesn't raise an exception:

for apid, stream in stream_by_apid.items():
    pkt = default_pkt
    output[apid] = pkt.load(stream)`

Regarding the corrupted packet in the middle of the file, can we find a way to retrieve the next packet from a marker?

I don't think this is deterministically possible given the CCSDS specification. If you and @tloubrieu-jpl have a way of doing this, I would love to hear it.

We like the idea of CCSDSPy throwing exceptions and let the user decide if they want to catch the error or not.

That sounds good then-- we can make this ticket about that. What I'll do for this issue is two things:

  1. Implement truncate_corrupt=True option for split_by_apid() and a few other functions where it makes sense
  2. Add raise_on_warning=True option for pkt.load(), split_by_apid(), and a few other places where it makes sense