CCSDSPy / ccsdspy

I/O interface and utilities for CCSDS binary spacecraft data in Python. Library used in flight missions at NASA, NOAA, and SWRI
https://ccsdspy.org
BSD 3-Clause "New" or "Revised" License

Add split_by_apid function and command line interface #14

Closed ddasilva closed 1 year ago

ddasilva commented 2 years ago

This merge request adds a function split_by_apid(), which loops through the packets in a file of mixed APIDs and sorts them into a dictionary where the keys are the integer APIDs and the values are homogeneous streams, one per APID.
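As a rough illustration of the idea, and not the code in this merge request, the splitting can be sketched with the standard library alone. The sketch assumes the standard CCSDS primary header: 6 bytes, with the APID in the low 11 bits of the first two bytes and a length field equal to the number of data-field bytes minus one. The name `split_by_apid_sketch` is hypothetical.

```python
import io
import struct
from collections import defaultdict

PRIMARY_HEADER_LEN = 6  # bytes in a CCSDS primary header


def split_by_apid_sketch(data):
    """Group whole packets from a mixed-APID byte string by APID.

    Hypothetical helper for illustration; returns {apid: io.BytesIO}.
    """
    chunks = defaultdict(list)
    offset = 0
    while offset + PRIMARY_HEADER_LEN <= len(data):
        # Three big-endian 16-bit words: packet ID, sequence control, length.
        packet_id, _seq_ctrl, length_field = struct.unpack_from(">HHH", data, offset)
        apid = packet_id & 0x07FF  # APID is the low 11 bits
        # The length field stores (data-field bytes - 1).
        packet_len = PRIMARY_HEADER_LEN + length_field + 1
        if offset + packet_len > len(data):
            break  # truncated trailing packet; stop rather than guess
        chunks[apid].append(data[offset:offset + packet_len])
        offset += packet_len
    return {apid: io.BytesIO(b"".join(parts)) for apid, parts in chunks.items()}
```

Because each packet's size is taken from its own header, the loop does not care whether packets sharing an APID have the same length.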

There is also a new command line interface exposed through python -m ccsdspy which allows splitting a file and writing individual files.

   $ python -m ccsdspy split mixed_file.tlm
   Parsing done!
   Writing ./apid00132.tlm...
   Writing ./apid00134.tlm...
   Writing ./apid00258.tlm...
   Writing ./apid00384.tlm...
   Writing ./apid00385.tlm...
   Writing ./apid00386.tlm...
   Writing ./apid00387.tlm...
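
The file names in the output above suggest a zero-padded, five-digit APID. A sketch of that write step, assuming a dictionary that maps each APID to its bytes; `write_apid_files` is a hypothetical helper, not the command line interface's actual code.

```python
import os


def write_apid_files(bytes_by_apid, out_dir="."):
    """Write one .tlm file per APID and return the paths written."""
    paths = []
    for apid in sorted(bytes_by_apid):
        # Pattern inferred from the CLI output above, e.g. apid00132.tlm
        path = os.path.join(out_dir, f"apid{apid:05d}.tlm")
        with open(path, "wb") as fh:
            fh.write(bytes_by_apid[apid])
        paths.append(path)
    return paths
```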

ehsteve commented 1 year ago

I also agree that this is a great addition! This seems to require that every packet be fixed length, since the packets are parsed. But simply splitting the stream and writing it out to individual files should not require parsing the packets, only finding the packet boundaries. I think this functionality would be more versatile if it did not require the stream to contain only fixed-length packets.

ehsteve commented 1 year ago

I agree with @jmbhughes that there should be tests. I manage the HERMES SOC Org. and we've set up automatic code coverage (see here for an example). I'd be happy to help get it set up if you're interested.

ehsteve commented 1 year ago

Reading the code more closely, it does seem like this would work for variable-length packets. I was wrong: the packets are not parsed, only the ap_id and the packet length are read.

ddasilva commented 1 year ago

Thanks for the comments everyone! I will get to addressing these issues. I really appreciate the help and support you've all shown in this merge request.

@ehsteve Since you brought up the topic of variable-length packets: this is something I want to support, but I haven't been able to get my hands on any variable-length packet data (I have never worked with them myself, and no one so far has had anything they could send me). If you have any data you can share, we can look into supporting them via a VariableLength class.

ddasilva commented 1 year ago

I addressed the comments in this repush over the existing branch. I wrote a test and it works, but it uses the data that @jmbhughes gave me. @jmbhughes, do I have your permission to include that in the repo as test data? I can obfuscate the field names and packet names if that helps. It turns out that the test data we have has a lot of internal consistency problems... if I can't use this, I'd have to write synthetic data from scratch.
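
If synthetic data does turn out to be needed, a mixed-APID stream can be generated with the standard library. This is only a sketch: it assumes the standard 6-byte CCSDS primary header (version 0, telemetry type, no secondary header, unsegmented sequence flags, and a length field of data bytes minus one). The helpers `make_packet` and `make_mixed_stream` and any APID values passed to them are illustrative, not part of the library.

```python
import random
import struct


def make_packet(apid, seq_count, body):
    """Build one CCSDS packet: 6-byte primary header followed by body."""
    if not body:
        raise ValueError("data field must hold at least one byte")
    packet_id = apid & 0x07FF  # version 0, telemetry, no secondary header
    seq_ctrl = (0b11 << 14) | (seq_count & 0x3FFF)  # unsegmented + 14-bit counter
    length_field = len(body) - 1  # CCSDS stores data length minus one
    return struct.pack(">HHH", packet_id, seq_ctrl, length_field) + body


def make_mixed_stream(apids, n_packets, seed=0):
    """Concatenate n_packets packets with random APIDs and body lengths."""
    rng = random.Random(seed)  # seeded so the test data is reproducible
    counters = {apid: 0 for apid in apids}
    out = bytearray()
    for _ in range(n_packets):
        apid = rng.choice(apids)
        body = bytes(rng.randrange(256) for _ in range(rng.randint(1, 32)))
        out += make_packet(apid, counters[apid], body)
        counters[apid] += 1
    return bytes(out)
```

Random payloads exercise the splitting logic but not field decoding, so a real test would likely still need bodies that match a known packet definition.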

jmbhughes commented 1 year ago

It's fine to use the data I shared as test data. It's kind of big though (83 MB). So maybe you should only use the first n packets, say 1,000 packets (I'm not sure how big that would be). That still gives an example of a mixed stream and saves space.
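
Trimming the file to its first n packets only needs the length field of each primary header. A sketch, assuming the standard 6-byte CCSDS primary header whose last two bytes hold the data-field length minus one; `first_n_packets` is a hypothetical helper.

```python
import struct


def first_n_packets(data, n):
    """Return the bytes of the first n whole packets in data."""
    offset = 0
    for _ in range(n):
        if offset + 6 > len(data):
            break  # fewer than n packets available
        # Bytes 4-5 of the header: data-field length minus one, big-endian.
        (length_field,) = struct.unpack_from(">H", data, offset + 4)
        end = offset + 6 + length_field + 1
        if end > len(data):
            break  # last packet is truncated; leave it out
        offset = end
    return data[:offset]
```

The size of the result depends on the packet lengths in the stream, so it is worth checking the output size after picking n.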