MillionConcepts / pdr

[P]lanetary [D]ata [R]eader - A single function to read all Planetary Data System (PDS) data into Python

refactor read_table #27

Closed m-stclair closed 2 years ago

m-stclair commented 2 years ago

Several options are possible. One leading candidate is:

  1. Attempt to infer whether the table is text or binary. The test could be a condition like: are there ASCII data types in the format definition, OR does the first line of the table decode correctly as ASCII/Unicode?
  2. If the table is text, refer to a mapping of text data type names to Python types. Then load it with numpy.loadtxt (or a similar parser) based on those types and column names, punting to other parsers if that fails.
  3. If the table is binary, look up sample types as struct-style definitions.
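
A rough sketch of the three steps above, to make the dispatch concrete. This is not pdr's actual code: the function names, the two lookup tables, and the `(name, data_type, nbytes)` column tuples are all hypothetical, the type tables are only a small illustrative subset of PDS3 type names, and it uses plain string splitting rather than numpy.loadtxt to stay dependency-free.

```python
import struct

# Hypothetical, partial mapping of PDS3 text type names to Python types.
TEXT_TYPES = {"ASCII_INTEGER": int, "ASCII_REAL": float, "CHARACTER": str}

# Hypothetical, partial mapping of (type name, byte count) to struct codes.
BINARY_TYPES = {
    ("MSB_INTEGER", 2): ">h",
    ("LSB_INTEGER", 2): "<h",
    ("MSB_UNSIGNED_INTEGER", 2): ">H",
    ("IEEE_REAL", 4): ">f",
    ("PC_REAL", 4): "<f",
}


def looks_like_text(column_types, first_line: bytes) -> bool:
    """Step 1: guess whether a table is text rather than binary."""
    if any(t.startswith("ASCII") for t in column_types):
        return True
    if any(t in {name for name, _ in BINARY_TYPES} for t in column_types):
        return False
    # Fallback heuristic only: binary data can decode as ASCII by chance.
    try:
        first_line.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


def parse_text_row(line: str, columns):
    """Step 2: convert one comma-separated row using the text type map.

    columns is a list of (name, data_type) pairs.
    """
    fields = [f.strip().strip('"') for f in line.split(",")]
    return {
        name: TEXT_TYPES[dtype](field)
        for (name, dtype), field in zip(columns, fields)
    }


def parse_binary_row(row: bytes, columns):
    """Step 3: unpack one fixed-width row column by column with struct.

    columns is a list of (name, data_type, nbytes) triples, assumed contiguous.
    """
    values, offset = {}, 0
    for name, dtype, nbytes in columns:
        code = BINARY_TYPES[(dtype, nbytes)]
        values[name] = struct.unpack_from(code, row, offset)[0]
        offset += nbytes
    return values
```

Unpacking each column separately, rather than building one struct format string per row, lets columns with different byte orders coexist in the same row.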
michaelaye commented 2 years ago

m-stclair commented 2 years ago

Yes, I am only referring to data tables here.

There is one very common set of cases in which the label is insufficient to read a data table: when labels refer to external format specification files that are not locally available.
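
For illustration, a check for this case could look something like the sketch below. It assumes the external format files are referenced through PDS3 `^STRUCTURE` pointers and should sit in a single directory; the function name and regex are hypothetical and not part of pdr. The match is case-sensitive, which will produce false "missing" results for archives that store lowercase filenames on disk.

```python
import re
from pathlib import Path

# Matches lines such as:  ^STRUCTURE = "EXAMPLE.FMT"
STRUCTURE_POINTER = re.compile(
    r'^\s*\^STRUCTURE\s*=\s*"?([^"\s]+)"?', re.MULTILINE
)


def missing_format_files(label_text: str, search_dir) -> list:
    """Return referenced format files that are not present in search_dir."""
    search_dir = Path(search_dir)
    names = STRUCTURE_POINTER.findall(label_text)
    return [name for name in names if not (search_dir / name).exists()]
```

A non-empty result would mean the label alone is not enough to read the table.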

There are also cases in which binary tables have underspecified or ambiguous data types because the PDS3 standards themselves leave room for interpretation (endianness is not always clear, for instance).
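
To show why unclear endianness matters, here is a minimal sketch using only the standard library. The helper name is hypothetical; the point is just that the same two bytes yield different values under each byte order, so a reader cannot pick the right one without outside information.

```python
import struct


def candidate_uint16_values(raw: bytes) -> dict:
    """Decode two bytes as an unsigned 16-bit integer under both byte orders."""
    return {
        "big_endian": struct.unpack(">H", raw)[0],
        "little_endian": struct.unpack("<H", raw)[0],
    }


# The bytes 0x01 0x00 read as 256 big-endian but as 1 little-endian.
print(candidate_uint16_values(b"\x01\x00"))
```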

I have also found some instances in older data sets in which specifications are nonstandard or missing, although I can’t cite them off the top of my head.

m-stclair commented 2 years ago

Changes made to support several MSL and MRO datasets have solved many of the major problems raised here. I am closing this issue for now in favor of opening more specific ones.