DASDAE / dascore

A python library for distributed fiber optic sensing

Scan not robust to corrupted file #346

Closed d-chambers closed 2 weeks ago

d-chambers commented 7 months ago

Description

@jinwar and his student collected a dataset that contains a corrupted file. When attempting to read the time array in this file, h5py raises an error stating that the metadata checksum is bad. When indexing this directory with spool.update, the error bubbles to the surface and crashes the indexing. This happens because dascore.scan is not robust to files whose format is correctly identified but which then raise an error when read.

Although the presence of a corrupted file is certainly not something we can control, DASCore should be robust to it: issue a warning, skip the problematic file, and move on with indexing. This may be a bit tricky to test because the corrupted file is too large to include in the test suite. So, we could either:

  1. Muck about with the bytes of a small test file until we can produce this same error, or
  2. Create a test in which we monkeypatch one of the formatter scan functions to raise an error when a specific file is given.
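
A minimal sketch of the desired behavior and of the second testing option, using only the standard library. It does not use DASCore's actual internals: robust_scan and fake_scan are hypothetical names, and fake_scan stands in for a format-specific scan function that has been patched to fail on one file. The exception type raised here is also an assumption chosen for illustration.

```python
import warnings
from pathlib import Path


def robust_scan(paths, scan_func):
    """Scan each path with scan_func; warn and skip any file that fails."""
    results = []
    for path in paths:
        try:
            results.extend(scan_func(path))
        except Exception as exc:
            # The file format was identified, but reading the file failed.
            # Warn and continue rather than aborting the whole index.
            warnings.warn(
                f"Failed to scan {path}: {exc!r}; skipping file.",
                UserWarning,
            )
    return results


def fake_scan(path):
    """Hypothetical stand-in for a patched formatter scan function."""
    if Path(path).name == "corrupted.h5":
        # Simulated read failure (assumed error type and message).
        raise OSError("bad metadata checksum")
    return [{"path": str(path)}]


paths = ["good_1.h5", "corrupted.h5", "good_2.h5"]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    scanned = robust_scan(paths, fake_scan)

# The two readable files are indexed; the corrupted one produced a warning.
print([entry["path"] for entry in scanned])
print(len(caught))
```

In a real test the failing function would be swapped in with a patching tool such as pytest's monkeypatch fixture, and the test would assert that the warning is issued and that the remaining files are still indexed.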

Example

-- Can't currently reproduce with a self-contained code snippet.

Versions