CameronBodine / PINGMapper

Open-source interface for processing recreation-grade side scan sonar datasets and reproducibly mapping benthic habitat
https://cameronbodine.github.io/PINGMapper/
MIT License

Corrupt recordings w/ missing data #33

Closed CameronBodine closed 1 year ago

CameronBodine commented 2 years ago

Sonar recordings can have missing data in one or more sonar channels, likely due to low voltage and/or SD card write speed. Missing data is identifiable from the sonar record number. Record numbers are unique for each ping and follow the pattern below (though the order is not consistent):

  1. B001 (high freq downscan)
  2. B002 (port)
  3. B003 (star)
  4. B004 (mega downscan)
  5. B001
  6. B002
  7. etc.

Add a function to identify missing pings, fill the ping attribute CSV with NaNs for the record number, and pull the other attributes from nearby pings on other channels. When exporting imagery, fill the image with NaNs where data is missing. This will produce spatially accurate GeoTIFFs and salvage valid data from an otherwise corrupt recording.

CameronBodine commented 1 year ago

Record_num: The number of active beams determines how the record number increments, at least for the side scan channels. Side scan channels always ping one after the other.

Ex: 4 beams == increment by 4

B001: 0, 4, 8, 12 ...
B004: 1, 5, 9, 13 ...
B002: 2, 6, 10, 14 ...
B003: 3, 7, 11, 15 ...

A record number is never missed even if a beam doesn't ping, so that can't be used to identify missing pings. Will have to use the increment pattern to identify nodata locations.

Look at each beam in turn: if the difference between a beam's next record_num and its current record_num != increment_val, one of the beams is missing data. If the difference is > increment_val, the current beam is missing data; otherwise one of the other beams is missing data.
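The per-beam check described above could be sketched roughly as follows. This is only an illustration of the stated rule, not the PINGMapper implementation; the function name `find_increment_breaks` and its return format are hypothetical.

```python
def find_increment_breaks(record_nums, increment_val):
    """Flag places where one beam's consecutive record numbers break the pattern.

    record_nums: record numbers for a single beam, in file order.
    increment_val: expected step between them (the number of active beams).
    Returns a list of (position, difference, suspected_source) tuples.
    Hypothetical helper, written only to illustrate the rule in the comment above.
    """
    breaks = []
    for i in range(len(record_nums) - 1):
        diff = record_nums[i + 1] - record_nums[i]
        if diff != increment_val:
            # Rule from the comment: a step larger than the increment points at
            # the current beam, anything else points at one of the other beams.
            suspect = "current beam" if diff > increment_val else "other beam"
            breaks.append((i, diff, suspect))
    return breaks


# One beam out of four, with one ping missing between record numbers 8 and 16.
print(find_increment_breaks([0, 4, 8, 16, 20], 4))
```

Running the check once per beam and comparing the flagged positions should narrow down where NoData placeholders are needed.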

CameronBodine commented 1 year ago

Functionality for finding where a ping from any beam may be missing is complete and needs to be tested, so won't close for now.

This fix assumes that every active beam pings the same number of times, one right after the other. The workflow iterates over all ping attributes from b beams. For every b pings, a ping must be present from each active beam, termed here a "ping packet", as shown below:

Ping packet for b==4:

Original Record Number   Beam   Packet
0                        B000   1
1                        B001   1
2                        B002   1
3                        B003   1

The algorithm ensures that each ping packet has a ping from each active beam. If not, the algorithm will insert a ping placeholder and copy all the attributes from the previous ping in the current ping packet. The index (byte offset of record in SON file) is set to nan in the copied ping, flagging the location for subsequent workflows. See below for example:

Before adding no data:

Original Record Number   Beam   Packet   Index
0                        B000   1        0
1                        B001   1        0
2                        B002   1        0
3                        B003   1        0
4                        B000   2        1455
5                        B001   2        1455
6                        B003   2        1455
7                        B000   3        3600
8                        B001   3        3600
..                       ..     ..       ..

After adding no data:

Original Record Number   Beam   Packet   Index   New Record Number
0                        B000   1        0       0
1                        B001   1        0       1
2                        B002   1        0       2
3                        B003   1        0       3
4                        B000   2        1455    4
5                        B001   2        1455    5
5                        B002   2        nan     6
6                        B003   2        1455    7
7                        B000   3        3600    8
8                        B001   3        3600    9
..                       ..     ..       ..      ..
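A minimal sketch of the packet-filling idea, under stated assumptions, might look like the code below. It is not the PINGMapper code: the function name `fill_missing_pings`, the dictionary keys, and the fixed beam order are all hypothetical. As a simplification, it copies from the most recently emitted ping, and it does not pad a trailing incomplete packet.

```python
import math


def fill_missing_pings(pings, beams):
    """Insert placeholder pings so that every packet holds one ping per beam.

    pings: list of dicts with keys 'record_num', 'beam', 'index', in file order.
    beams: active beam names in the order they are assumed to ping.
    Hypothetical sketch; names and data layout are illustrative only.
    """
    for ping in pings:
        if ping["beam"] not in beams:
            raise ValueError(f"unexpected beam: {ping['beam']}")

    filled = []
    pos = 0  # position of the next unread ping in the input
    while pos < len(pings):
        expected = beams[len(filled) % len(beams)]
        ping = pings[pos]
        if ping["beam"] == expected:
            row = dict(ping)
            pos += 1
        else:
            # Missing ping: copy attributes from the previously emitted ping
            # (or from the next real ping if nothing has been emitted yet)
            # and set the byte offset to NaN as the NoData flag.
            row = dict(filled[-1] if filled else ping)
            row["beam"] = expected
            row["index"] = math.nan
        row["packet"] = len(filled) // len(beams) + 1
        row["new_record_num"] = len(filled)
        filled.append(row)
    return filled


# Same situation as the tables above: B002 is missing from packet 2.
beams = ["B000", "B001", "B002", "B003"]
observed = ["B000", "B001", "B002", "B003", "B000", "B001", "B003", "B000", "B001"]
offsets = [0, 0, 0, 0, 1455, 1455, 1455, 3600, 3600]
pings = [{"record_num": n, "beam": b, "index": i}
         for n, (b, i) in enumerate(zip(observed, offsets))]
for row in fill_missing_pings(pings, beams):
    print(row["record_num"], row["beam"], row["packet"], row["index"], row["new_record_num"])
```

This single pass is linear in the number of pings, so it may also serve as a starting point if the search needs to be made faster.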

The search is not very efficient, but seems to work with data tested so far.

The flagged NoData is then used during image export and rectification, where zero-valued filler intensities are inserted for the flagged pings. This helps properly space and relocate pings in the imagery, and prevents sonar features from being stretched across small gaps of missing data.
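To illustrate the export step, here is a rough sketch of how flagged pings might be turned into zero-filled columns. The function name `pad_flagged_pings`, the list-of-lists image layout, and the fixed `n_samples` length are assumptions made for the example; the actual export code likely works on arrays and differs in detail.

```python
import math


def pad_flagged_pings(indices, ping_returns, n_samples, fill=0):
    """Build one column of intensities per ping, zero-filling flagged pings.

    indices: byte offset for each ping, NaN where the ping was flagged as NoData.
    ping_returns: intensity sequences per ping (ignored for flagged pings).
    n_samples: number of samples per column in the output.
    Hypothetical sketch using plain lists instead of image arrays.
    """
    columns = []
    for idx, returns in zip(indices, ping_returns):
        if math.isnan(idx):
            # Flagged NoData: keep the ping's slot but fill it with zeros.
            columns.append([fill] * n_samples)
        else:
            col = list(returns[:n_samples])
            col += [fill] * (n_samples - len(col))  # pad short pings
            columns.append(col)
    return columns


cols = pad_flagged_pings([0, math.nan, 1455], [[1, 2, 3], None, [4, 5]], 3)
print(cols)
```

Because the flagged ping still occupies a column, the pings on either side keep their true along-track spacing.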