deeplycloudy / lmatools

Python code for working with VHF Lightning Mapping Array data
BSD 2-Clause "Simplified" License

BUG: indexing assumption for flash initial position #4

Open deeplycloudy opened 9 years ago

deeplycloudy commented 9 years ago

This bug applies to data sorted with the mflash clustering method. It may also apply to the sklearn clustering method, but this has not been verified.

The point data are written to the HDF5 file out of time order, because I sort the mflash output (which I verified is in time order) by flash_id to split the flashes up. https://github.com/deeplycloudy/lmatools/blob/master/flashsort/autosort/autorun_mflash.py#L114 That sort algorithm, when applied to flash_id, isn’t guaranteed to be stable, so it may not preserve the original time order of the sources within each flash.
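As an illustration only: if the sort in question is a NumPy sort over an array of flash IDs (an assumption; the linked line is not reproduced here), requesting a stable sort keeps sources with equal flash_id in their original relative order. The arrays `t` and `flash_id` below are made-up sample data, not lmatools variables.

```python
import numpy as np

# Hypothetical sources from two flashes, already in time order.
t = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
flash_id = np.array([2, 1, 2, 1, 2, 1])

# kind="stable" keeps equal flash_id values in their original
# relative order, so time order survives within each flash.
order = np.argsort(flash_id, kind="stable")
t_sorted = t[order]
id_sorted = flash_id[order]

# Check that time increases within each flash after the sort.
for fid in np.unique(id_sorted):
    assert np.all(np.diff(t_sorted[id_sorted == fid]) > 0)
```

A stable sort would address the ordering at write time, but it does not repair files that have already been written.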

The time is still calculated correctly, since I take the minimum of the time: https://github.com/deeplycloudy/lmatools/blob/master/flashsort/autosort/flash_stats.py#L142

But the initial position is taken from the first index, which isn’t guaranteed to be the earliest source because the sources within a flash are not necessarily in time order: https://github.com/deeplycloudy/lmatools/blob/master/flashsort/autosort/flash_stats.py#L146

The fix would be to use argmin instead of min to find the index of the earliest source, and then use that index to get both the time and the position of the first source.
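A minimal sketch of the argmin approach follows. It is not the code in flash_stats.py; the function name `flash_start` and the sample arrays are hypothetical, and it assumes the sources of one flash are available as equal-length NumPy arrays.

```python
import numpy as np

def flash_start(t, lat, lon, alt):
    """Return (time, lat, lon, alt) of the earliest source in one flash."""
    t = np.asarray(t)
    # Index of the earliest source, regardless of storage order.
    first = np.argmin(t)
    return t[first], lat[first], lon[first], alt[first]

# Hypothetical sources for a single flash, stored out of time order.
t = np.array([0.30, 0.10, 0.20])
lat = np.array([33.3, 33.1, 33.2])
lon = np.array([-101.3, -101.1, -101.2])
alt = np.array([7000.0, 5000.0, 6000.0])

start = flash_start(t, lat, lon, alt)

# Taking index 0 instead would report the position of a later source.
wrong_start = (t.min(), lat[0], lon[0], alt[0])
```

Here `start` pairs the minimum time with the position of that same source, while `wrong_start` reproduces the mismatch described above: a correct start time combined with the position of whichever source happens to be stored first.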

In summary:

  1. The flash IDs go with the correct sources
  2. The start time is correct
  3. The start lat/lon/alt may be incorrect when using mflash
  4. The problem could be corrected by manually recalculating the flash metadata table after applying my suggested fix.

deeplycloudy commented 9 years ago

Credit to K. Thompson of UAH for identifying this bug.