lacker / seticore

A high-performance implementation of some core SETI algorithms that can be included in other programs.
MIT License

Strange hits-per-beam trend in hits files #28

Closed: david-macmahon closed this issue 1 year ago

david-macmahon commented 1 year ago

Hits files produced at MeerKAT show an unexpected linear trend in hits per beam. Beam 0 has the most hits, and higher-numbered beams have progressively fewer (except for beams that have zero hits). Here is a plot showing the number of hits for each beam from a recent hits file. I think this is representative of the hits files in general.

[image: plot of the number of hits for each beam]

david-macmahon commented 1 year ago

Here is a plot showing the beam number for each of the 1613 hits from that same file (blpn8:/scratch/data/20230613T194531Z-20230126-0054/seticore_search/guppi_60108_71131_000104_J1150-0023_0001.hits):

[image: plot of the beam number for each of the 1613 hits]

david-macmahon commented 1 year ago

Correction: beam 2 is the lowest-numbered beam with any hits in this file, and it has the most hits. Beams 0 and 1 have no hits in this file.

david-macmahon commented 1 year ago

Another curiosity is that all the hits for a given beam have the same SNR. I think maybe all hits found thus far are being written out whenever a new hit is detected. There are only 50 unique hits in this file even though it contains 1613 entries.
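A quick way to check for this kind of duplication is to compare the total number of entries with the number of distinct entries, and to tally entries per beam. The sketch below is only illustrative: the `(beam, frequency_index, snr)` tuples and the `summarize_hits` helper are hypothetical and do not reflect the real hits-file schema or any seticore API.

```python
from collections import Counter


def summarize_hits(entries):
    """Return (total entries, unique entries, entries per beam).

    `entries` is a list of hypothetical (beam, frequency_index, snr) tuples,
    used here as a stand-in for records read from a hits file.
    """
    per_beam = Counter(beam for beam, _, _ in entries)
    return len(entries), len(set(entries)), dict(per_beam)


# Toy data: one beam-2 hit written three times, one beam-3 hit written twice.
entries = [(2, 100, 12.5)] * 3 + [(3, 250, 9.0)] * 2
total, unique, per_beam = summarize_hits(entries)
print(total, unique, per_beam)
```

A large gap between the total and unique counts, as with 1613 entries versus 50 unique hits, would point to the same hits being written repeatedly.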

lacker commented 1 year ago

Yeah, you're right, it does look like it outputs a single hit many times. The loop is: for each beam, for each coarse channel, find hits in that (beam, coarse channel) pair, then (buggily) emit all hits from that coarse channel. The hits from the old channels will then have the wrong reference data, so fields like their SNR will be screwed up.

The problem is here:

https://github.com/lacker/seticore/blob/master/beamforming_pipeline.cpp#L243

That line should only be outputting the hits from the current beam, not the hits from previous beams.
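To see why this kind of bug would give a downward trend in hits per beam, here is a toy model in Python. It is not the seticore code: `run_pipeline`, `entries_per_beam`, and the hit labels are made up for illustration, the coarse-channel loop is collapsed into a single per-beam step, and the wrong-reference-data effect on SNR is not modeled.

```python
def run_pipeline(hits_by_beam, buggy):
    """Toy model: return the list of entries written to the output.

    hits_by_beam maps beam number -> list of hit labels found in that beam.
    """
    written = []
    accumulated = []  # hits found so far, across all beams
    for beam in sorted(hits_by_beam):
        new_hits = [(beam, label) for label in hits_by_beam[beam]]
        accumulated.extend(new_hits)
        if buggy:
            # Re-emits hits from earlier beams along with the current ones.
            written.extend(accumulated)
        else:
            # Emits only the hits found in the current beam.
            written.extend(new_hits)
    return written


def entries_per_beam(written):
    """Count how many written entries carry each beam number."""
    counts = {}
    for beam, _ in written:
        counts[beam] = counts.get(beam, 0) + 1
    return counts


# One real hit in each of beams 2..5; beams 0 and 1 have none.
hits = {2: ["a"], 3: ["b"], 4: ["c"], 5: ["d"]}
buggy_counts = entries_per_beam(run_pipeline(hits, buggy=True))
fixed_counts = entries_per_beam(run_pipeline(hits, buggy=False))
print(buggy_counts)
print(fixed_counts)
```

In the buggy variant, a hit from an early beam is written again for every later beam, so the lowest-numbered beam with hits ends up with the most entries and the count falls off steadily with beam number. In the fixed variant each hit is written once.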

lacker commented 1 year ago

I think this should be fixed in version 1.0.6 with https://github.com/lacker/seticore/commit/328bd9002048cc3cf58bc9a549d6149fb29eadee. It seems like the existing integration tests don't catch this behavior one way or the other.

lacker commented 1 year ago

Just deployed 1.0.6 to MeerKAT. Let's check once the pipeline runs some more whether this issue still appears to be present.

david-macmahon commented 1 year ago

Great! I'll keep an eye out.

On a related note, it might be nice to add seticore version info to the hits file, but would that necessitate adding a seticoreVersion field to the Hit structure and therefore including the same info redundantly many times (to be self-redundant :P)?

lacker commented 1 year ago

Yeah, I should just add that in. An extra ten bytes per hit won't make any difference because the size is dominated by the data field.

david-macmahon commented 1 year ago

The new version does not output duplicates, thanks!