transientskp / tkp

A transients-discovery pipeline for astronomical image-based surveys
http://docs.transientskp.org/
BSD 2-Clause "Simplified" License

Quality check not being completed correctly on images #588

Open AntoniaR opened 3 years ago

AntoniaR commented 3 years ago

RMS quality checks that take into account the distribution of image rms values in the dataset have been implemented: a Gaussian function is fitted to the rms distribution obtained from a specified number of images (given by the rms_history option in the job parameters file, https://github.com/transientskp/tkp/blob/fa65950f6a8891e97880e483e03206a06e5f130e/tkp/config/job_template/job_params.cfg#L6). This was to address Issue #512.
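For illustration, here is a minimal sketch of the kind of check described above. It is not the TraP code: the function names and the `n_sigma` parameter are hypothetical, and a simple moment-based fit (sample mean and standard deviation) stands in for whatever Gaussian fit TraP actually performs.

```python
import statistics


def fit_gaussian(rms_values):
    """Fit a Gaussian to the rms values by moments (assumption: not TraP's fit)."""
    mu = statistics.mean(rms_values)
    sigma = statistics.stdev(rms_values)  # needs at least two values
    return mu, sigma


def passes_rms_check(rms, mu, sigma, n_sigma=3.0):
    """Accept an image if its rms lies within n_sigma of the fitted mean."""
    return abs(rms - mu) <= n_sigma * sigma


# Hypothetical rms values from the first few images of a dataset.
history = [1.0, 1.1, 0.9, 1.05, 0.95]
mu, sigma = fit_gaussian(history)

good = passes_rms_check(1.02, mu, sigma)  # close to the fitted mean
bad = passes_rms_check(5.0, mu, sigma)    # far outside the distribution
```

With these made-up numbers the first image is accepted and the second is rejected.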

However, while testing a new parameter that thresholds the fitted rms values, it became clear that no quality checks are conducted until the number of images processed reaches the rms_history parameter. This is not correct at all for batch mode and is only partially correct for streaming mode.

Batch mode:

Streaming mode:

This issue requires some redesign of the quality check implementation. Namely, the histogram-fitted thresholds need separate versions for batch and streaming modes. Also, some of the quality checks (rms_est_max/rms_est_min and the beam parameters) should be separated out so that they are applied to all images.

AntoniaR commented 3 years ago

The beam parameters and rms_est_max and rms_est_min checks are now incorporated for all images on branch Issue588 https://github.com/transientskp/tkp/tree/Issue588

The histogram fit is correct for streaming data but needs adapting for batch data. For batch data we need to measure all of the rms values, fit, and then reassess all of the images to reject those outside the allowed range.
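A rough sketch of that batch flow (measure everything, fit once, then reassess every image), assuming the rms values are already available as a mapping from image id to rms. The function name, the `n_sigma` parameter and the moment-based fit are illustrative assumptions, not the TraP implementation.

```python
import statistics


def batch_rms_rejection(rms_by_image, n_sigma=3.0):
    """Return the ids of images whose rms falls outside the fitted range.

    rms_by_image: dict mapping image id -> measured rms (hypothetical input).
    """
    values = list(rms_by_image.values())
    # Step 1: fit once, using the rms values of *all* images in the batch.
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    # Step 2: reassess every image against that single fit.
    return sorted(
        image_id
        for image_id, rms in rms_by_image.items()
        if abs(rms - mu) > n_sigma * sigma
    )


# Ten well-behaved images plus one obvious outlier (made-up values).
rms_values = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.02, 0.98, 1.03, 0.97, 5.0]
rms_by_image = dict(enumerate(rms_values))

rejected = batch_rms_rejection(rms_by_image, n_sigma=2.0)
```

Note that in this naive version the outliers are included in the fit and inflate sigma, which is why the example uses a tighter `n_sigma`. A clipped or more robust fit would probably be preferable in practice.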

Current situation:

Requirement for batch mode:

n.b. the streaming mode is likely to be inefficient and to need speeding up. I propose that we don't refit for every image, but only every n images, where n is the rms_est_history number.
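As a sketch of the proposal above, the class below refits only once every `history` accepted images rather than on every image. The class and attribute names are hypothetical, the fit is again a simple mean/standard deviation estimate, and images arriving before the first fit are returned as `None` (not yet assessable by the histogram check) rather than being rejected.

```python
import statistics
from collections import deque


class StreamingRmsCheck:
    """Histogram-style rms check that refits every `history` accepted images."""

    def __init__(self, history, n_sigma=3.0):
        self.history = history               # plays the role of rms_est_history; must be >= 2
        self.n_sigma = n_sigma
        self.recent = deque(maxlen=history)  # most recent accepted rms values
        self.since_fit = 0                   # accepted images since the last fit
        self.fit = None                      # (mu, sigma) once a fit exists
        self.n_fits = 0

    def check(self, rms):
        """Return True/False once a fit exists, or None before the first fit."""
        verdict = None
        if self.fit is not None:
            mu, sigma = self.fit
            verdict = abs(rms - mu) <= self.n_sigma * sigma
        if verdict is not False:
            # Only images that were not rejected feed the next fit.
            self.recent.append(rms)
            self.since_fit += 1
        if self.since_fit >= self.history:
            # Refit once per `history` accepted images, not per image.
            self.fit = (statistics.mean(self.recent), statistics.stdev(self.recent))
            self.since_fit = 0
            self.n_fits += 1
        return verdict


checker = StreamingRmsCheck(history=5)
stream = [1.0, 1.1, 0.9, 1.05, 0.95, 1.02, 5.0]
results = [checker.check(rms) for rms in stream]
```

Here the first five images only build up the history, the sixth is accepted against the first fit, and the seventh is rejected. Only one fit is performed for the seven images.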

AntoniaR commented 3 years ago

Doing the Gaussian fit on all the images will require changing the current logic flow in TraP for batch mode. The current setup works fine for streaming mode.

Current setup:

Setup needed for batch mode:

As this is a change in the main logic of TraP, I propose we leave it for now and just ensure that the basic quality control steps are included for all images.

AntoniaR commented 3 years ago

This issue is partially addressed by pull request #591, but there remains an issue with the historical rms in batch mode.

AntoniaR commented 3 weeks ago

This code is currently very inefficient and not working correctly. This is a very useful option for R7 so we should assess how easy it would be to implement correctly.