Background
This document serves as a plan for the final publication of my PhD. This work is a continuation of the novelty-based detection scheme proposed in my last two publications. Here we discuss some of the foreseeable difficulties in both constructing the dataset and training models for anomaly detection. The following sections are in order of completion, i.e. creating the dataset needs to happen before training the models.
Dataset creation
We need to translate the spreadsheet-based labels into spectrograms that can be used for training the anomaly detection models. I began this process late last year, but discovered a bug in the station-name extraction code. The following to-do list serves as a guide to the steps that remain.
TODO
[x] Pull Jorrit's patch
[x] Separate the anomalies from autocorrelations and cross correlations
Maybe it is sufficient to detect anomalies in the autocorrelation spectra only, because then we can assume that all baselines containing that station will also contain the anomaly.
[x] Ensure that there is only 1 type of anomaly per class.
[x] There should be only 1 training set for all the anomalies; this training set should contain no anomalies
Should the training set consist only of autocorrelations? That would mean we have no phase information.
[x] To detect only one specific anomaly we would need to sample anomalies from the other classes into the training set. The scheme I propose is shown below.
One of the problems with this approach is that if not all other known anomalies are contained within the training set, then we may detect an unseen anomaly as the class we are trying to detect
For example, if we trained an oscillating-tile detector using only the two classes shown in the diagram below, and the model were then exposed to lightning, it might call it an oscillating tile
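As a concrete version of this sampling scheme, a minimal sketch follows; the label strings and the 80/20 split of the normal data are my assumptions, not fixed choices.

```python
import numpy as np

# Anomaly class labels as used in these notes; "normal" marks inlier data.
ANOMALIES = ["high_noise_element", "data_loss",
             "oscillating_tile", "scintillation"]

def leave_one_anomaly_split(X, y, target, seed=0):
    """Build a train/test split for a detector of one specific anomaly.

    Train: normal spectrograms plus every *other* known anomaly, so that
    only `target` should appear novel at test time.
    Test:  the target anomaly (positives) plus held-out normals (negatives).
    `y` is an array of string labels, one per spectrogram.
    """
    rng = np.random.default_rng(seed)
    normal = rng.permutation(np.flatnonzero(y == "normal"))
    n_test = len(normal) // 5  # hold out 20% of the normal data for testing
    others = np.flatnonzero(~np.isin(y, ["normal", target]))

    train_idx = np.concatenate([normal[n_test:], others])
    test_idx = np.concatenate([normal[:n_test], np.flatnonzero(y == target)])
    return X[train_idx], X[test_idx], (y[test_idx] == target).astype(int)
```

Note that any anomaly type missing from ANOMALIES never reaches the training set, which is exactly the failure mode described above.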
Other considerations
Focus on 4 anomalies initially
High noise element
Data loss
Oscillating tile
Scintillation
then expand to the rarer classes that may require higher-resolution data
Unlike the RFI work, the labels only need to be on a per-spectrogram level
There will be multiple testing classes but 1 training class (that contains only "normal" LOFAR data).
Multi-class anomaly detection methodology for LOFAR
Train with a single anomaly class excluded (i.e. on normal data plus all the others), then evaluate how well the model detects that held-out class
This means that to detect 4 different anomalies we need 4 different models trained on completely different data (similar to how we evaluated on MNIST in the NLN paper)
This may also be an opportunity for efficient computing: because the models are independent they can be run in parallel, which could offer some interesting implementation details (see the sketch below)
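A minimal sketch of this one-model-per-anomaly protocol, reusing leave_one_anomaly_split from the dataset section; train_detector is a hypothetical callable standing in for whatever model we settle on (e.g. NLN), returning an object whose score method maps spectrograms to scalar novelty scores.

```python
from sklearn.metrics import roc_auc_score

def evaluate_per_anomaly(X, y, train_detector):
    """Train one independent detector per anomaly class and report AUROC,
    mirroring the per-digit MNIST protocol from the NLN paper."""
    results = {}
    for target in ANOMALIES:
        X_tr, X_te, y_te = leave_one_anomaly_split(X, y, target)
        model = train_detector(X_tr)  # hypothetical training callable
        results[target] = roc_auc_score(y_te, model.score(X_te))
    return results
```

Since the four detectors share no state, the loop body parallelises trivially across processes or GPUs.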
Model selection
Evaluate NLN on this dataset
A limitation of the method: we predict a scalar for a given input spectrogram, and using AEs for this means that we need to average the reconstruction error, which will probably cause poor sensitivity
I think a single mapping to a scalar using self-supervised or contrastive losses will work best
Investigate alternative self-supervised losses that produce a scalar per spectrogram without averaging the reconstruction error
We have extra information about the baseline (get a model to predict which baseline an input is from; see the sketch after this list)
We can include another subtask?
Do we use a contrastive loss? Are the spectrograms sufficiently different to do so? Maybe if we include a contrastive loss per baseline?
Get the model to fill in the blanks, as in the paper we referenced in our NLN paper?
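As a sketch of the baseline-prediction subtask mentioned above (the architecture and the score definition are my assumptions): a small encoder classifies which baseline a spectrogram came from, and the per-input cross-entropy on this auxiliary task gives one scalar per spectrogram with no reconstruction-error averaging.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BaselinePredictor(nn.Module):
    """Self-supervised subtask: predict the originating baseline."""
    def __init__(self, n_baselines, in_channels=1):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, n_baselines)

    def forward(self, x):
        return self.head(self.encoder(x))

def anomaly_score(model, x, baseline_ids):
    """One scalar per spectrogram: the subtask's surprise.

    A spectrogram the model cannot attribute to its true baseline is
    presumed to look unlike the anomaly-free training data.
    """
    with torch.no_grad():
        return F.cross_entropy(model(x), baseline_ids, reduction="none")
```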
Other thoughts
Timeline? When do I need to start writing my thesis?
Potential venues: a short conference paper at SPIE, a journal paper for MNRAS?
I simulated HERA data without RFI and tried to formulate a subtask-based learning task
I used a ResNet-52 and trained it from scratch
Initially I tried to predict the baseline pair for the 7-element simulated array shown below (i.e. the string "ant 7, ant 9")
However, due to the symmetry in the array, the model never learnt much
So, as a way to resolve the symmetry, I trained on baseline distance instead and the model learnt something (95% accuracy on test data)
Below is the projection using t-SNE
It is clear that there are 3 clusters of spectrograms; these correspond to the 3 distances: 0 (for autocorrelations), 14.6, and 29.2
Within these clusters there should also be some distribution of features, but I have not tested this yet
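A minimal sketch of the distance-label construction and the projection; the antenna positions and encoder features are assumed to come from the simulation setup above.

```python
import numpy as np
from sklearn.manifold import TSNE

def distance_labels(pairs, positions):
    """Map each baseline pair (i, j) to its antenna separation.

    Unlike the raw (i, j) strings, distance is invariant under the
    array's symmetries, which is what let the model learn here.
    `positions` is an (n_ant, 2) array of antenna coordinates.
    """
    return np.array([np.linalg.norm(positions[i] - positions[j])
                     for i, j in pairs])

def project_features(features):
    """2-D t-SNE of encoder features; for the 7-element array this
    separates into the three clusters at distances 0, 14.6 and 29.2."""
    return TSNE(n_components=2, perplexity=30,
                init="pca").fit_transform(features)
```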
To-do
Create an evaluation method to see whether these models are better suited to anomaly detection than the standard AE
I think this can be done within HERA as we did in the first paper: create the individual labels per spectrogram and remove one class from the training set to see how well the model can detect it (see the sketch below)
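A small comparison harness for that experiment, reusing evaluate_per_anomaly from the methodology section; the detector names and training callables are placeholders.

```python
def compare_detectors(X, y, detectors):
    """Run the same leave-one-anomaly-out protocol for several models,
    e.g. {"ae": train_ae, "subtask": train_baseline_predictor}, where
    each value is a hypothetical training callable as above."""
    return {name: evaluate_per_anomaly(X, y, fit)
            for name, fit in detectors.items()}
```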