Background
This document serves as a plan for the final publication of my PhD. This work is a continuation of the novelty-based detection scheme proposed in my last two publications. Here we discuss some of the foreseeable difficulties in both constructing the dataset and training models for anomaly detection. The following sections are in order of completion, i.e. creating the dataset needs to happen before training the models.
Dataset creation
We need to translate the spreadsheet-based labels into spectrograms that can be used for training the anomaly detection models. I began this process late last year, but discovered a bug in the station-name extraction code. The following to-do list serves as a guide to the steps that remain.
TODO
[x] Pull Jorrit's patch
[x] Separate the anomalies from autocorrelations and cross correlations
Maybe it is sufficient to detect anomalies in the autocorrelation spectra only, because then we can assume that all baselines containing that station will also contain the anomaly.
[x] Ensure that there is only 1 type of anomaly per class.
[x] There should be only 1 training set for all the anomalies; this training set should contain no anomalies
Should the training set consist only of autocorrelations? That would mean we have no phase information.
[x] To detect only one specific anomaly we would need to sample anomalies from the other classes into the training set. The scheme I propose is shown below.
One of the problems with this approach is that if not all other known anomalies are contained within the training set, then we may detect an unseen anomaly as the class we are trying to detect
For example, if we trained an oscillating-tile detector using only the two classes shown in the diagram below, and the model were then exposed to lightning, it might call it an oscillating tile
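As a concrete version of this sampling scheme, a minimal sketch follows; the label strings and the 80/20 split of the normal data are my assumptions, not fixed choices.

```python
import numpy as np

# Anomaly class labels as used in these notes; "normal" marks inlier data.
ANOMALIES = ["high_noise_element", "data_loss",
             "oscillating_tile", "scintillation"]

def leave_one_anomaly_split(X, y, target, seed=0):
    """Build a train/test split for a detector of one specific anomaly.

    Train: normal spectrograms plus every *other* known anomaly, so that
    only `target` should appear novel at test time.
    Test:  the target anomaly (positives) plus held-out normals (negatives).
    `y` is an array of string labels, one per spectrogram.
    """
    rng = np.random.default_rng(seed)
    normal = rng.permutation(np.flatnonzero(y == "normal"))
    n_test = len(normal) // 5  # hold out 20% of the normal data for testing
    others = np.flatnonzero(~np.isin(y, ["normal", target]))

    train_idx = np.concatenate([normal[n_test:], others])
    test_idx = np.concatenate([normal[:n_test], np.flatnonzero(y == target)])
    return X[train_idx], X[test_idx], (y[test_idx] == target).astype(int)
```

Note that any anomaly type missing from ANOMALIES never reaches the training set, which is exactly the failure mode described above.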
Other considerations
Focus on 4 anomalies initially
High noise element
Data loss
Oscillating tile
Scintillation
then expand to the rarer classes that may require higher-resolution data
Unlike the RFI work, the labels only need to be on a per-spectrogram level
There will be multiple testing classes but 1 training class (that contains only "normal" LOFAR data).
Multi-class anomaly detection methodology for LOFAR
Train with a single anomaly class excluded (i.e. on normal data plus all the others), then evaluate how well the model detects that held-out class
This means that to detect 4 different anomalies we need 4 different models trained on completely different data (similar to how we evaluated on MNIST in the NLN paper)
This may also be an opportunity for efficient computing: because the models are independent they can be run in parallel, which could offer some interesting implementation details (see the sketch below)
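A minimal sketch of this one-model-per-anomaly protocol, reusing leave_one_anomaly_split from the dataset section; train_detector is a hypothetical callable standing in for whatever model we settle on (e.g. NLN), returning an object whose score method maps spectrograms to scalar novelty scores.

```python
from sklearn.metrics import roc_auc_score

def evaluate_per_anomaly(X, y, train_detector):
    """Train one independent detector per anomaly class and report AUROC,
    mirroring the per-digit MNIST protocol from the NLN paper."""
    results = {}
    for target in ANOMALIES:
        X_tr, X_te, y_te = leave_one_anomaly_split(X, y, target)
        model = train_detector(X_tr)  # hypothetical training callable
        results[target] = roc_auc_score(y_te, model.score(X_te))
    return results
```

Since the four detectors share no state, the loop body parallelises trivially across processes or GPUs.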
Model selection
Evaluate NLN on this dataset
A limitation of the method: we predict a scalar for a given input spectrogram, and using AEs for this means that we need to average the reconstruction error, which will probably cause poor sensitivity
I think a single mapping to a scalar using self-supervised or contrastive losses will work best
Investigate alternative self-supervised losses that produce a scalar per spectrogram without averaging the reconstruction error
We have extra information about the baseline (get a model to predict which baseline an input is from; see the sketch after this list)
We can include another subtask?
Do we use a contrastive loss? Are the spectrograms sufficiently different to do so? Maybe if we include a contrastive loss per baseline?
Get the model to fill in the blanks, as in the paper we referenced in our NLN paper?
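As a sketch of the baseline-prediction subtask mentioned above (the architecture and the score definition are my assumptions): a small encoder classifies which baseline a spectrogram came from, and the per-input cross-entropy on this auxiliary task gives one scalar per spectrogram with no reconstruction-error averaging.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BaselinePredictor(nn.Module):
    """Self-supervised subtask: predict the originating baseline."""
    def __init__(self, n_baselines, in_channels=1):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, n_baselines)

    def forward(self, x):
        return self.head(self.encoder(x))

def anomaly_score(model, x, baseline_ids):
    """One scalar per spectrogram: the subtask's surprise.

    A spectrogram the model cannot attribute to its true baseline is
    presumed to look unlike the anomaly-free training data.
    """
    with torch.no_grad():
        return F.cross_entropy(model(x), baseline_ids, reduction="none")
```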
Other thoughts
Timeline? When do I need to start writing my thesis?
Potential venues: a short conference paper at SPIE, a journal paper for MNRAS?
I simulated HERA data without RFI and tried to formulate a subtask-based learning task
I used a ResNet-52 and trained it from scratch
Initially I tried to predict the baseline pair for the 7-element simulated array shown below (i.e. the string "ant 7, ant 9")
However, due to the symmetry in the array, the model never learnt much
So, as a way to resolve the symmetry, I trained on baseline distance instead and the model learnt something (95% accuracy on test data)
Below is the projection using t-SNE
It is clear that there are 3 clusters of spectrograms; these correspond to the 3 distances: 0 (for autocorrelations), 14.6, and 29.2
Within these clusters there should also be some distribution of features, but I have not tested this yet
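A minimal sketch of the distance-label construction and the projection; the antenna positions and encoder features are assumed to come from the simulation setup above.

```python
import numpy as np
from sklearn.manifold import TSNE

def distance_labels(pairs, positions):
    """Map each baseline pair (i, j) to its antenna separation.

    Unlike the raw (i, j) strings, distance is invariant under the
    array's symmetries, which is what let the model learn here.
    `positions` is an (n_ant, 2) array of antenna coordinates.
    """
    return np.array([np.linalg.norm(positions[i] - positions[j])
                     for i, j in pairs])

def project_features(features):
    """2-D t-SNE of encoder features; for the 7-element array this
    separates into the three clusters at distances 0, 14.6 and 29.2."""
    return TSNE(n_components=2, perplexity=30,
                init="pca").fit_transform(features)
```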
To-do
Create an evaluation method to see whether these models are better suited to anomaly detection than the standard AE
I think this can be done within HERA as we did in the first paper: create the individual labels per spectrogram and remove one class from the training set to see how well the model can detect it (see the sketch below)
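A small comparison harness for that experiment, reusing evaluate_per_anomaly from the methodology section; the detector names and training callables are placeholders.

```python
def compare_detectors(X, y, detectors):
    """Run the same leave-one-anomaly-out protocol for several models,
    e.g. {"ae": train_ae, "subtask": train_baseline_predictor}, where
    each value is a hypothetical training callable as above."""
    return {name: evaluate_per_anomaly(X, y, fit)
            for name, fit in detectors.items()}
```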