Closed by mesarcik 1 year ago
Model | Outlier | F1-Score |
---|---|---|
Resnet18 | oscillating_tile | 0.0229 |
Resnet18 | real_high_noise | 0.4138 |
Resnet18 | electric_fence | 0.0292 |
Resnet18 | data_loss | 0.1576 |
Resnet18 | lightning | 0.1273 |
Resnet18 | strong_radio_emitter | 0.5069 |
Resnet18 | solar_storm | 0.0696 |
Class | Occurrence after subsampling | # Samples |
---|---|---|
oscillating_tile | 0.01 | 4 |
real_high_noise | 0.13 | 52 |
electric_fence | 0.01 | 4 |
data_loss | 0.06 | 24 |
lightning | 0.05 | 20 |
strong_radio_emitter | 0.2 | 80 |
solar_storm | 0.02 | 8 |
normal | 0.52 | 400 |
Class | oscillating_tile | high_noise | fence | data_loss | lightning | radio_source | solar | normal |
---|---|---|---|---|---|---|---|---|
oscillating_tile | - | 2 | 0 | 2 | 0 | 0 | 0 | 0 |
high_noise | 1 | - | 3 | 9 | 14 | 15 | 1 | 150 |
fence | 1 | 7 | - | 0 | 5 | 0 | 0 | 11 |
data_loss | 8 | 0 | 0 | - | 1 | 2 | 0 | 9 |
lightning | 0 | 3 | 1 | 0 | - | 0 | 0 | 0 |
radio_source | 0 | 20 | 0 | 11 | 9 | - | 0 | 23 |
solar | 0 | 0 | 0 | 0 | 1 | 0 | - | 0 |
normal | 89 | 5 | 5 | 2 | 38 | 0 | 0 | - |
Two things are very clear from this confusion matrix: 1) the high-noise-element class is poorly labelled, with the vast majority of its misclassifications coming from the normal class, and 2) RFI in the normal class is being classified as oscillating tile.
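Confusion matrices like the one above can be produced directly with scikit-learn; the labels and predictions below are purely illustrative, not taken from the actual experiment:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical per-sample true labels and model predictions.
y_true = ["normal", "normal", "high_noise", "fence", "normal"]
y_pred = ["high_noise", "normal", "normal", "fence", "high_noise"]
labels = ["normal", "high_noise", "fence"]  # fixes row/column order

cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)  # rows = true class, columns = predicted class
```

Passing `labels=` explicitly keeps the class ordering stable across runs, which makes tables like the one above reproducible.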
Interestingly, as a follow-up experiment I removed the high_noise_element class from the SSL detector and obtained perfect detection!
Additionally, I ran some experiments with the supervised method excluding the high-noise class, and the normal-detection results increased by roughly 30%:
Model | AUPRC | F1 |
---|---|---|
Supervised (no high noise) | 0.9478 | 0.9335 |
SSL (no high noise) | 0.9693 | 0.9236 |
Supervised (high noise) | 0.9479 | 0.9024 |
SSL (high noise) | 0.9447 | 0.9135 |
**Note:** I evaluate with the `!= normal` class; I expected this to give the same result as `== normal_class`, but it does not.
It seems that our unsupervised detector does not do as well as expected.
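The `!= normal` binarisation can be sketched as follows; `f1_score` and `average_precision_score` (which computes AUPRC) are standard scikit-learn, while the labels, scores, and 0.5 threshold are made-up illustrations:

```python
import numpy as np
from sklearn.metrics import f1_score, average_precision_score

# Hypothetical per-sample class labels and anomaly scores.
labels = np.array(["normal", "lightning", "normal", "data_loss", "normal"])
scores = np.array([0.1, 0.9, 0.3, 0.8, 0.2])  # higher = more anomalous

# Binarise: anything that is not "normal" counts as anomalous.
y_true = (labels != "normal").astype(int)
y_pred = (scores > 0.5).astype(int)

print(f1_score(y_true, y_pred))                 # F1 on the anomalous class
print(average_precision_score(y_true, scores))  # AUPRC
```

If `== normal_class` is binarised instead, the positive class flips, so precision and recall (and hence F1) are computed over a different class; the two evaluations only coincide when the metric is symmetric in the classes.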
[x] Refactor training with the new dataloader structure.
Below you can see what happens when we use the unsupervised mask during the multi-class classification problem.
Generally speaking, the unsupervised detector misclassifies around 70 normal samples and 70 anomalous samples (about a 5% error rate).
This means that when we use it to mask the classifier's output, we get a decrease of a few percent in per-class detection.
The question is, does the unsupervised method make the system more useful?
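A minimal sketch of the masking step described above, assuming the unsupervised detector emits a boolean anomaly flag per sample (the function name and example data are hypothetical):

```python
import numpy as np

def mask_predictions(class_preds, detector_flags, normal_label="normal"):
    """Override the multi-class prediction with the normal label wherever
    the unsupervised detector did not flag the sample as anomalous."""
    return np.where(detector_flags, class_preds, normal_label)

class_preds = np.array(["lightning", "data_loss", "solar_storm"])
detector_flags = np.array([True, False, True])  # detector calls sample 1 normal

print(mask_predictions(class_preds, detector_flags))
```

With a ~5% detector error rate, each false negative here silently converts a correctly classified anomaly into `normal`, which is consistent with the few-percent per-class drop noted above.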
High noise element:
- 1st order: high-SNR feature
- 3rd order: low SNR

Strong radio source:
Data loss:
Class | # samples | osc | 1st noise | 3rd noise | 1st data | 3rd data | lightning | galactic | sidelobes | solar | normal |
---|---|---|---|---|---|---|---|---|---|---|---|
oscillating tile | 9 | - | 0 | 3 | 2 | 0 | 0 | 2 | 2 | 0 | 0 |
1st high noise | 24 | 1 | - | 9 | 0 | 1 | 6 | 0 | 2 | 1 | 4 |
3rd high noise | 24 | 1 | 2 | - | 1 | 0 | 8 | 0 | 2 | 1 | 9 |
1st data loss | 49 | 0 | 0 | 0 | - | 49 | 0 | 0 | 0 | 0 | 0 |
3rd data loss | 24 | 0 | 0 | 6 | 4 | - | 0 | 0 | 0 | 0 | 14 |
lightning | 24 | 0 | 12 | 9 | 0 | 0 | - | 0 | 0 | 0 | 3 |
galactic plane | 34 | 0 | 0 | 1 | 1 | 0 | 0 | - | 12 | 0 | 20 |
source in sidelobes | 49 | 4 | 0 | 0 | 0 | 27 | 0 | 8 | - | 0 | 10 |
solar storm | 49 | 0 | 0 | 46 | 0 | 0 | 3 | 0 | 0 | - | 0 |
Model | Normal F1 | Anomalous F1 |
---|---|---|
Supervised | 0.91885 ± 0.01669 | 0.85912 ± 0.01717 |
SSL | 0.93704 ± 0 | 0.88096 ± 0 |
Note: the F1 score has been replaced by the F2 score, since it weights recall more heavily and is therefore more sensitive to missed anomalies.
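The difference between the two metrics is easy to see with scikit-learn's `fbeta_score` (the toy labels below are illustrative):

```python
from sklearn.metrics import fbeta_score

y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0]  # two missed anomalies

# F1 weights precision and recall equally; F2 weights recall twice as
# much, so missed anomalies are penalised more heavily.
f1 = fbeta_score(y_true, y_pred, beta=1)
f2 = fbeta_score(y_true, y_pred, beta=2)
print(f1, f2)  # here precision = 1.0 but recall = 0.5, so F2 < F1
```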
OOD Class name | Supervised | Supervised + SSL | Supervised + SSL (random Resnet weights) | Supervised + dists |
---|---|---|---|---|
oscillating_tile | 0.272 | 0.643 | 0.384 | 0.531 |
first_order_high_noise | 0.58 | 0.417 | 0.434 | 0.49 |
third_order_high_noise | 0.473 | 0.567 | 0.447 | 0.212 |
first_order_data_loss | 0.103 | 0.73 | 0.145 | 0.932 |
third_order_data_loss | 0.105 | 0.236 | 0.127 | 0.235 |
lightning | 0.49 | 0.868 | 0.251 | 0.653 |
galactic_plane | 0.491 | 0.592 | 0.363 | 0.325 |
source_in_sidelobes | 0.724 | 0.699 | 0.634 | 0.417 |
solar_storm | 0.837 | 0.895 | 0.448 | 0.774 |
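One common way to build a distance-based OOD score like the "Supervised + dists" column is to measure each test sample's distance to its nearest training embeddings; this is a generic sketch under that assumption, not the exact method used here, and the feature dimensions and thresholds are made up:

```python
import numpy as np

def knn_ood_score(train_feats, test_feats, k=1):
    """OOD score = mean distance to the k nearest training embeddings;
    larger distances suggest the sample is out-of-distribution."""
    # Pairwise Euclidean distances, shape (n_test, n_train).
    d = np.linalg.norm(test_feats[:, None, :] - train_feats[None, :, :], axis=-1)
    return np.sort(d, axis=1)[:, :k].mean(axis=1)

rng = np.random.default_rng(0)
train = rng.normal(0, 1, (100, 8))   # in-distribution embeddings
in_dist = rng.normal(0, 1, (5, 8))   # held-out in-distribution samples
ood = rng.normal(6, 1, (5, 8))       # shifted cluster, acting as OOD

scores = knn_ood_score(train, np.vstack([in_dist, ood]))
print(scores[:5].mean() < scores[5:].mean())  # OOD samples score higher
```

Running the same scoring on embeddings from randomly initialised Resnet weights would be expected to degrade the separation, consistent with the "random Resnet weights" column above.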
Class | # Samples (exclusive) | # Samples (inclusive) |
---|---|---|
oscillating_tile | 50 | 61 |
first_order_high_noise | 65 | 75 |
third_order_high_noise | 111 | 153 |
first_order_data_loss | 168 | 169 |
third_order_data_loss | 209 | 342 |
lightning | 323 | 402 |
galactic_plane | 249 | 590 |
source_in_sidelobes | 165 | 456 |
solar_storm | 147 | 147 |
normal | 7413 | n/a |
third order high noise
Description
We have made progress on dataset creation, self-supervised anomaly detection, and supervised anomaly detection. However, several issues need to be addressed before this work is ready for publication.
Model fine-tuning
Things to try:
Outcome:
Finish results for URSI abstract:
Label the last few examples in the dataset:
Supervised model: