Closed andreistoian closed 3 years ago
Hi,
Regarding the accuracy, what is the accuracy in N2D2 before the export (with ./n2d2 onnx.ini -test)?
I ran it with:
bin/n2d2 ~sound1d-onnx.ini -test -seed 1 -w /dev/null
and I get
Testing #348 38.40%
Final recognition rate: 38.40% (error rate: 61.60%)
Sensitivity: 55.95% / Specificity: 94.37% / Precision: 44.80%
Accuracy: 89.73% / F1-score: 47.24% / Informedness: 50.32%
What is the recognition rate and how is it different from 'Accuracy'?
Hi,
The accuracy problem comes from a bad label mapping of the output of the network. The label mapping should be the following:
/down 0
/go 1
/left 2
/no 3
/off 4
/on 5
/right 6
/silence 7
/stop 8
/unknown 9
/up 10
/yes 11
But in fact, since the /silence folder is empty, no image with label /silence is loaded in the database driver and this label is not created (this is the current behaviour of N2D2, which does not create a label for an empty folder). As a result, the following classes are shifted and mapped to the wrong output.
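A minimal sketch of the shift described above, in Python. This is not N2D2 code: the function name build_label_mapping and the file counts are hypothetical. It only assumes, as stated above, that labels are numbered in folder order and that an empty folder gets no label.

```python
# Illustration only, not N2D2 code. Labels are numbered in folder order,
# and a folder with no files gets no label (the behaviour described above).
FOLDERS = ["/down", "/go", "/left", "/no", "/off", "/on",
           "/right", "/silence", "/stop", "/unknown", "/up", "/yes"]


def build_label_mapping(file_counts):
    """Assign consecutive label IDs to the non-empty folders, in order."""
    mapping = {}
    for folder in FOLDERS:
        if file_counts.get(folder, 0) > 0:
            mapping[folder] = len(mapping)
    return mapping


# Hypothetical file counts: first every folder has files, then /silence is empty.
counts = {folder: 10 for folder in FOLDERS}
expected = build_label_mapping(counts)
counts["/silence"] = 0
actual = build_label_mapping(counts)

# Every class after /silence ends up one output index too low.
for folder in FOLDERS:
    print(folder, expected[folder], actual.get(folder, "missing"))
```

With /silence empty, /stop is assigned output 7 instead of 8, and so on down to /yes, which is why the classes after /silence are compared against the wrong network output.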
Regarding the score metrics, some remarks:
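As an illustration of how the recognition rate and 'Accuracy' can differ (a sketch, not taken from the N2D2 source): the figures reported in this thread are consistent with 'Accuracy' being the mean of the per-class one-vs-rest accuracies, while the recognition rate is the fraction of stimuli classified correctly. Under that assumption, a misclassified stimulus counts as an error for only 2 of the K classes, so Accuracy = 1 - 2 * (1 - recognition rate) / K. With K = 12 this gives 89.73% for a recognition rate of 38.40%, and 95.91% for 75.43%. The function names and the toy confusion matrix below are hypothetical.

```python
def recognition_rate(confusion):
    """Fraction of stimuli on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def mean_one_vs_rest_accuracy(confusion):
    """Mean over classes of (TP + TN) / total, each class taken one-vs-rest."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracies = []
    for c in range(k):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                          # class c, predicted as other
        fp = sum(confusion[r][c] for r in range(k)) - tp     # other, predicted as c
        tn = total - tp - fn - fp
        accuracies.append((tp + tn) / total)
    return sum(accuracies) / k


def accuracy_from_recognition_rate(rate, num_classes):
    """Closed form: each error is counted by exactly 2 of the K classes."""
    return 1.0 - 2.0 * (1.0 - rate) / num_classes


# Toy 3-class confusion matrix (rows: true class, columns: predicted class).
toy = [[2, 1, 0],
       [0, 3, 0],
       [1, 1, 2]]
print(recognition_rate(toy), mean_one_vs_rest_accuracy(toy))

# Values reported in this thread, assuming K = 12 output classes.
print(accuracy_from_recognition_rate(0.3840, 12))
print(accuracy_from_recognition_rate(0.7543, 12))
```

The averaged metric stays high even when most stimuli are misclassified, because each class's true negatives dominate its one-vs-rest accuracy.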
Finally, the calibration issue is the same as the one explained in issue #80. We are still thinking about possible solutions in this case that would not cause precision loss.
Actually, I just tested the CPP export and it works fine! Using the command: ./n2d2 sound1d-onnx.ini -seed 1 -w /dev/null -test -export CPP -calib -1
The average recall is 80% in INT8 vs. 83% before quantization.
No calibration issue here (as expected: it should not happen for a mono-branch network).
I'm sorry but I'm not able to fully reproduce the working behavior:
To fix the 'silence' class issue I added 60 wavs with silence to the directory.
[database]
Learn=0
Validation=0
Test=1
Depth=1
The PyTorch model has 81% accuracy, while running N2D2 with -test -seed 1 -w /dev/null gives:
Testing database size: 871 images
Notice: stimuli depth is 64F (according to database first stimulus)
[LOG] Stimuli transformations flow (transformations.png)
[LOG] Network graph (sound1d-onnx.ini.png)
Warning: using box for unknown shape cylinder
[LOG] Network SVG graph (sound1d-onnx.ini.svg)
[LOG] Network stats (stats/*)
[LOG] Solvers scheduling (schedule/*)
[LOG] Layer's receptive fields (receptive_fields.log)
[LOG] Labels mapping (*.Target/labels_mapping.log)
[LOG] Labels legend (*.Target/labels_legend.png)
[LOG] Learn frame samples (frames/frame*)
[LOG] Test frame samples (frames/test_frame*)
[10:17.89 4:7.73 7:7.62 5:2.84 9:1.87 ]
Testing #100 93.07%
Testing #200 94.53%
Testing #300 95.02%
Testing #400 86.03%
Testing #500 81.84%
Testing #600 82.70%
Testing #700 79.46%
Testing #800 76.65%
Testing #870 75.43%
Final recognition rate: 75.43% (error rate: 24.57%)
Sensitivity: 83.67% / Specificity: 97.79% / Precision: 72.44%
Accuracy: 95.91% / F1-score: 75.57% / Informedness: 81.46%
I export the model to float32: models/ONNX/sound1d-onnx.ini -test -seed 1 -export CPP -nbbits -32 -w /dev/null. When I run 'run_export' (note that I need to change the Makefile flags to -O0 -g so that it does not crash), I get:
649.000000/866 (74.942263%)
650.000000/867 (74.971165%)
651.000000/868 (75.000000%)
651.000000/869 (74.913694%)
651.000000/870 (74.827586%)
652.000000/871 (74.856487%)
Score: 74.856487%
[database]
Learn=0
Validation=0.5
Test=0.5
Depth=1
I export to int8 with calibration on the whole validation set: models/ONNX/sound1d-onnx.ini -test -seed 1 -export CPP -calib -1 -w /dev/null. N2D2 takes 312 stimuli for calibration (I guess Nclasses * min(card(class_i))?) and, after a long time, I get:
Notice: stimuli depth is 64F (according to database first stimulus)
Remove Dropout...
Fuse BatchNorm with Conv...
export_CPP_int8/stimuli_stats processing 312 stimuli
Fuse Padding...
Cross-layer equalization:
- eq. 35 and 33
- eq. 37 and 35
quant. range delta = 0.491025
export_CPP_int8/stimuli_stats processing 312 stimuli
Calculating calibration data range and histogram...
Calibration data 100/312
Calibration data 200/312
Calibration data 300/312
Quantization (8 bits)...
Quantizing free parameters:
- 17: 1.57456
- 19: 1.57456
- 20: 1.10175
- 22: 1.10175
- 23: 0.77638
- 25: 0.77638
- 26: 0.445755
- 28: 0.445755
- 29: 0.24595
- 31: 0.24595
- 33: 0.107331
- 35: 0.0528017
- 37: 0.0259759
Fuse scaling cells:
Quantizing activations:
- 17: prev=1, act=605.467, bias=1.57456
quant=63.251, global scaling=384.532 -> cell scaling=4.1115e-05
- 20: prev=384.532, act=885.995, bias=1.10175
quant=127, global scaling=804.168 -> cell scaling=0.00376515
- 23: prev=804.168, act=1939.13, bias=0.77638
quant=127, global scaling=2497.66 -> cell scaling=0.00253519
- 26: prev=2497.66, act=3749.77, bias=0.445755
quant=127, global scaling=8412.17 -> cell scaling=0.00233787
- 29: prev=8412.17, act=2682.97, bias=0.24595
quant=127, global scaling=10908.6 -> cell scaling=0.00607205
- 33: prev=10908.6, act=10430.1, bias=0.107331
quant=127, global scaling=97176.4 -> cell scaling=0.000883903
- 35: prev=97176.4, act=2751.02, bias=0.0528017
quant=127, global scaling=52100.9 -> cell scaling=0.0146863
- 37: prev=52100.9, act=3834.95, bias=0.0259759
quant=255, global scaling=147635 -> cell scaling=0.00138393
Fuse scaling cells:
- fuse: 17_rescale_act
- fuse: 20_rescale_act
- fuse: 23_rescale_act
- fuse: 26_rescale_act
- fuse: 29_rescale_act
- fuse: 33_rescale_act
- fuse: 35_rescale_act
- fuse: 37_rescale_act
Scaling approximation [3]:
- 17: 4.1115e-05
SINGLE_SHIFT: 2 ^ [- 14]
- 20: 0.00376515
SINGLE_SHIFT: 2 ^ [- 8]
- 23: 0.00253519
SINGLE_SHIFT: 2 ^ [- 8]
- 26: 0.00233787
SINGLE_SHIFT: 2 ^ [- 8]
- 29: 0.00607205
SINGLE_SHIFT: 2 ^ [- 7]
- 33: 0.000883903
SINGLE_SHIFT: 2 ^ [- 10]
- 35: 0.0146863
SINGLE_SHIFT: 2 ^ [- 6]
- 37: 0.00138393
SINGLE_SHIFT: 2 ^ [- 9]
Inputs quantization
Done!
..................
[3:0.00 4:0.00 1:0.00 0:0.00 2:0.00 ]
Testing #100 8.91%
Testing #200 7.46%
Testing #300 12.96%
Testing #400 11.22%
Testing #500 10.18%
Testing #558 9.48%
Final recognition rate: 9.48% (error rate: 90.52%)
Sensitivity: 12.37% / Specificity: 91.86% / Precision: 11.20%
Accuracy: 84.91% / F1-score: 7.73% / Informedness: 4.23%
Time elapsed: 17281.58 s
When I compile and run 'run_export' I get
Score: 14.625000%
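A side note on the 'Scaling approximation' part of the log above: the printed SINGLE_SHIFT exponents all match ceil(log2(scaling)). The sketch below checks this against the logged values and prints how far each power of two is from the scaling it replaces. This is an observation about this particular log, not a description of how N2D2 implements SINGLE_SHIFT, and the function name is hypothetical.

```python
import math

# (cell, scaling, exponent) triples copied from the export log above.
LOGGED = [
    ("17", 4.1115e-05, -14),
    ("20", 0.00376515, -8),
    ("23", 0.00253519, -8),
    ("26", 0.00233787, -8),
    ("29", 0.00607205, -7),
    ("33", 0.000883903, -10),
    ("35", 0.0146863, -6),
    ("37", 0.00138393, -9),
]


def single_shift_exponent(scaling):
    """Smallest integer e such that 2**e >= scaling (assumed rule, see lead-in)."""
    return math.ceil(math.log2(scaling))


for cell, scaling, logged_exponent in LOGGED:
    exponent = single_shift_exponent(scaling)
    # Relative difference between the power of two and the exact scaling.
    relative_error = 2.0 ** exponent / scaling - 1.0
    print(cell, exponent, exponent == logged_exponent, f"{relative_error:+.1%}")
```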
Please don't forget to delete the export_CPP_int8 folder before running a new export when a change has been made to the dataset partitioning or pre-processing. The problem was due to faulty stimuli in the dataset and bad partitioning compared to PyTorch. Considering the issue solved. Closing.
Hello,
I'm trying to compile a sound classification network using 1D convolution with the CPP export. With pytorch I get 81% accuracy on a small subset of data on which I also test and calibrate the N2D2 export.
assert(scaling <= 1.0);
It seems the scaling has value 248.938 (as do all 8 elements of mScalingPerOutput of the RectifierActivation class). The input WAV files contain floating-point values between -1 and 1 (mostly in the -0.05 to 0.05 range), loaded from FLOAT32 WAV files using the code given in #77.
Here is the code that exports the dataset, computes accuracy for the pytorch model and exports the ONNX. sound_demo.zip