hab-spc / hab-ml

Harmful algae bloom CNN detection model, developed in PyTorch

Train 20 Class HAB-Model #39

Closed: ktl014 closed this issue 5 years ago

ktl014 commented 5 years ago

Story:

As a data annotator/biologist/Product Owner, I want to annotate a model's predictions instead of labeling the entire dataset from scratch, which lightens the annotation workload. The first step, and the scope of this issue, is training the model.

Definition of Done:

Story closure acceptance will be determined by @ktl014

ktl014 commented 5 years ago

Evaluation Report

`stats_summary.txt` should contain the training and validation set sizes and the performance metrics.

└── HAB20_Model
    ├── loss.png
    ├── confusion.png
    └── stats_summary.txt
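
A minimal sketch of how the stats summary file could be produced, assuming the ground-truth and predicted labels are available as plain Python lists of integer class indices. The helper names (`accuracy`, `macro_f1`, `write_stats_summary`), the choice of macro-averaged F1, and the `key: value` line format are all hypothetical and are not taken from the hab-ml codebase.

```python
from pathlib import Path


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground-truth labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores.

    A class that never appears in the labels or the predictions scores 0.
    """
    scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


def write_stats_summary(out_dir, n_train, n_val, y_true, y_pred, num_classes):
    """Write set sizes and metrics to <out_dir>/stats_summary.txt."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    lines = [
        f"train_size: {n_train}",
        f"val_size: {n_val}",
        f"accuracy: {accuracy(y_true, y_pred):.4f}",
        f"macro_f1: {macro_f1(y_true, y_pred, num_classes):.4f}",
    ]
    path = out_dir / "stats_summary.txt"
    path.write_text("\n".join(lines) + "\n")
    return path
```

For the 20-class model, this would be called with `num_classes=20` and `out_dir="HAB20_Model"`, so the file lands next to `loss.png` and `confusion.png`.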

Table example

| Model | batch_size | max_seq_length | learning_rate | Epochs | training_time | Accuracy (test) | F1 (test) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| char+BiLSTM+CRF | 6 | - | 1E-03 | 15 | 57 min | 97.52% | 77.80% |
| BERT-Large | 6 | 128 | 2E-05 | 4 | 40 min | 98.31% | 81.23% |
| BERT-Large | 6 | 128 | 5E-05 | 4 | 39 min | 98.34% | 80.97% |
| BERT-Large | 2 | 256 | 2E-05 | 4 | 108 min | 98.20% | 80.33% |
| BERT-Large | 3 | 128 | 2E-05 | 4 | 58 min | 98.30% | 81.91% |
| BERT-Large | 3 | 256 | 2E-05 | 4 | 82 min | 98.27% | 81.58% |
| BERT-Large | 6 | 128 | 2E-05 | 3 | 55 min | 98.26% | 81.52% |
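
A table in this layout could be generated from per-run records rather than typed by hand. The sketch below assumes each run is stored as a dict keyed by the column names of the example table. `COLUMNS` and `to_markdown_table` are hypothetical names, and the sample row is copied from the example table, not from a HAB model run.

```python
COLUMNS = [
    "Model", "batch_size", "max_seq_length", "learning_rate",
    "Epochs", "training_time", "Accuracy (test)", "F1 (test)",
]


def to_markdown_table(rows):
    """Render a list of dicts keyed by COLUMNS as a markdown table.

    Missing values are shown as "-", as in the example table.
    """
    header = "| " + " | ".join(COLUMNS) + " |"
    divider = "| " + " | ".join("---" for _ in COLUMNS) + " |"
    body = [
        "| " + " | ".join(str(row.get(col, "-")) for col in COLUMNS) + " |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])


# Sample row taken from the example table.
runs = [{
    "Model": "BERT-Large", "batch_size": 6, "max_seq_length": 128,
    "learning_rate": "2E-05", "Epochs": 4, "training_time": "40 min",
    "Accuracy (test)": "98.31%", "F1 (test)": "81.23%",
}]
print(to_markdown_table(runs))
```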