nanoporetech / bonito

A PyTorch Basecaller for Oxford Nanopore Reads
https://nanoporetech.com/

Tuning pre-trained bonito model #131

Open ktan8 opened 3 years ago

ktan8 commented 3 years ago

We've been applying Bonito to some of our PromethION datasets and noticed that the model basecalls some repeat elements incorrectly. We think that giving the model more training examples of these repeat sequences might help it basecall them correctly.

However, we understand that retraining the model from scratch could take a long time. Is there a way for us to fine-tune one of the pre-trained Bonito models with our extra training examples instead?

Thanks!

iiSeymour commented 3 years ago

Good question @ktan8, this is possible. Use --pretrained to specify the model to fine-tune, and lower the learning rate:

bonito train --epochs 1 --lr 5e-4 --pretrained dna_r9.4.1@v3.2 --directory new-training-data fine-tuned-model
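
For anyone scripting this, here is a minimal sketch of the same command with each argument pulled out into a named variable, so the starting model, data directory, and learning rate are easy to change. The variable names and the RUN switch are hypothetical conveniences and not part of bonito. The flags are the ones quoted in the comment above, and the sketch assumes the subcommand is `bonito train`. By default it only prints the command.

```shell
#!/bin/sh
# Hypothetical wrapper around the fine-tuning command from this thread.
# All variable names are illustrative; only the flags come from the comment above.
PRETRAINED="dna_r9.4.1@v3.2"   # pre-trained model to start from
DATA_DIR="new-training-data"   # directory holding the extra training examples
OUT_DIR="fine-tuned-model"     # directory the fine-tuned model is written to
EPOCHS=1                       # a single pass over the new data
LR="5e-4"                      # learning rate suggested above for fine-tuning

# Assumption: the training subcommand is `bonito train`.
CMD="bonito train --epochs $EPOCHS --lr $LR --pretrained $PRETRAINED --directory $DATA_DIR $OUT_DIR"

# Dry run by default: print the command. Set RUN=1 to actually execute it.
if [ "${RUN:-0}" = "1" ]; then
    $CMD
else
    echo "$CMD"
fi
```

Running the script without RUN set prints the assembled command, so it can be checked before any training is started.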

ktan8 commented 3 years ago

Thanks!