nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

Preprocessing of raw signal data #715

Closed denisbeslic closed 3 months ago

denisbeslic commented 3 months ago

Hi,

I have a question about the preprocessing of the raw signal data. Is the raw signal only scaled, or do you apply some additional per-read normalization before model training and inference? The provided k-mer models are given as normalized expected values, so I would like to know whether basecalling treats the signal in a similar manner.

tijyojwad commented 3 months ago

Hi @denisbeslic, the dorado raw signal processing code can be found here: https://github.com/nanoporetech/dorado/blob/release-v0.6.0/dorado/read_pipeline/ScalerNode.cpp#L178
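
For readers comparing the two options raised in the question, here is a minimal Python sketch of the difference between plain scaling and per-read normalization. It is an illustration only, not a transcription of `ScalerNode.cpp`: the function names are hypothetical, the calibration fields (`offset`, `digitisation`, `adc_range`) are assumed to follow the usual per-channel calibration metadata stored with a read, and median/MAD is just one common per-read scheme. Check the linked source for what dorado actually does for a given model.

```python
from statistics import median


def scale_to_picoamps(raw, offset, digitisation, adc_range):
    """Plain scaling: convert raw ADC counts to picoamps.

    Assumes the usual calibration formula (raw + offset) * adc_range / digitisation.
    """
    factor = adc_range / digitisation
    return [(x + offset) * factor for x in raw]


def med_mad_normalize(signal, mad_factor=1.4826):
    """Per-read normalization (illustrative): center on the median and
    divide by the scaled median absolute deviation (MAD)."""
    shift = median(signal)
    mad = median([abs(x - shift) for x in signal])
    # Guard against a zero MAD on a flat signal.
    scale = max(mad * mad_factor, 1e-9)
    return [(x - shift) / scale for x in signal]


# Hypothetical read: five ADC samples with made-up calibration values.
raw = [100, 110, 120, 130, 140]
pa = scale_to_picoamps(raw, offset=10, digitisation=8192, adc_range=1024)
norm = med_mad_normalize(pa)
```

One property worth noting: median/MAD normalization is invariant to any affine change of the input with positive gain, so normalizing the raw counts and normalizing the picoamp values give the same result. This is why a per-read normalization step makes the model input independent of per-channel calibration differences.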