Confusion about labeled samples from the paper

haotianteng / Chiron

A basecaller for Oxford Nanopore Technologies' sequencers


Hello, I'm confused about how training is described in the paper. You were clear on how you partitioned the input signal data (sliding windows of length 300 with a step size of 30), but it was not clear how you partitioned the labels for these signals to produce the training outputs.
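
To make sure I have the input side right, here is a minimal sketch of how I understand the windowing. The function name `sliding_windows`, the placeholder signal, and the choice to drop a trailing partial window are my own assumptions, not something taken from the paper or this repo.

```python
def sliding_windows(signal, window=300, step=30):
    """Split a 1-D signal into overlapping windows.

    Each window holds `window` samples and starts `step` samples
    after the previous one.
    """
    # Assumption: a trailing partial window is dropped rather than padded.
    last_start = len(signal) - window
    return [signal[start:start + window]
            for start in range(0, last_start + 1, step)]


# Placeholder data standing in for a raw current trace.
signal = list(range(1000))
windows = sliding_windows(signal)
```

What I can't work out is which bases get attached to each of these windows as the training target.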

Can you elaborate on how you assigned a label to each length-300 signal segment for training? Do you expand the base sequence into k-mers in some fashion? The uppercase K used in the paper does not seem to be documented after it is introduced, and I'm also having a hard time finding it in this repo.

Confusion about labeled samples from the paper #88