rrwick / Deepbinner

a signal-level demultiplexer for Oxford Nanopore reads
GNU General Public License v3.0

min_signal_length when creating training data #32

Open aroelo opened 4 years ago

aroelo commented 4 years ago

Hi, I am creating a training set for Deepbinner and noticed that many of my reads are excluded because their signal length is too short.

When using the porechop command (deepbinner porechop porechop.out /path/to/fast5_dir > raw_training_data), 455782 of the 650483 reads (about 70%) are skipped for being too short.

I see that the default min_signal_length is 20000 and am thinking about lowering it so I don't lose that many reads, but I am unsure how this would affect Deepbinner's performance.
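To get a feel for the trade-off before changing anything, one option is to count how many reads would survive at a few candidate thresholds. The sketch below is only an illustration: reads_kept is a hypothetical helper, the lengths are made-up values, and it assumes a read is kept when its signal length is greater than or equal to the threshold (the exact comparison Deepbinner uses is not confirmed here). Collecting the real signal lengths from the fast5 files is left out.

```python
from bisect import bisect_left


def reads_kept(signal_lengths, thresholds):
    """Return {threshold: number of reads whose signal length >= threshold}.

    Hypothetical helper for illustration; it is not part of Deepbinner.
    """
    ordered = sorted(signal_lengths)
    total = len(ordered)
    # bisect_left gives the number of lengths strictly below the threshold,
    # so the remainder is the number of reads that would be kept.
    return {t: total - bisect_left(ordered, t) for t in thresholds}


# Made-up signal lengths, purely for demonstration.
example_lengths = [4500, 9800, 12000, 15000, 19999, 20000, 26000, 41000]

for threshold, kept in reads_kept(example_lengths, [20000, 15000, 10000]).items():
    print(f"min signal length {threshold}: {kept} of {len(example_lengths)} reads kept")
```

This only shows how many reads each threshold would retain; it says nothing about how classification accuracy changes with shorter signals.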

Is it a strict requirement for the signal length to be that long? I couldn't find more information about this parameter in the documentation.