nodrogluap / OpenDBA

GPU-accelerated Dynamic Time Warp (DTW) Barycenter Averaging

Nanopore Kmer Model and Consensus Sequence Generation with openDBA #21

Open N-damo opened 6 months ago

N-damo commented 6 months ago

Dear developers,

openDBA has many interesting applications. Nanopore provides a k-mer model for generating standard nanopore signals: given a sequence, the model yields the corresponding expected signal. Building a k-mer model traditionally requires sequencing a standard sequence, known as a de Bruijn sequence, that contains all possible k-mers. The current values of the k-mers are first annotated manually and then iteratively corrected. Can openDBA generate a consensus sequence from which this k-mer model could be obtained without manual annotation?

Best regards,
Li'anLin

nodrogluap commented 6 months ago

Hi,

Yes, it would be quite simple to build a k-mer model from the DTW path files generated by OpenDBA. For example, there are already scripts in the repo for finding bimodally distributed bases in a consensus signal. To capture the data, you would need to run a physical synthetic de Bruijn RNA through an experiment. For a sliding window of 5 RNA bases there are 4^5 = 1024 possible k-mers, so this would need to be a synthetic construct of about 1024 bases. In practice, generating synthetic RNA oligos longer than 120 nt is difficult and expensive, so you would probably want to create 10 overlapping 120-mer RNAs and then allow for 10 clusters in OpenDBA. Even an absolutely minimal experiment would cost more than USD 24,000 in custom RNA oligos. An alternative would be to design a custom in vitro transcription system, which would require considerably more benchwork.
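To make the construct design concrete, here is a minimal Python sketch of how one might generate a 5-mer de Bruijn sequence over the RNA alphabet and cut it into overlapping oligos. It is not part of OpenDBA: the function names (`de_bruijn`, `linearize`, `split_with_overlap`) and the overlap parameter are assumptions made for illustration, and a real design would also have to account for adapters, synthesis constraints, and poorly resolved read ends.

```python
def de_bruijn(alphabet: str, k: int) -> str:
    """Cyclic de Bruijn sequence (standard Lyndon-word construction).

    Read as a circle, every k-mer over `alphabet` occurs exactly once.
    """
    n = len(alphabet)
    a = [0] * (n * k)
    seq = []

    def db(t: int, p: int) -> None:
        if t > k:
            if k % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, n):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in seq)


def linearize(cyclic: str, k: int) -> str:
    """Append the first k-1 bases so the k-mers that wrap around the
    circle also appear in the linear molecule."""
    return cyclic + cyclic[:k - 1]


def split_with_overlap(seq: str, oligo_len: int, overlap: int) -> list:
    """Cut `seq` into fragments of at most `oligo_len` bases, where
    consecutive fragments share `overlap` bases.

    An overlap of at least k-1 keeps every k-mer intact in some fragment.
    """
    if not 0 <= overlap < oligo_len:
        raise ValueError("overlap must satisfy 0 <= overlap < oligo_len")
    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        oligos.append(seq[start:start + oligo_len])
        if start + oligo_len >= len(seq):
            break
        start += step
    return oligos


if __name__ == "__main__":
    K = 5
    linear = linearize(de_bruijn("ACGU", K), K)
    # Hypothetical design choice: the minimal overlap of k-1 bases.
    oligos = split_with_overlap(linear, oligo_len=120, overlap=K - 1)
    covered = {o[i:i + K] for o in oligos for i in range(len(o) - K + 1)}
    print(len(linear), len(oligos), len(covered))
```

As a linear molecule the construct needs k - 1 = 4 extra bases (1028 in total) so that the wrap-around 5-mers are present. With the minimal 4-base overlap the sketch yields 9 fragments, the last one shorter than 120 nt; choosing a larger overlap increases the fragment count and leaves more margin at the fragment ends.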

Cheers,

Paul