nanoporetech / taiyaki

Training models for basecalling Oxford Nanopore reads
https://nanoporetech.com/

R10 training models #77

Closed nbeckloff closed 3 years ago

nbeckloff commented 4 years ago

Can a user train methylation models with Taiyaki using R10 data? Alternatively, do they need the Guppy methylation models to basecall the data they will use to train Taiyaki?

tmassingham-ont commented 4 years ago

Hello. A modification-aware model is not needed for training, but you do need the following:

The data preparation step of training (prepare_mapped_reads.py) treats modified bases as their canonical equivalents; the modification markup is then used during training.
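As a rough illustration of that idea (a minimal sketch, not Taiyaki's actual code): a marked-up reference can be collapsed to canonical bases for mapping, while the positions of the modified bases are kept for use in training. The letters `Z` and `Y`, the `MOD_TO_CANONICAL` table, and both helper functions are hypothetical names chosen for this example.

```python
# Illustrative only: the modified-base letters and this mapping are
# assumptions made for the example, not taken from Taiyaki's source.
MOD_TO_CANONICAL = {"Z": "C", "Y": "A"}


def to_canonical(seq, mod_to_canonical=MOD_TO_CANONICAL):
    """Replace each modified-base letter with its canonical equivalent."""
    return "".join(mod_to_canonical.get(base, base) for base in seq)


def mod_positions(seq, mod_to_canonical=MOD_TO_CANONICAL):
    """Return (index, letter) for every modified base in the marked-up sequence."""
    return [(i, base) for i, base in enumerate(seq) if base in mod_to_canonical]


marked_up = "ACGZGTYA"
print(to_canonical(marked_up))   # ACGCGTAA
print(mod_positions(marked_up))  # [(3, 'Z'), (6, 'Y')]
```

The canonical string is what a mapping step would see; the position list stands in for the modification markup that training would consume.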

We don't currently distribute an R10 model for mapping with Taiyaki. I shall rectify this oversight.

nbeckloff commented 4 years ago

Hi Tim, Thanks for the reply. A few follow-up questions if you can spare the time:

  1. What file format would a user need for the reference sequence and model you mentioned?
  2. Do you have a potential timeline for the R10 mapping model?

Thanks in advance, Nick

From: Tim Massingham <notifications@github.com>
Sent: Thursday, April 23, 2020 5:38 AM
To: nanoporetech/taiyaki <taiyaki@noreply.github.com>
Cc: Nicholas Beckloff <Nicholas.Beckloff@nanoporetech.com>; Author <author@noreply.github.com>
Subject: Re: [nanoporetech/taiyaki] R10 training models (#77)

cjfields commented 4 years ago

Hi, we have a few projects whose users want methylation basecalling with R10 data, so we are also interested in this. Is there a time frame for it?