nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

21839 as methylation type #938

Closed: osvatic closed this issue 4 months ago

osvatic commented 4 months ago

Issue Report

Please describe the issue:

When using dorado to obtain methylated basecall info, the standard "a" and "c" methylation types show up in the output, but so does "21839".

What is this methylation call?

This "21839" shows up in multiple different genomes.

Below are the commands that were used to obtain the methylation information.

Run environment:

dorado v0.7.2
samtools v1.20
modkit v0.3.1rc1

dorado basecaller sup,4mC_5mC,6mA reads/10N.222.45.E8_N03 --kit-name SQK-RBK110-96 --min-qscore 15 --emit-moves --reference 10N.222.45.E8_N03.fa > modcalls_v7/10N.222.45.E8_N03.bam

samtools sort --write-index -@ 8 -O BAM -o modcalls_v7/10N.222.45.E8_N03.sorted.bam modcalls_v7/10N.222.45.E8_N03.bam

modkit pileup -t 8 --only-tabs modcalls_v7/10N.222.45.E8_N03.sorted.bam modcalls_v7/10N.222.45.E8_N03.pileup.bed

Logs

dorado log: dorado_2205889.txt
bed file (500 line subset): 10N.222.45.E8_N03.subset.pileup.bed.txt

malton-ont commented 4 months ago

Hi @osvatic,

It looks like you are running the 6mA and 4mC_5mC modification models. 21839 is the ChEBI code for 4mC. https://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:21839
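
If it helps when reading the pileup, here is a minimal Python sketch that tallies the modification codes in a bedMethyl file and prints a readable label for each. It assumes the code sits in the fourth tab-separated column, as in modkit pileup output. Only the 21839 -> 4mC mapping comes from this thread; the single-letter entries follow the SAM tags specification as I understand it, and the names `MOD_CODE_LABELS`, `label_mod_code`, and `tally_mod_codes` are hypothetical helpers, not part of dorado or modkit. The example rows are made up for illustration.

```python
from collections import Counter

# Assumed label table. Only "21839" -> 4mC is confirmed in this thread;
# the single-letter codes follow the SAM tags specification (assumption).
MOD_CODE_LABELS = {
    "a": "6mA",
    "m": "5mC",
    "h": "5hmC",
    "21839": "4mC",  # ChEBI ID CHEBI:21839
}


def label_mod_code(code):
    """Return a readable label for a modification code, or flag it as unknown."""
    return MOD_CODE_LABELS.get(code, f"unknown ({code})")


def tally_mod_codes(bed_lines):
    """Count bedMethyl rows per modification label (code is column 4)."""
    counts = Counter()
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed rows
        counts[label_mod_code(fields[3])] += 1
    return counts


# Made-up rows for illustration only; real rows have more columns.
example_rows = [
    "contig_1\t10\t11\t21839\t5\t+",
    "contig_1\t10\t11\tm\t5\t+",
    "contig_1\t42\t43\ta\t7\t-",
]

for label, n in sorted(tally_mod_codes(example_rows).items()):
    print(label, n)
```

To run it on a real file, pass an open file handle instead of the example list, e.g. `tally_mod_codes(open("10N.222.45.E8_N03.pileup.bed"))`.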