nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

Unknown modification variant: '6mA/gpfs2' - Choices: 6mA, 5mC, m6A_DRACH, 5mCG_5hmCG, 5mCG, 5mC_5hmC #767

Closed habibsaky closed 2 months ago

habibsaky commented 2 months ago

Can anyone help me run this command?

(base) [mmolla@dg-gpunode03 ~]$ /gpfs1/home/m/m/mmolla/dorado-0.6.0-linux-x64/bin/dorado basecaller hac,5mCG_5hmCG,5mC_5hmC,6mA/gpfs2 /scratch/mmolla/Pod/ > /gpfs2/scratch/mmolla/HL3/calls.bam
[2024-04-23 17:00:56.188] [info] Running: "basecaller" "hac,5mCG_5hmCG,5mC_5hmC,6mA/gpfs2" "/scratch/mmolla/Pod/"
[2024-04-23 17:00:56.209] [error] Unknown modification variant: '6mA/gpfs2' - Choices: 6mA, 5mC, m6A_DRACH, 5mCG_5hmCG, 5mCG, 5mC_5hmC
[2024-04-23 17:00:56.209] [error] Failed to parse model argument. Failed to parse modified model arguments

(base) [mmolla@dg-gpunode03 ~]$ /gpfs1/home/m/m/mmolla/dorado-0.6.0-linux-x64/bin/dorado basecaller hac,5mCG_5hmCG,5mC_5hmC /gpfs2/scratch/mmolla/Pod/ > /gpfs2/scratch/mmolla/HL3/3calls.bam
[2024-04-23 17:02:19.656] [info] Running: "basecaller" "hac,5mCG_5hmCG,5mC_5hmC" "/gpfs2/scratch/mmolla/Pod/"
[2024-04-23 17:02:19.892] [error] Failed to get modification model
[2024-04-23 17:02:19.892] [info] Found 1 modification models without mods variant: 5mC_5hmC
[2024-04-23 17:02:19.892] [info] - dna_r10.4.1_e8.2_400bps_hac@v4.1.0_5mCG_5hmCG@v2 - mods variant: 5mCG_5hmCG
[2024-04-23 17:02:19.893] [error] No matches for chemistry: dna_r10.4.1_e8.2_400bps_4khz, model_variant: hac, version: v4.1.0, mods_variant: 5mC_5hmC
(base) [mmolla@dg-gpunode03 ~]$

tijyojwad commented 2 months ago

Hi @habibsaky - it looks like your dataset was acquired at a 4 kHz sampling rate, which is why dorado picked the v4.1 model (that's the last published model which supports 4 kHz data). For that model we only support 5mCG_5hmCG mods.

You can find the list of available models and corresponding mods here - https://github.com/nanoporetech/dorado?tab=readme-ov-file#dna-models

habibsaky commented 2 months ago

Thank you for your support.

HalfPhoton commented 2 months ago

@habibsaky, the issue here is that your command is incorrect:

dorado basecaller hac,5mCG_5hmCG,5mC_5hmC,6mA/gpfs2 /scratch/mmolla/Pod/ > /gpfs2/scratch/mmolla/HL3/calls.bam

As the error suggests, 6mA/gpfs2 is not a known modification - you missed a space between 6mA and /gpfs2, so the start of the input path was read as part of the model argument.

I think the command you want is

/gpfs1/home/m/m/mmolla/dorado-0.6.0-linux-x64/bin/dorado basecaller hac,5mCG_5hmCG,5mC_5hmC,6mA /gpfs2/scratch/mmolla/Pod/ > /gpfs2/scratch/mmolla/HL3/calls.bam
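
To illustrate why the missing space produces that error, here is a minimal Python sketch. It is not dorado's actual parser: the function name `unknown_mods` is hypothetical, `KNOWN_MODS` is simply copied from the "Choices" list in the error message above, and `shlex.split` stands in for the shell's word splitting. The output redirect is left out of the example commands for simplicity.

```python
import shlex

# Copied from the "Choices" list in the error message; not read from dorado itself.
KNOWN_MODS = {"6mA", "5mC", "m6A_DRACH", "5mCG_5hmCG", "5mCG", "5mC_5hmC"}


def unknown_mods(command):
    """Return modification names in the model argument that are not in KNOWN_MODS.

    Hypothetical helper: assumes the model argument is the token that
    follows "basecaller" and has the form "<speed>,<mod>,<mod>,...".
    """
    tokens = shlex.split(command)  # split on whitespace, as the shell does
    model_arg = tokens[tokens.index("basecaller") + 1]
    _speed, *mods = model_arg.split(",")
    return [m for m in mods if m not in KNOWN_MODS]


broken = "dorado basecaller hac,5mCG_5hmCG,5mC_5hmC,6mA/gpfs2 /scratch/mmolla/Pod/"
fixed = "dorado basecaller hac,5mCG_5hmCG,5mC_5hmC,6mA /gpfs2/scratch/mmolla/Pod/"

print(unknown_mods(broken))  # ['6mA/gpfs2']
print(unknown_mods(fixed))   # []
```

Without the space, "/gpfs2" becomes part of the last comma-separated entry, which is why the error names '6mA/gpfs2' rather than a missing directory.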

malton-ont commented 2 months ago

5mCG_5hmCG,5mC_5hmC

Note that it is an error to attempt to run multiple modbase models on the same canonical base. Pick either 5mCG_5hmCG or 5mC_5hmC, depending on whether you want to restrict calls to the CG context.
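
As a rough pre-flight check for this rule, the sketch below groups a requested list of mods by canonical base and reports any base that is targeted more than once. It is only an illustration, not how dorado validates its arguments: `CANONICAL_BASE` is an assumed mapping inferred from the modification names (methylated cytosine variants map to C, methylated adenine variants to A), and `conflicting_mods` is a hypothetical helper name.

```python
from collections import defaultdict

# Assumed mapping, inferred from the modification names; not taken from dorado.
CANONICAL_BASE = {
    "6mA": "A",
    "m6A_DRACH": "A",
    "5mC": "C",
    "5mCG": "C",
    "5mCG_5hmCG": "C",
    "5mC_5hmC": "C",
}


def conflicting_mods(mods):
    """Return {base: [mods]} for every canonical base requested more than once."""
    by_base = defaultdict(list)
    for mod in mods:
        by_base[CANONICAL_BASE[mod]].append(mod)
    return {base: names for base, names in by_base.items() if len(names) > 1}


# The combination from the original command: two models on cytosine.
print(conflicting_mods(["5mCG_5hmCG", "5mC_5hmC", "6mA"]))
# {'C': ['5mCG_5hmCG', '5mC_5hmC']}

# One model per canonical base: no conflict reported.
print(conflicting_mods(["5mCG_5hmCG", "6mA"]))
# {}
```

An empty result only means the list does not double up on a base; whether each mod is actually available for a given basecalling model still has to be checked against the model table in the README.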