nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

[error] Maximum number of positional arguments exceeded #745

Closed habibsaky closed 2 months ago

habibsaky commented 2 months ago

I am new to dorado. Can anyone help me solve this error?

(base) [mmolla@node304 ~]$ dorado basecaller /gpfs2/scratch/mmolla/Pod/*.pod5/ dna_r10.4.1_e8.2_400bps_hac@v4.3.0 --modified-bases-models  > /gpfs2/scratch/mmolla/HL3/calls.bam
[2024-04-15 14:00:40.929] [info] Running: "basecaller" "/gpfs2/scratch/mmolla/Pod/*.pod5/" "dna_r10.4.1_e8.2_400bps_hac@v4.3.0" "--modified-bases-models"
[2024-04-15 14:00:40.990] [info] > Creating basecall pipeline
[2024-04-15 14:00:41.000] [error] toml::parse: file open error -> /gpfs2/scratch/mmolla/Pod/*.pod5/config.toml
(base) [mmolla@node304 ~]$ dorado basecaller /gpfs2/scratch/mmolla/Pod/*.pod5 dna_r10.4.1_e8.2_400bps_hac@v4.3.0 --modified-bases-models  > /gpfs2/scratch/mmolla/HL3/calls.bam
[2024-04-15 14:01:12.049] [info] Running: "basecaller" "/gpfs2/scratch/mmolla/Pod/FAV25444_pass_f12a207a_aff71dca_0.pod5" "/gpfs2/scratch/mmolla/Pod/FAV25444_pass_f12a207a_aff71dca_100.pod5" "/gpfs2/scratch/mmolla/Pod/FAV25444_pass_f12a207a_aff71dca_101.pod5" "/gpfs2/scratch/mmolla/Pod/FAV25444_pass_f12a207a_aff71dca_102.p
[error] Maximum number of positional arguments exceeded

iiSeymour commented 2 months ago

@habibsaky you need to pass the basecaller model before the pod5 directory and also provide the modified bases model you want to call, like so:

dorado basecaller dna_r10.4.1_e8.2_400bps_hac@v4.3.0 /gpfs2/scratch/mmolla/Pod/ --modified-bases 5mCG_5hmCG > /gpfs2/scratch/mmolla/HL3/calls.bam

or more simply you can do:

dorado basecaller hac,5mCG_5hmCG /gpfs2/scratch/mmolla/Pod/ > /gpfs2/scratch/mmolla/HL3/calls.bam

see https://github.com/nanoporetech/dorado?tab=readme-ov-file#automatic-model-selection-complex
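
For context on the original error: the second Running line in the log shows one quoted path per pod5 file, which suggests the shell expanded the unquoted *.pod5 glob before dorado ever saw it, so dorado received far more positional arguments than it accepts. The sketch below illustrates that behaviour with a throwaway directory of empty stand-in files. The count_args helper and the file names are hypothetical, and default bash globbing (no nullglob) is assumed.

```shell
# Demonstration only: a temporary directory with empty stand-in files.
demo_dir="$(mktemp -d)"
touch "$demo_dir/a.pod5" "$demo_dir/b.pod5" "$demo_dir/c.pod5"

# Hypothetical helper: print how many arguments were received.
count_args() { echo "$#"; }

# Unquoted glob: the shell expands it to one argument per matching file.
files_count="$(count_args "$demo_dir"/*.pod5)"

# Plain directory path: a single argument, however many files it holds.
dir_count="$(count_args "$demo_dir/")"

# Glob with a trailing slash only matches directories. Nothing matches here,
# so the pattern is passed through literally as one argument.
literal_count="$(count_args "$demo_dir"/*.pod5/)"

echo "per-file glob: $files_count argument(s)"
echo "directory:     $dir_count argument(s)"
echo "glob + slash:  $literal_count argument(s)"

rm -rf "$demo_dir"
```

This matches the two failed attempts in the log: with the trailing slash the literal pattern reached dorado as a single argument, and without it every file became its own argument. Passing the directory itself avoids both cases.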

habibsaky commented 2 months ago

I ran this and got another error:

(base) [mmolla@node300 ~]$ dorado basecaller dna_r10.4.1_e8.2_400bps_hac@v4.3.0 /gpfs2/scratch/mmolla/Pod/*.pod5/  --modified-bases 5mCG_5hmCG > /gpfs2/scratch/mmolla/HL3/calls.bam
[2024-04-15 15:24:43.129] [info] Running: "basecaller" "dna_r10.4.1_e8.2_400bps_hac@v4.3.0" "/gpfs2/scratch/mmolla/Pod/*.pod5/" "--modified-bases" "5mCG_5hmCG"
terminate called after throwing an instance of 'std::runtime_error'
  what():  Cannot find modification model for '5mCG_5hmCG' reason: simplex model doesn't exist at: dna_r10.4.1_e8.2_400bps_hac@v4.3.0
Aborted
(base) [mmolla@node300 ~]$

iiSeymour commented 2 months ago

You need to have downloaded dna_r10.4.1_e8.2_400bps_hac@v4.3.0 and be in the same directory as the download, see https://github.com/nanoporetech/dorado?tab=readme-ov-file#available-basecalling-models

If you have downloaded the model into a different directory, then you'll need to provide the full path: dorado basecaller /path/to/model/dna_r10.4.1_e8.2_400bps_hac@v4.3.0 ....

Or simply use the model selection complex, which will handle this automatically for you:

dorado basecaller hac,5mCG_5hmCG /gpfs2/scratch/mmolla/Pod/ > /gpfs2/scratch/mmolla/HL3/calls.bam
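
If you prefer to keep naming the model explicitly, a small pre-flight check can catch the "simplex model doesn't exist" case before a job is submitted. The sketch below is only an illustration: resolve_model is a hypothetical helper, not part of dorado, and the optional second argument stands for whatever directory you downloaded models into. It assumes a downloaded model is a directory, which is consistent with the earlier log line where dorado looked for config.toml inside the path it took to be the model.

```shell
# Hypothetical helper (not part of dorado): print a usable model path,
# or fail with a message on stderr.
#   $1 = model name or path
#   $2 = optional directory holding downloaded models (an assumption)
resolve_model() {
    model="$1"
    models_dir="${2:-}"
    if [ -d "$model" ]; then
        # Found relative to the current directory, or given as a full path.
        echo "$model"
    elif [ -n "$models_dir" ] && [ -d "$models_dir/$model" ]; then
        # Found inside the separate models directory.
        echo "$models_dir/$model"
    else
        echo "model directory not found: $model" >&2
        return 1
    fi
}

# Example use (commented out so the sketch runs without dorado installed):
# model_path="$(resolve_model dna_r10.4.1_e8.2_400bps_hac@v4.3.0 /path/to/model)" &&
#     dorado basecaller "$model_path" /gpfs2/scratch/mmolla/Pod/ > /gpfs2/scratch/mmolla/HL3/calls.bam
```

The model selection complex shown above makes this check unnecessary, since dorado then locates the model itself.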

habibsaky commented 2 months ago

Now it shows the following error:

[2024-04-16 13:12:16.686] [info] Running: "basecaller" "dna_r10.4.1_e8.2_400bps_hac@v4.3.0" "/gpfs2/scratch/mmolla/Pod/"
[2024-04-16 13:12:16.743] [info] > Creating basecall pipeline
[2024-04-16 13:12:16.826] [error] Sample rate for model (5000) and data (4000) are not compatible.
(base) [mmolla@node302 ~]$

iiSeymour commented 2 months ago

v4.3.0 is a 5 kHz model and you have 4 kHz reads; see https://github.com/nanoporetech/dorado?tab=readme-ov-file#dna-models

You'll need to use v4.1.0 or, better yet, let dorado handle it with:

dorado basecaller hac,5mCG_5hmCG /gpfs2/scratch/mmolla/Pod/ > /gpfs2/scratch/mmolla/HL3/calls.bam
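
To make the version choice explicit, here is a rough sketch of the mapping described in this comment: 5000 Hz data goes with v4.3.0 and 4000 Hz data with v4.1.0. The function hac_model_for_rate is a hypothetical helper and covers only the two hac versions named in this thread; the DNA models table linked above is the authoritative list.

```shell
# Hypothetical helper: map a sample rate in Hz to the hac model version
# mentioned in this thread (5000 -> v4.3.0, 4000 -> v4.1.0).
hac_model_for_rate() {
    case "$1" in
        5000) echo "dna_r10.4.1_e8.2_400bps_hac@v4.3.0" ;;
        4000) echo "dna_r10.4.1_e8.2_400bps_hac@v4.1.0" ;;
        *)    echo "unsupported sample rate: $1" >&2; return 1 ;;
    esac
}

# The reads in this issue were reported as 4000 Hz.
hac_model_for_rate 4000
# prints: dna_r10.4.1_e8.2_400bps_hac@v4.1.0
```

With hac,5mCG_5hmCG none of this is needed, because dorado picks a compatible model version from the data.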