simondrue commented 2 months ago

Hi Nanopore team,

I noticed a significant (~10x) slowdown of the Dorado basecaller when adding the 6mA modification model. This is notable because adding the 5mCG_5hmCG modification model has almost no impact on basecalling speed.

Is this expected behavior? If so, are there any plans to optimize the speed of the 6mA model?

Results from a small benchmark of basecalling speeds (in samples/s, taken from the logs below):

5mCG_5hmCG@v1 + 6mA@v2: 2.875147e+06
5mCG_5hmCG@v1: 1.816510e+07
6mA@v2: 1.598966e+06
6mA@v1: 1.628905e+06
No mods: 1.869901e+07
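
To put these figures on a common scale, the slowdown of each configuration relative to the no-mods run can be computed directly from the reported samples/s. A minimal sketch in Python; the dictionary and the `slowdown` helper are illustrative names, not part of Dorado:

```python
# Throughput in samples/s, copied from the benchmark above.
speeds = {
    "5mCG_5hmCG@v1 + 6mA@v2": 2.875147e06,
    "5mCG_5hmCG@v1": 1.816510e07,
    "6mA@v2": 1.598966e06,
    "6mA@v1": 1.628905e06,
    "No mods": 1.869901e07,
}


def slowdown(speeds, baseline="No mods"):
    """Return baseline throughput divided by each configuration's throughput."""
    base = speeds[baseline]
    return {name: base / value for name, value in speeds.items()}


for name, factor in slowdown(speeds).items():
    print(f"{name}: {factor:.1f}x slower than no mods")
```

With these numbers the 6mA-only runs come out at roughly 11 to 12x slower than no mods, the combined run at about 6.5x, and 5mCG_5hmCG alone at about 1.03x.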

Thanks for a great tool. Looking forward to seeing where the project is going 🚀

Run environment:

Dorado version: v0.6.0
Dorado command: basecaller
Operating system: Linux
Hardware:
- GPU: NVIDIA V100 16GB
- CPU: Intel “Skylake” Gold 6140 @ 2.30GHz, 18 cores/CPU
- Source data type: pod5 (input directory /dev/shm/pod5s/)

Logs

5mCG_5hmCG@v1 + 6mA@v2

[2024-04-10 13:40:56.130] [info] Running: "basecaller" "--no-trim" "--modified-bases-models" "/faststorage/project/MomaReference/BACKUP/nanopore/models/dorado_models/dna_r10.4.1_e8.2_400bps_hac@v4.3.0_5mCG_5hmCG@v1,/faststorage/project/MomaReference/BACKUP/nanopore/models/dorado_models/dna_r10.4.1_e8.2_400bps_hac@v4.3.0_6mA@v2" "/faststorage/project/MomaReference/BACKUP/nanopore/models/dorado_models/dna_r10.4.1_e8.2_400bps_hac@v4.3.0" "/dev/shm/pod5s/"
[2024-04-10 13:40:56.292] [info] > Creating basecall pipeline
[2024-04-10 13:41:42.105] [info] cuda:0 using chunk size 9996, batch size 2304
[2024-04-10 13:41:43.233] [info] cuda:0 using chunk size 4998, batch size 3328
[2024-04-10 13:46:59.681] [info] > Simplex reads basecalled: 19997
[2024-04-10 13:46:59.681] [info] > Simplex reads filtered: 3
[2024-04-10 13:46:59.681] [info] > Basecalled @ Samples/s: 2.875147e+06
[2024-04-10 13:46:59.694] [info] > Finished

5mCG_5hmCG@v1 only

[2024-04-10 13:22:37.532] [info] Running: "basecaller" "--no-trim" "--modified-bases-models" "/faststorage/project/MomaReference/BACKUP/nanopore/models/dorado_models/dna_r10.4.1_e8.2_400bps_hac@v4.3.0_5mCG_5hmCG@v1" "/faststorage/project/MomaReference/BACKUP/nanopore/models/dorado_models/dna_r10.4.1_e8.2_400bps_hac@v4.3.0" "/dev/shm/pod5s/"
[2024-04-10 13:22:37.595] [info] > Creating basecall pipeline
[2024-04-10 13:23:07.830] [info] cuda:0 using chunk size 9996, batch size 2304
[2024-04-10 13:23:09.229] [info] cuda:0 using chunk size 4998, batch size 3328
[2024-04-10 13:24:00.319] [info] > Simplex reads basecalled: 19997
[2024-04-10 13:24:00.319] [info] > Simplex reads filtered: 3
[2024-04-10 13:24:00.319] [info] > Basecalled @ Samples/s: 1.816510e+07
[2024-04-10 13:24:00.326] [info] > Finished

6mA@v2 only

[2024-04-10 13:29:43.608] [info] Running: "basecaller" "--no-trim" "--modified-bases-models" "/faststorage/project/MomaReference/BACKUP/nanopore/models/dorado_models/dna_r10.4.1_e8.2_400bps_hac@v4.3.0_6mA@v2" "/faststorage/project/MomaReference/BACKUP/nanopore/models/dorado_models/dna_r10.4.1_e8.2_400bps_hac@v4.3.0" "/dev/shm/pod5s/"
[2024-04-10 13:29:43.651] [info] > Creating basecall pipeline
[2024-04-10 13:30:13.930] [info] cuda:0 using chunk size 9996, batch size 2304
[2024-04-10 13:30:14.863] [info] cuda:0 using chunk size 4998, batch size 3328
[2024-04-10 13:39:42.926] [info] > Simplex reads basecalled: 19997
[2024-04-10 13:39:42.926] [info] > Simplex reads filtered: 3
[2024-04-10 13:39:42.926] [info] > Basecalled @ Samples/s: 1.598966e+06
[2024-04-10 13:39:42.932] [info] > Finished

6mA@v1 only

[2024-04-10 13:58:07.035] [info] Running: "basecaller" "--no-trim" "--modified-bases-models" "/faststorage/project/MomaReference/BACKUP/nanopore/models/dorado_models/dna_r10.4.1_e8.2_400bps_hac@v4.3.0_6mA@v1" "/faststorage/project/MomaReference/BACKUP/nanopore/models/dorado_models/dna_r10.4.1_e8.2_400bps_hac@v4.3.0" "/dev/shm/pod5s/"
[2024-04-10 13:58:07.096] [info] > Creating basecall pipeline
[2024-04-10 13:58:42.809] [info] cuda:0 using chunk size 9996, batch size 2304
[2024-04-10 13:58:43.992] [info] cuda:0 using chunk size 4998, batch size 3328
[2024-04-10 14:08:01.642] [info] > Simplex reads basecalled: 19997
[2024-04-10 14:08:01.642] [info] > Simplex reads filtered: 3
[2024-04-10 14:08:01.642] [info] > Basecalled @ Samples/s: 1.628905e+06
[2024-04-10 14:08:01.649] [info] > Finished

No mods

[2024-04-10 10:03:54.462] [info] Running: "basecaller" "--no-trim" "/faststorage/project/MomaReference/BACKUP/nanopore/models/dorado_models/dna_r10.4.1_e8.2_400bps_hac@v4.3.0" "/dev/shm/pod5s/"
[2024-04-10 10:03:54.573] [info] > Creating basecall pipeline
[2024-04-10 10:04:28.276] [info] cuda:0 using chunk size 9996, batch size 2304
[2024-04-10 10:04:29.366] [info] cuda:0 using chunk size 4998, batch size 3264
[2024-04-10 10:05:19.109] [info] > Simplex reads basecalled: 19997
[2024-04-10 10:05:19.112] [info] > Simplex reads filtered: 3
[2024-04-10 10:05:19.115] [info] > Basecalled @ Samples/s: 1.869901e+07
[2024-04-10 10:05:19.121] [info] > Finished

vellamike commented 2 months ago

Hi @simondrue - your benchmark showing that 5mCG_5hmCG@v1 + 6mA@v2 is faster than 6mA@v2 only is a bit surprising - could you repeat this a few times and verify that your benchmarks are not noisy?
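
One way to check for noise is to run each configuration several times, keep the logs, and compare the spread of the reported throughput. A rough sketch, assuming each run's log output is saved as text and contains the `Basecalled @ Samples/s:` line shown in the logs above; `parse_speed` and `summarize` are names made up for illustration:

```python
import re
import statistics

# Matches the throughput line printed at the end of a run, e.g.
# "[...] [info] > Basecalled @ Samples/s: 1.598966e+06"
SPEED_RE = re.compile(
    r"Basecalled @ Samples/s:\s*([0-9.]+(?:e[+-]?[0-9]+)?)", re.IGNORECASE
)


def parse_speed(log_text):
    """Return the samples/s value from one run's log, or None if absent."""
    match = SPEED_RE.search(log_text)
    return float(match.group(1)) if match else None


def summarize(speeds):
    """Return mean, sample standard deviation and coefficient of variation."""
    mean = statistics.mean(speeds)
    stdev = statistics.stdev(speeds) if len(speeds) > 1 else 0.0
    return {"mean": mean, "stdev": stdev, "cv": stdev / mean}


# Example with one log line taken from the issue.
line = "[2024-04-10 13:39:42.926] [info] > Basecalled @ Samples/s: 1.598966e+06"
print(parse_speed(line))
```

If the coefficient of variation across replicates is small compared with the gap between configurations, the ordering is unlikely to be an artifact of noise.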

ymcki commented 2 months ago

I created a small dataset of four pod5 files of about 2GB each and ran it on 4xA100 with the 4.3.0 sup model.

5mCG_5hmCG only: 3m49.244s
5mC_5hmC only: 6m21.693s
6mA only: 9m33.237s
5mCG_5hmCG + 6mA: 9m49.589s
5mC_5hmC + 6mA: 11m13.113s
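
For comparison with the samples/s figures earlier in the thread, these wall-clock times can be converted to seconds and expressed relative to the fastest run. A small sketch; `to_seconds` is an illustrative helper that assumes the `XmY.ZZZs` format seen here (which looks like the output of the shell's `time`) and accepts either `.` or `,` as the decimal separator:

```python
import re

# Wall-clock times copied from the comment above.
times = {
    "5mCG_5hmCG only": "3m49.244s",
    "5mC_5hmC only": "6m21.693s",
    "6mA only": "9m33.237s",
    "5mCG_5hmCG + 6mA": "9m49.589s",
    "5mC_5hmC + 6mA": "11m13.113s",
}


def to_seconds(text):
    """Convert a string such as '9m33.237s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)[.,](\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


seconds = {name: to_seconds(value) for name, value in times.items()}
fastest = min(seconds.values())
for name, value in seconds.items():
    print(f"{name}: {value:.1f} s ({value / fastest:.2f}x)")
```

On these numbers 6mA alone takes about 2.5x as long as 5mCG_5hmCG alone, a much smaller gap than the roughly 10x reported above for the V100 with the hac model.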

My times seem quite normal. Is this within expectation?

simondrue commented 1 month ago

Hi,

I expanded my benchmark and used Dorado v0.7.0 with the new v5 models for both HAC and SUP, testing all available modifications (one at a time, no combinations) with 5 replicates each and --max-reads 150000. The system is the same as stated above, and the data is from a cfDNA sample.

I still see the significant slowdown for the 6mA model, even compared to the other all-context models. To verify that the sample is not enriched in A, here is its base composition:

A: 10,143,093 bases (27.38%)
T: 11,088,922 bases (29.94%)
C: 7,874,313 bases (21.26%)
G: 7,937,969 bases (21.42%)

The data behind the plots: speed_data.csv

Sorry for the late reply

/Simon

nanoporetech / dorado

Significant slowdown when adding mA modification model #737