morispi / CONSENT

Scalable long read self-correction and assembly polishing with multiple sequence alignment
https://doi.org/10.1038/s41598-020-80757-5
GNU Affero General Public License v3.0
55 stars, 4 forks

CONSENT-correct: line 188 #28

Closed: jkbenotmane closed this issue 3 years ago

jkbenotmane commented 3 years ago

When correcting a sample of sup (super-high-accuracy) basecalled sequences (2M), I ran into this problem:

/mnt/d/Dropbox/User/Projects/Sequencing/TCR_Analysis/TCR_Analysis_Tools/CONSENT/CONSENT-correct --in /mnt/d/Dropbox/User/Sequencing_Data/TCR/Samp9/Samp9_sup_basecalled/Sup_Sample/Samp9_sample_merge.fastq --out /mnt/d/Dropbox/User/Projects/Sequencing/Samples/SpaTCR/Samp9/CONSENT/Sup_BC_Sample/Samp9_sample_merge.fasta --type ONT
[Thu Jul 8 14:14:19 CEST 2021] Overlapping the long reads (minimap2)
/mnt/d/Dropbox/User/Projects/Sequencing/TCR_Analysis/TCR_Analysis_Tools/CONSENT/CONSENT-correct: line 188: 1581 Killed minimap2 -k15 -w5 -m100 -g10000 -r2000 --max-chain-skip 25 --dual=yes -PD --no-long-join -t"$nproc" -I"$minimapMemory" "$reads" "$reads" > $tmpdir/"$alignments" 2> $tmpdir/"$minimapErrlog"

Do I understand correctly that minimap2 is running out of RAM? Is there any way to limit minimap2's RAM consumption, or to work around this?

morispi commented 3 years ago

Hello,

It does seem that minimap2 ran into an issue. Could you attach the minimapErrlog file and show me the first few lines of the alignments file, if there are any?

Thanks, Pierre

jkbenotmane commented 3 years ago

Hello, thank you for the quick response!

This is the content of the minimap_1575 file:
[M::mm_idx_gen::20.476*1.98] collected minimizers
[M::mm_idx_gen::26.995*3.02] sorted minimizers
[M::main::26.995*3.02] loaded/built the index for 1144794 target sequence(s)
[M::mm_mapopt_update::27.532*2.98] mid_occ = 2893
[M::mm_idx_stat] kmer size: 15; skip: 5; is_hpc: 0; #seq: 1144794
[M::mm_idx_stat::27.898*2.95] distinct minimizers: 32479077 (65.96% are singletons); average occurrences: 11.143; average spacing: 2.763; total length: 1000008845

These are the first 50 lines of Alignments_1575.paf:
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 664 - 949f1232-4306-4c63-863d-ca52193a123c 717 81 632 533 553 0 tp:A:S cm:i:161 s1:i:532 dv:f:0.0120 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 664 - 86dd2ecb-db0a-4716-ad16-078ca9164974 933 258 806 525 553 0 tp:A:S cm:i:156 s1:i:524 dv:f:0.0141 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 116 664 + 66a12aea-a88e-4486-b61d-a4460fd0e5d2 926 122 667 523 550 0 tp:A:S cm:i:154 s1:i:521 dv:f:0.0146 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 123 664 - 7edc659f-dec1-4ce6-946d-261977160f8d 870 206 744 518 544 0 tp:A:S cm:i:155 s1:i:517 dv:f:0.0131 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 664 + 79c48785-9e06-498d-95cc-40ca6be4f83f 2862 1189 1741 508 555 0 tp:A:S cm:i:145 s1:i:507 dv:f:0.0189 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 614 + ccf66b88-7176-4ea1-b889-086cbfff8fbf 685 144 645 486 503 0 tp:A:S cm:i:146 s1:i:486 dv:f:0.0128 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 645 + 7e7595a1-4dc4-4e04-9908-75160613e938 722 140 675 484 538 0 tp:A:S cm:i:132 s1:i:482 dv:f:0.0230 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 664 + 5ddcf135-8f5f-4675-a4a7-14e431ad0f56 1633 127 671 478 552 0 tp:A:S cm:i:129 s1:i:475 dv:f:0.0265 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 654 + 888b641c-50f5-4efc-88df-b439b319806a 680 101 639 462 542 0 tp:A:S cm:i:141 s1:i:461 dv:f:0.0197 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 593 + 6c96ac85-063c-4f04-bb9e-175a184c39e8 1499 980 1457 462 481 0 tp:A:S cm:i:149 s1:i:461 dv:f:0.0091 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 614 + c01ed3ec-aff9-4342-9b50-93b4979247e4 1346 146 639 459 502 0 tp:A:S cm:i:132 s1:i:456 dv:f:0.0194 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 160 664 + c9f0ccea-bbed-4ecc-9e85-18b812926d0e 1491 731 1230 457 505 0 tp:A:S cm:i:133 s1:i:455 dv:f:0.0192 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 634 + 7604a85b-31f9-4b08-b1c5-5b7e9861e9b0 847 152 673 450 525 0 tp:A:S cm:i:130 s1:i:448 dv:f:0.0229 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 530 + 27a84756-7858-4a1d-acbf-91119c407ff9 624 163 580 399 418 0 tp:A:S cm:i:126 s1:i:399 dv:f:0.0111 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 125 664 + f0ac5142-6566-4889-9d93-946087310c35 974 109 637 402 542 0 tp:A:S cm:i:105 s1:i:398 dv:f:0.0381 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 503 + 0da10a69-7a05-4ff4-b5e4-03a3d71d36a6 552 120 509 382 391 0 tp:A:S cm:i:121 s1:i:381 dv:f:0.0087 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 532 - 6105fa78-bb49-47ff-859a-57274e69f261 631 84 504 378 424 0 tp:A:S cm:i:116 s1:i:376 dv:f:0.0170 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 503 - 8d145d23-3da0-40bc-9f95-8a13e6e8a5bf 608 61 452 362 392 0 tp:A:S cm:i:101 s1:i:362 dv:f:0.0206 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 565 + e0cbd547-5fa2-411f-adfc-8b37f7540926 675 160 611 341 460 0 tp:A:S cm:i:83 s1:i:337 dv:f:0.0436 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 530 - 8fe79267-7f31-4356-a55d-87b7306a2357 649 62 476 327 420 0 tp:A:S cm:i:98 s1:i:325 dv:f:0.0275 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 449 + 2accc8d0-a13b-4574-8e5b-9819e28ed49c 494 104 440 323 337 0 tp:A:S cm:i:102 s1:i:323 dv:f:0.0119 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 439 + f4787b58-fcbf-4dc4-9ac9-7504d139a677 538 162 490 323 329 0 tp:A:S cm:i:105 s1:i:322 dv:f:0.0083 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 473 + 4e5b5396-e09f-4372-bc1a-2d09b1f997c4 612 209 563 303 360 0 tp:A:S cm:i:82 s1:i:301 dv:f:0.0298 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 439 + ee01118a-cab2-45e2-b318-befff9606a04 508 164 488 299 328 0 tp:A:S cm:i:82 s1:i:298 dv:f:0.0245 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 460 - 4c8cf545-ac9c-4b08-8a44-1b44d169d4b1 519 71 410 298 348 0 tp:A:S cm:i:81 s1:i:295 dv:f:0.0285 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 408 - 4c324ed4-715f-48d0-a334-e61f98788e86 812 399 693 281 296 0 tp:A:S cm:i:89 s1:i:281 dv:f:0.0134 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 468 + 3dc01864-5b47-477b-be70-3cb7e79aacc6 1434 102 461 273 362 0 tp:A:S cm:i:63 s1:i:270 dv:f:0.0462 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 376 + b32fd230-c5f8-48a1-81b6-19fe023ea161 1247 964 1225 245 263 0 tp:A:S cm:i:79 s1:i:244 dv:f:0.0115 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 123 376 - 315bc570-4639-4e13-a24b-6a5d7dbd30e0 1035 62 311 241 255 0 tp:A:S cm:i:67 s1:i:240 dv:f:0.0195 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 365 + 9d0c7413-7425-46ee-8b4e-a2d0ffff5893 1129 151 400 218 255 0 tp:A:S cm:i:43 s1:i:216 dv:f:0.0473 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 116 337 - eece825e-9664-4777-8ec0-4c057dd0fd1c 440 55 275 194 223 0 tp:A:S cm:i:52 s1:i:193 dv:f:0.0275 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 309 - 510e31b9-b341-4452-ae01-debb9558f859 356 57 250 191 197 0 tp:A:S cm:i:57 s1:i:190 dv:f:0.0136 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 335 + 5d9acf04-123e-4ac7-888e-25389358dcfa 411 156 377 185 223 0 tp:A:S cm:i:53 s1:i:184 dv:f:0.0263 rl:i:118
e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 116 337 + cc0a8b4e-0b78-4093-9a24-8b84c018489d 934 105 325 184 223 0 tp:A:S cm:i:46 s1:i:183 dv:f:0.0354 rl:i:118
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 69 1074 - a30a58b6-c3d6-4bc6-a4fa-4090a082faca 1095 56 1068 195 1012 0 tp:A:S cm:i:35 s1:i:194 dv:f:0.0768 rl:i:898
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 71 1074 + 19f7f4ef-2a42-440f-87fd-61c4912ccad4 1118 78 1085 193 1008 0 tp:A:S cm:i:37 s1:i:192 dv:f:0.0728 rl:i:898
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 69 1074 - 8c2d7c6d-c521-4329-aa7e-cc216620dcd6 1103 52 1054 192 1015 0 tp:A:S cm:i:38 s1:i:186 dv:f:0.0717 rl:i:898
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 69 1074 + 30a220d2-37f6-486e-9163-76d0ad31df00 1101 68 1067 187 1008 0 tp:A:S cm:i:33 s1:i:185 dv:f:0.0804 rl:i:898
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 69 1074 - f2067a54-af68-45a5-a7ed-78a668e6b200 1107 53 1060 187 1014 0 tp:A:S cm:i:31 s1:i:184 dv:f:0.0842 rl:i:898
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 69 1074 + 959a1bd9-1df7-42be-88b7-f0459a6de002 1980 961 1966 189 1013 0 tp:A:S cm:i:36 s1:i:184 dv:f:0.0745 rl:i:898
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 69 1074 - b7a6ef7e-acac-44e3-9ee3-d01cdcc87444 1075 55 1029 190 1007 0 tp:A:S cm:i:34 s1:i:183 dv:f:0.0786 rl:i:898
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 69 1074 + df417d68-98ac-459d-ad20-883d88f51cbf 1101 63 1071 187 1013 0 tp:A:S cm:i:42 s1:i:183 dv:f:0.0655 rl:i:898
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 69 1070 - ab68bd25-4b3c-44c8-87f9-2d4591b9278a 1100 60 1055 185 1007 0 tp:A:S cm:i:36 s1:i:181 dv:f:0.0745 rl:i:898
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 69 1074 + f020e88e-bffb-4306-b64c-628d4f18c3e2 1112 65 1077 184 1018 0 tp:A:S cm:i:38 s1:i:179 dv:f:0.0717 rl:i:898
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 71 1074 - 26726fbe-eb9a-411d-9baa-03ce44213395 1097 54 1050 185 1014 0 tp:A:S cm:i:29 s1:i:178 dv:f:0.0878 rl:i:898
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 69 1074 - 80ced9f8-62c8-416e-bfac-b33c76221054 1109 51 1063 179 1014 0 tp:A:S cm:i:34 s1:i:177 dv:f:0.0786 rl:i:898
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 71 1074 - 13e37d29-db92-4568-af8b-c3518662ddc3 1100 53 1053 179 1009 0 tp:A:S cm:i:33 s1:i:176 dv:f:0.0799 rl:i:898
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 69 1074 - 26399466-c49b-4cb4-bd84-022586490eac 1919 52 1039 182 1010 0 tp:A:S cm:i:37 s1:i:176 dv:f:0.0733 rl:i:898
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 69 1074 - 2746c0c0-af26-4a58-be57-2f9963ec920d 1700 49 1043 180 1010 0 tp:A:S cm:i:34 s1:i:174 dv:f:0.0786 rl:i:898
4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 161 1074 + 433e1552-388e-4164-8a3c-559348da6ac0 1078 125 1042 175 919 0 tp:A:S cm:i:35 s1:i:174 dv:f:0.0752 rl:i:898

Thank you! Jasim

morispi commented 3 years ago

Okay, some alignments were reported. However, I don't see any error in the error log of minimap2. :(
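For anyone wanting to inspect such a PAF excerpt programmatically, here is a minimal Python sketch. It relies only on the standard PAF layout (twelve mandatory whitespace-separated columns, followed by optional tags that are ignored here). The names `PafRecord`, `parse_paf_line` and `overlaps_per_query` are made up for this illustration and are not part of CONSENT or minimap2.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class PafRecord(NamedTuple):
    """The 12 mandatory PAF columns (hypothetical helper type)."""
    query: str
    query_len: int
    query_start: int
    query_end: int
    strand: str
    target: str
    target_len: int
    target_start: int
    target_end: int
    matches: int      # number of matching residues
    block_len: int    # alignment block length
    mapq: int


def parse_paf_line(line: str) -> PafRecord:
    """Parse one PAF line; optional tags after column 12 are dropped."""
    f = line.split()
    return PafRecord(f[0], int(f[1]), int(f[2]), int(f[3]), f[4],
                     f[5], int(f[6]), int(f[7]), int(f[8]),
                     int(f[9]), int(f[10]), int(f[11]))


def overlaps_per_query(lines: Iterable[str]) -> Counter:
    """Count how many overlaps each query read has."""
    counts: Counter = Counter()
    for line in lines:
        if line.strip():  # skip blank lines
            counts[parse_paf_line(line).query] += 1
    return counts


# Three records copied from the excerpt posted in this thread.
example = [
    "e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 664 - 949f1232-4306-4c63-863d-ca52193a123c 717 81 632 533 553 0 tp:A:S cm:i:161 s1:i:532 dv:f:0.0120 rl:i:118",
    "e9aab3a9-cf5b-4817-b306-b20b0ae71729 703 113 664 - 86dd2ecb-db0a-4716-ad16-078ca9164974 933 258 806 525 553 0 tp:A:S cm:i:156 s1:i:524 dv:f:0.0141 rl:i:118",
    "4466ecd7-ffa1-44f6-bc67-0b6f6127d64d 1108 69 1074 - a30a58b6-c3d6-4bc6-a4fa-4090a082faca 1095 56 1068 195 1012 0 tp:A:S cm:i:35 s1:i:194 dv:f:0.0768 rl:i:898",
]
counts = overlaps_per_query(example)
```

Run over a whole file (one record per line), the same counter gives a rough idea of how many overlaps each read accumulates, which is what drives the size of the PAF output.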

Maybe you could try reducing CONSENT's --minimapIndex parameter. It is set to 500M by default; you could try setting it to 250M and see how it goes.

Also, maybe you could try to run minimap2 with default parameters on your reads?
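As a rough illustration of the first suggestion, the rerun with a smaller index could be scripted along these lines. This is only a sketch: the flags `--in`, `--out`, `--type` and `--minimapIndex` are the ones mentioned in this thread, while the function name and the file paths are placeholders invented for the example.

```python
import shlex
from typing import List


def build_consent_command(consent_path: str, reads: str, out: str,
                          read_type: str = "ONT",
                          minimap_index: str = "250M") -> List[str]:
    """Assemble the CONSENT-correct argument list (hypothetical helper)."""
    return [
        consent_path,
        "--in", reads,
        "--out", out,
        "--type", read_type,
        "--minimapIndex", minimap_index,  # 500M is the default per this thread
    ]


# Placeholder paths, for illustration only.
cmd = build_consent_command("./CONSENT-correct", "reads.fastq", "corrected.fasta")
print(shlex.join(cmd))

# To actually launch it:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with long paths such as the ones in the original command.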

jkbenotmane commented 3 years ago

Thank you, reducing the index to 250M worked.

At least I guess it did, because minimap2 wrote a humongous .paf from a 42 GB fastq.

But it crashed unexpectedly: it was unable to keep writing the file once it reached around 2 TB (though there was still storage remaining).

Do you think chunking the fastq in a simple round-robin way would make it work?
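A round-robin split of the kind asked about here could be sketched in Python as follows. It assumes strict four-line FASTQ records (no wrapped sequence lines), and the function name and output naming scheme are invented for the illustration. Note that reads placed in different chunks are never overlapped against each other, so each read sees fewer overlaps than in an all-vs-all run on the full file.

```python
import itertools
from contextlib import ExitStack
from pathlib import Path
from typing import List


def split_fastq_round_robin(fastq_path: str, n_chunks: int,
                            out_prefix: str) -> List[Path]:
    """Distribute FASTQ records over n_chunks files in round-robin order.

    Assumes each record is exactly 4 lines (header, sequence, '+', quality).
    Output files are named '<out_prefix>.chunk<i>.fastq' (hypothetical scheme).
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    out_paths = [Path(f"{out_prefix}.chunk{i}.fastq") for i in range(n_chunks)]
    with ExitStack() as stack:
        src = stack.enter_context(open(fastq_path))
        outs = [stack.enter_context(open(p, "w")) for p in out_paths]
        for index in itertools.count():
            record = list(itertools.islice(src, 4))  # next 4 lines, if any
            if not record:
                break  # end of input
            if len(record) != 4:
                raise ValueError("truncated FASTQ record at end of file")
            outs[index % n_chunks].writelines(record)
    return out_paths
```

Each chunk could then be corrected separately, for example `split_fastq_round_robin("reads.fastq", 4, "reads")` followed by one CONSENT-correct run per chunk. Because the file is streamed record by record, memory use stays small even for a 42 GB input.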

morispi commented 3 years ago

Hi,

Sorry for not coming back to you earlier.

Yes, minimap2 usually generates pretty huge PAF files when run with the CONSENT parameters; this is normal behaviour. I do believe chunking the fastq could work, but it would probably affect the results a bit.

I see you closed the issue so I guess you ended up with a satisfying fix?

Best, Pierre

jkbenotmane commented 3 years ago

Hi Pierre, sorry for the late response.

I could not find a satisfying solution: the file size simply exceeded the capacity of every hard drive I could access, so I could never really evaluate the result of CONSENT.