chhylp123 / hifiasm

Hifiasm: a haplotype-resolved assembler for accurate Hifi reads
MIT License

High memory usage #121

Closed (GuillaumeHolley closed 3 years ago)

GuillaumeHolley commented 3 years ago

Hi,

I know that Hifiasm is designed for PacBio HiFi reads, but I am running an experiment in which I try to assemble Illumina-corrected ONT reads with Hifiasm v0.15. The reads were corrected de novo with the upcoming version of Ratatosk, and I am confident that, overall, the corrected reads do not mix haplotypes. The dataset is about 20 million reads, ~80x coverage in total but "only" ~40x aligned coverage. The aligning reads have an N50 of ~50 kb and an error rate of ~1.3 to 1.6%. I have used Hifiasm on corrected ONT reads in the past to assemble small chromosomal regions (a couple of Mbp) and it worked fine.

I am trying to assemble the full dataset with 48 threads on a machine with 350 GB of RAM, but I am running out of memory. The log file follows; it seems that Hifiasm already uses all 350 GB of RAM during the first round of correction. Also, during that round of correction, I get the message "error" about 130,000 times. This has happened before when assembling smaller regions, but Hifiasm still produced an assembly.

I was wondering if there is anything I can do to lower the memory usage to assemble the dataset or if it is doomed to fail because of the nature of my reads?

Thank you for the help. Guillaume

[M::ha_analyze_count] lowest: count[10] = 7758025
[M::ha_analyze_count] highest: count[48] = 82160043
[M::ha_hist_line]     2: ****************************************************************************************************> 442141597
[M::ha_hist_line]     3: ************************************************************************************************** 80156702
[M::ha_hist_line]     4: ********************************************** 38006910
[M::ha_hist_line]     5: *************************** 22440590
[M::ha_hist_line]     6: ******************* 15224089
[M::ha_hist_line]     7: ************** 11422276
[M::ha_hist_line]     8: *********** 9317194
[M::ha_hist_line]     9: ********** 8208961
[M::ha_hist_line]    10: ********* 7758025
[M::ha_hist_line]    11: ********** 7823334
[M::ha_hist_line]    12: ********** 8348090
[M::ha_hist_line]    13: *********** 9333895
[M::ha_hist_line]    14: ************* 10732261
[M::ha_hist_line]    15: *************** 12594731
[M::ha_hist_line]    16: ****************** 14769535
[M::ha_hist_line]    17: ********************* 17132273
[M::ha_hist_line]    18: ************************ 19364130
[M::ha_hist_line]    19: ************************** 21372384
[M::ha_hist_line]    20: **************************** 23130193
[M::ha_hist_line]    21: ****************************** 24355852
[M::ha_hist_line]    22: ****************************** 24815153
[M::ha_hist_line]    23: ****************************** 24623511
[M::ha_hist_line]    24: ***************************** 23909720
[M::ha_hist_line]    25: **************************** 22752240
[M::ha_hist_line]    26: ************************** 21371608
[M::ha_hist_line]    27: ************************ 19791470
[M::ha_hist_line]    28: *********************** 18516910
[M::ha_hist_line]    29: ********************* 17384828
[M::ha_hist_line]    30: ******************** 16831954
[M::ha_hist_line]    31: ********************* 16894589
[M::ha_hist_line]    32: ********************** 17704163
[M::ha_hist_line]    33: *********************** 19241655
[M::ha_hist_line]    34: ************************** 21623513
[M::ha_hist_line]    35: ****************************** 24887341
[M::ha_hist_line]    36: *********************************** 28881535
[M::ha_hist_line]    37: ***************************************** 33856561
[M::ha_hist_line]    38: ************************************************ 39207320
[M::ha_hist_line]    39: ******************************************************* 45333561
[M::ha_hist_line]    40: *************************************************************** 51609872
[M::ha_hist_line]    41: ********************************************************************** 57869179
[M::ha_hist_line]    42: ****************************************************************************** 63954457
[M::ha_hist_line]    43: ************************************************************************************* 69430469
[M::ha_hist_line]    44: ******************************************************************************************* 74381172
[M::ha_hist_line]    45: *********************************************************************************************** 78152719
[M::ha_hist_line]    46: *************************************************************************************************** 81067119
[M::ha_hist_line]    47: **************************************************************************************************** 82108165
[M::ha_hist_line]    48: **************************************************************************************************** 82160043
[M::ha_hist_line]    49: *************************************************************************************************** 81052544
[M::ha_hist_line]    50: ************************************************************************************************ 78537744
[M::ha_hist_line]    51: ******************************************************************************************* 75048693
[M::ha_hist_line]    52: ************************************************************************************** 70454695
[M::ha_hist_line]    53: ******************************************************************************** 65362501
[M::ha_hist_line]    54: ************************************************************************* 59612417
[M::ha_hist_line]    55: ***************************************************************** 53606710
[M::ha_hist_line]    56: ********************************************************** 47806487
[M::ha_hist_line]    57: *************************************************** 41822986
[M::ha_hist_line]    58: ******************************************** 35902279
[M::ha_hist_line]    59: ************************************* 30361396
[M::ha_hist_line]    60: ******************************* 25557265
[M::ha_hist_line]    61: ************************** 21100838
[M::ha_hist_line]    62: ********************* 17242155
[M::ha_hist_line]    63: ***************** 13870592
[M::ha_hist_line]    64: ************* 11084399
[M::ha_hist_line]    65: *********** 8686506
[M::ha_hist_line]    66: ******** 6770822
[M::ha_hist_line]    67: ****** 5278689
[M::ha_hist_line]    68: ***** 4191248
[M::ha_hist_line]    69: **** 3247093
[M::ha_hist_line]    70: *** 2505646
[M::ha_hist_line]    71: ** 1954897
[M::ha_hist_line]    72: ** 1568003
[M::ha_hist_line]    73: ** 1265900
[M::ha_hist_line]    74: * 1027136
[M::ha_hist_line]    75: * 864613
[M::ha_hist_line]    76: * 745771
[M::ha_hist_line]    77: * 673646
[M::ha_hist_line]    78: * 621267
[M::ha_hist_line]    79: * 567285
[M::ha_hist_line]    80: * 538274
[M::ha_hist_line]    81: * 521044
[M::ha_hist_line]    82: * 508758
[M::ha_hist_line]    83: * 503951
[M::ha_hist_line]    84: * 493845
[M::ha_hist_line]    85: * 492229
[M::ha_hist_line]    86: * 487597
[M::ha_hist_line]    87: * 489866
[M::ha_hist_line]    88: * 481648
[M::ha_hist_line]    89: * 479197
[M::ha_hist_line]    90: * 477572
[M::ha_hist_line]    91: * 467953
[M::ha_hist_line]    92: * 465395
[M::ha_hist_line]    93: * 462348
[M::ha_hist_line]    94: * 459293
[M::ha_hist_line]    95: * 448050
[M::ha_hist_line]    96: * 436995
[M::ha_hist_line]    97: * 427589
[M::ha_hist_line]    98: * 413564
[M::ha_hist_line]  rest: ********************** 17693446
[M::ha_analyze_count] left: count[22] = 24815153
[M::ha_analyze_count] right: none
[M::ha_ft_gen] peak_hom: 48; peak_het: 22
[M::ha_ft_gen::3556.272*8.40@49.724GB] ==> filtered out 3612836 k-mers occurring 240 or more times
[M::ha_opt_update_cov] updated max_n_chain to 240
[M::ha_pt_gen::4413.320*9.12] ==> counted 807761348 distinct minimizer k-mers
[M::ha_pt_gen] count[4095] = 0 (for sanity check)
[M::ha_analyze_count] lowest: count[12] = 574511
[M::ha_analyze_count] highest: count[47] = 3122369
[M::ha_hist_line]     1: ****************************************************************************************************> 696266294
[M::ha_hist_line]     2: ****************************************************************************************************> 15035820
[M::ha_hist_line]     3: ****************************************************************************************************> 5292700
[M::ha_hist_line]     4: **************************************************************************************** 2744763
[M::ha_hist_line]     5: ******************************************************* 1715010
[M::ha_hist_line]     6: *************************************** 1207643
[M::ha_hist_line]     7: ****************************** 921633
[M::ha_hist_line]     8: ************************ 754063
[M::ha_hist_line]     9: ********************* 654668
[M::ha_hist_line]    10: ******************* 598305
[M::ha_hist_line]    11: ****************** 575239
[M::ha_hist_line]    12: ****************** 574511
[M::ha_hist_line]    13: ******************* 601082
[M::ha_hist_line]    14: ********************* 647873
[M::ha_hist_line]    15: *********************** 715560
[M::ha_hist_line]    16: ************************** 796727
[M::ha_hist_line]    17: **************************** 884211
[M::ha_hist_line]    18: ******************************* 968393
[M::ha_hist_line]    19: ********************************* 1042052
[M::ha_hist_line]    20: *********************************** 1106246
[M::ha_hist_line]    21: ************************************* 1144028
[M::ha_hist_line]    22: ************************************* 1152352
[M::ha_hist_line]    23: ************************************ 1135127
[M::ha_hist_line]    24: *********************************** 1096361
[M::ha_hist_line]    25: ********************************* 1040224
[M::ha_hist_line]    26: ******************************* 978436
[M::ha_hist_line]    27: ***************************** 907215
[M::ha_hist_line]    28: *************************** 850550
[M::ha_hist_line]    29: ************************** 801017
[M::ha_hist_line]    30: ************************* 774461
[M::ha_hist_line]    31: ************************* 772795
[M::ha_hist_line]    32: ************************** 801037
[M::ha_hist_line]    33: *************************** 856028
[M::ha_hist_line]    34: ****************************** 946857
[M::ha_hist_line]    35: ********************************** 1069260
[M::ha_hist_line]    36: *************************************** 1219913
[M::ha_hist_line]    37: ********************************************* 1406447
[M::ha_hist_line]    38: *************************************************** 1604739
[M::ha_hist_line]    39: *********************************************************** 1832608
[M::ha_hist_line]    40: ****************************************************************** 2065179
[M::ha_hist_line]    41: ************************************************************************* 2290559
[M::ha_hist_line]    42: ******************************************************************************** 2511279
[M::ha_hist_line]    43: *************************************************************************************** 2706673
[M::ha_hist_line]    44: ******************************************************************************************** 2879949
[M::ha_hist_line]    45: ************************************************************************************************ 3006942
[M::ha_hist_line]    46: *************************************************************************************************** 3099169
[M::ha_hist_line]    47: **************************************************************************************************** 3122369
[M::ha_hist_line]    48: **************************************************************************************************** 3112289
[M::ha_hist_line]    49: ************************************************************************************************** 3056652
[M::ha_hist_line]    50: ********************************************************************************************** 2948595
[M::ha_hist_line]    51: ****************************************************************************************** 2805119
[M::ha_hist_line]    52: ************************************************************************************ 2625221
[M::ha_hist_line]    53: ****************************************************************************** 2426268
[M::ha_hist_line]    54: *********************************************************************** 2207895
[M::ha_hist_line]    55: *************************************************************** 1979054
[M::ha_hist_line]    56: ******************************************************** 1760375
[M::ha_hist_line]    57: ************************************************* 1532550
[M::ha_hist_line]    58: ****************************************** 1314339
[M::ha_hist_line]    59: ************************************ 1108510
[M::ha_hist_line]    60: ****************************** 933955
[M::ha_hist_line]    61: ************************* 768810
[M::ha_hist_line]    62: ******************** 626999
[M::ha_hist_line]    63: **************** 505772
[M::ha_hist_line]    64: ************* 404811
[M::ha_hist_line]    65: ********** 318735
[M::ha_hist_line]    66: ******** 249356
[M::ha_hist_line]    67: ****** 195918
[M::ha_hist_line]    68: ***** 156310
[M::ha_hist_line]    69: **** 123430
[M::ha_hist_line]    70: *** 96081
[M::ha_hist_line]    71: ** 76614
[M::ha_hist_line]    72: ** 62777
[M::ha_hist_line]    73: ** 51548
[M::ha_hist_line]    74: * 43163
[M::ha_hist_line]    75: * 37158
[M::ha_hist_line]    76: * 32988
[M::ha_hist_line]    77: * 29994
[M::ha_hist_line]    78: * 27721
[M::ha_hist_line]    79: * 25533
[M::ha_hist_line]    80: * 24690
[M::ha_hist_line]    81: * 23699
[M::ha_hist_line]    82: * 23204
[M::ha_hist_line]    83: * 22863
[M::ha_hist_line]    84: * 22156
[M::ha_hist_line]    85: * 22183
[M::ha_hist_line]    86: * 21635
[M::ha_hist_line]    87: * 21326
[M::ha_hist_line]    88: * 20970
[M::ha_hist_line]    89: * 20572
[M::ha_hist_line]    90: * 20327
[M::ha_hist_line]    91: * 20057
[M::ha_hist_line]    92: * 19767
[M::ha_hist_line]    93: * 19278
[M::ha_hist_line]    94: * 19117
[M::ha_hist_line]    95: * 18368
[M::ha_hist_line]    96: * 17825
[M::ha_hist_line]    97: * 17454
[M::ha_hist_line]    98: * 16829
[M::ha_hist_line]    99: * 16501
[M::ha_hist_line]   100: * 16169
[M::ha_hist_line]  rest: ***************** 517451
[M::ha_analyze_count] left: count[22] = 1152352
[M::ha_analyze_count] right: none
[M::ha_pt_gen] peak_hom: 47; peak_het: 22
error
error
error
["error" message occurring about 130,000 times]
error
error
error
[M::ha_pt_gen::4879.190*10.43] ==> indexed 3621762853 positions
[M::ha_assemble::14405.740*34.89@352.250GB] ==> corrected reads for round 1
[M::ha_assemble] # bases: 161234597598; # corrected bases: 124934095; # recorrected bases: 711121
[M::ha_assemble] size of buffer: 250.149GB
[M::ha_pt_gen::14765.116*34.67] ==> counted 792374848 distinct minimizer k-mers
[M::ha_pt_gen] count[4095] = 0 (for sanity check)
[M::ha_analyze_count] lowest: count[12] = 504012
[M::ha_analyze_count] highest: count[48] = 3231068
[M::ha_hist_line]     1: ****************************************************************************************************> 679187975
[M::ha_hist_line]     2: ****************************************************************************************************> 15593354
[M::ha_hist_line]     3: ****************************************************************************************************> 5627651
[M::ha_hist_line]     4: ****************************************************************************************** 2913119
[M::ha_hist_line]     5: ******************************************************** 1800787
[M::ha_hist_line]     6: ************************************** 1242709
[M::ha_hist_line]     7: ***************************** 929609
[M::ha_hist_line]     8: *********************** 740586
[M::ha_hist_line]     9: ******************* 625320
[M::ha_hist_line]    10: ***************** 553248
[M::ha_hist_line]    11: **************** 517242
[M::ha_hist_line]    12: **************** 504012
[M::ha_hist_line]    13: **************** 516937
[M::ha_hist_line]    14: ***************** 552243
[M::ha_hist_line]    15: ******************* 608360
[M::ha_hist_line]    16: ********************* 678682
[M::ha_hist_line]    17: ************************ 766900
[M::ha_hist_line]    18: ************************** 850273
[M::ha_hist_line]    19: ***************************** 933702
[M::ha_hist_line]    20: ******************************* 1008290
[M::ha_hist_line]    21: ********************************* 1059176
[M::ha_hist_line]    22: ********************************** 1088181
[M::ha_hist_line]    23: ********************************** 1089030
[M::ha_hist_line]    24: ********************************* 1062391
[M::ha_hist_line]    25: ******************************** 1021730
[M::ha_hist_line]    26: ****************************** 963330
[M::ha_hist_line]    27: **************************** 898670
[M::ha_hist_line]    28: ************************** 840643
[M::ha_hist_line]    29: ************************ 784636
[M::ha_hist_line]    30: *********************** 750053
[M::ha_hist_line]    31: *********************** 738070
[M::ha_hist_line]    32: *********************** 755759
[M::ha_hist_line]    33: ************************* 799970
[M::ha_hist_line]    34: *************************** 880617
[M::ha_hist_line]    35: ******************************* 997133
[M::ha_hist_line]    36: *********************************** 1145531
[M::ha_hist_line]    37: ***************************************** 1332160
[M::ha_hist_line]    38: ************************************************ 1535536
[M::ha_hist_line]    39: ******************************************************* 1774237
[M::ha_hist_line]    40: ************************************************************** 2018704
[M::ha_hist_line]    41: ********************************************************************** 2259500
[M::ha_hist_line]    42: ***************************************************************************** 2498880
[M::ha_hist_line]    43: ************************************************************************************ 2718200
[M::ha_hist_line]    44: ****************************************************************************************** 2913380
[M::ha_hist_line]    45: *********************************************************************************************** 3063580
[M::ha_hist_line]    46: ************************************************************************************************** 3179126
[M::ha_hist_line]    47: **************************************************************************************************** 3223538
[M::ha_hist_line]    48: **************************************************************************************************** 3231068
[M::ha_hist_line]    49: *************************************************************************************************** 3187041
[M::ha_hist_line]    50: ************************************************************************************************ 3093030
[M::ha_hist_line]    51: ******************************************************************************************** 2957741
[M::ha_hist_line]    52: ************************************************************************************** 2776581
[M::ha_hist_line]    53: ******************************************************************************** 2577055
[M::ha_hist_line]    54: ************************************************************************* 2356653
[M::ha_hist_line]    55: ****************************************************************** 2122725
[M::ha_hist_line]    56: *********************************************************** 1895337
[M::ha_hist_line]    57: *************************************************** 1658196
[M::ha_hist_line]    58: ******************************************** 1426066
[M::ha_hist_line]    59: ************************************* 1207286
[M::ha_hist_line]    60: ******************************** 1020593
[M::ha_hist_line]    61: ************************** 841679
[M::ha_hist_line]    62: ********************* 688328
[M::ha_hist_line]    63: ***************** 556689
[M::ha_hist_line]    64: ************** 445758
[M::ha_hist_line]    65: *********** 351147
[M::ha_hist_line]    66: ********* 275693
[M::ha_hist_line]    67: ******* 215499
[M::ha_hist_line]    68: ***** 172209
[M::ha_hist_line]    69: **** 134938
[M::ha_hist_line]    70: *** 105603
[M::ha_hist_line]    71: *** 84126
[M::ha_hist_line]    72: ** 68056
[M::ha_hist_line]    73: ** 56437
[M::ha_hist_line]    74: * 47199
[M::ha_hist_line]    75: * 40346
[M::ha_hist_line]    76: * 35326
[M::ha_hist_line]    77: * 32530
[M::ha_hist_line]    78: * 30103
[M::ha_hist_line]    79: * 28060
[M::ha_hist_line]    80: * 26322
[M::ha_hist_line]    81: * 25728
[M::ha_hist_line]    82: * 24825
[M::ha_hist_line]    83: * 24432
[M::ha_hist_line]    84: * 24453
[M::ha_hist_line]    85: * 23770
[M::ha_hist_line]    86: * 23670
[M::ha_hist_line]    87: * 23738
[M::ha_hist_line]    88: * 23381
[M::ha_hist_line]    89: * 23144
[M::ha_hist_line]    90: * 22941
[M::ha_hist_line]    91: * 22549
[M::ha_hist_line]    92: * 22532
[M::ha_hist_line]    93: * 22018
[M::ha_hist_line]    94: * 21585
[M::ha_hist_line]    95: * 21551
[M::ha_hist_line]    96: * 20542
[M::ha_hist_line]    97: * 20266
[M::ha_hist_line]    98: * 19845
[M::ha_hist_line]    99: * 19159
[M::ha_hist_line]   100: * 18641
[M::ha_hist_line]   101: * 17838
[M::ha_hist_line]   102: * 17341
[M::ha_hist_line]   103: * 16782
[M::ha_hist_line]   104: * 16189
[M::ha_hist_line]  rest: ****************** 569748
[M::ha_analyze_count] left: count[23] = 1089030
[M::ha_analyze_count] right: none
[M::ha_pt_gen] peak_hom: 48; peak_het: 23
chhylp123 commented 3 years ago

Just curious: which type of error message did you get?

GuillaumeHolley commented 3 years ago

Just "error", nothing else.

chhylp123 commented 3 years ago

Could you please show the whole log file? Thanks a lot.

GuillaumeHolley commented 3 years ago

Sure, but it is exactly the same as the log above, plus the ~130,000 "error" messages. Attached: hifiasm.log

chhylp123 commented 3 years ago

Thanks. Do the corrected reads contain Ns?

GuillaumeHolley commented 3 years ago

They do, yes. They also contain other IUPAC characters (such as R, Y, M, etc.) at some read positions. These represent the possible bases of what is believed to be a SNP when the base couldn't be corrected with enough certainty.

chhylp123 commented 3 years ago

Although hifiasm should support non-ATCG characters, we haven't tested this carefully, since there are no Ns in HiFi reads. Could you please first try randomly replacing the non-ATCG characters in the corrected reads?
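
A minimal sketch of that kind of replacement, assuming uncompressed FASTQ with strict four-line records (no wrapped sequences). The names `replace_ambiguous` and `clean_fastq` are hypothetical and not part of hifiasm or Ratatosk. Only sequence lines are modified, so quality strings keep their length.

```python
import random
import re

# Anything that is not A, C, G or T (either case) counts as ambiguous,
# which covers N as well as IUPAC codes such as R, Y and M.
NON_ACGT = re.compile(r"[^ACGTacgt]")


def replace_ambiguous(seq, rng):
    """Replace every non-ACGT character in seq with a uniformly random base."""
    return NON_ACGT.sub(lambda _match: rng.choice("ACGT"), seq)


def clean_fastq(lines, seed=0):
    """Yield FASTQ lines with ambiguous bases replaced in sequence lines only.

    Assumption: each record is exactly four lines
    (header, sequence, '+', qualities).
    """
    rng = random.Random(seed)  # fixed seed so a rerun gives the same output
    for i, line in enumerate(lines):
        if i % 4 == 1:
            # Strip the line ending first so it is not treated as a base.
            yield replace_ambiguous(line.rstrip("\r\n"), rng) + "\n"
        else:
            yield line
```

To process a file, the generator can be fed an open text handle and its output written line by line to a new file. For gzip-compressed input, `gzip.open(path, "rt")` from the standard library gives a compatible handle.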

GuillaumeHolley commented 3 years ago

I'll try and let you know asap. Thanks.

GuillaumeHolley commented 3 years ago

Hifiasm was killed again by SLURM because it exceeded the 350 GB of RAM I had allocated on the machine. I ran the same command as before on the same dataset, except that all non-{A,C,G,T} characters in the FASTQ records were replaced with a random character from {A,C,G,T}. From the log (below), it seems Hifiasm did not even finish the first round of correction this time.
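
For comparing runs like these, the timestamped lines of a hifiasm log can be pulled out with a short script. This is only a sketch: it assumes the bracketed prefix has the form `[M::function::seconds*cpu_ratio@peakGB]`, with the memory field optional, as in the log lines shown in this thread. Reading the three numbers as elapsed wall-clock seconds, CPU-to-wall ratio and peak memory is an assumption, and `parse_progress` is a hypothetical helper, not part of hifiasm.

```python
import re

# Matches prefixes such as "[M::ha_assemble::14405.740*34.89@352.250GB]".
# The "@...GB" part is optional because some lines in the log omit it.
STAMP = re.compile(
    r"^\[M::(?P<func>\w+)::(?P<wall>\d+\.\d+)\*(?P<cpu>\d+\.\d+)"
    r"(?:@(?P<gb>\d+\.\d+)GB)?\]\s*(?P<msg>.*)$"
)


def parse_progress(lines):
    """Return one dict per timestamped log line; all other lines are skipped."""
    events = []
    for line in lines:
        match = STAMP.match(line.strip())
        if match is None:
            continue  # histogram rows, bare "error" lines, etc.
        gb = match.group("gb")
        events.append({
            "func": match.group("func"),
            "wall_s": float(match.group("wall")),
            "cpu_ratio": float(match.group("cpu")),
            "peak_gb": float(gb) if gb else None,
            "msg": match.group("msg"),
        })
    return events
```

The last event in the returned list shows how far a run got before it was killed, and the largest `peak_gb` value gives the highest memory figure the log reported up to that point.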

[M::ha_analyze_count] lowest: count[11] = 7744672
[M::ha_analyze_count] highest: count[48] = 82574295
[M::ha_hist_line]     2: ****************************************************************************************************> 525831591
[M::ha_hist_line]     3: ****************************************************************************************************> 98586033
[M::ha_hist_line]     4: ******************************************************* 45309672
[M::ha_hist_line]     5: ******************************* 25937672
[M::ha_hist_line]     6: ********************* 17094273
[M::ha_hist_line]     7: *************** 12505807
[M::ha_hist_line]     8: ************ 9928293
[M::ha_hist_line]     9: ********** 8512044
[M::ha_hist_line]    10: ********** 7847142
[M::ha_hist_line]    11: ********* 7744672
[M::ha_hist_line]    12: ********** 8110639
[M::ha_hist_line]    13: *********** 8963010
[M::ha_hist_line]    14: ************ 10272782
[M::ha_hist_line]    15: *************** 12032314
[M::ha_hist_line]    16: ***************** 14170362
[M::ha_hist_line]    17: ******************** 16515820
[M::ha_hist_line]    18: *********************** 18777650
[M::ha_hist_line]    19: ************************* 20912610
[M::ha_hist_line]    20: **************************** 22753857
[M::ha_hist_line]    21: ***************************** 24096074
[M::ha_hist_line]    22: ****************************** 24696654
[M::ha_hist_line]    23: ****************************** 24635983
[M::ha_hist_line]    24: ***************************** 24005578
[M::ha_hist_line]    25: **************************** 22916024
[M::ha_hist_line]    26: ************************** 21543870
[M::ha_hist_line]    27: ************************ 20003266
[M::ha_hist_line]    28: *********************** 18696172
[M::ha_hist_line]    29: ********************* 17517752
[M::ha_hist_line]    30: ******************** 16916949
[M::ha_hist_line]    31: ******************** 16906348
[M::ha_hist_line]    32: ********************* 17651708
[M::ha_hist_line]    33: *********************** 19150408
[M::ha_hist_line]    34: ************************** 21485641
[M::ha_hist_line]    35: ****************************** 24729581
[M::ha_hist_line]    36: *********************************** 28720902
[M::ha_hist_line]    37: ***************************************** 33712431
[M::ha_hist_line]    38: *********************************************** 39111564
[M::ha_hist_line]    39: ******************************************************* 45256505
[M::ha_hist_line]    40: ************************************************************** 51606675
[M::ha_hist_line]    41: ********************************************************************** 57920983
[M::ha_hist_line]    42: ****************************************************************************** 64072532
[M::ha_hist_line]    43: ************************************************************************************ 69601704
[M::ha_hist_line]    44: ****************************************************************************************** 74639232
[M::ha_hist_line]    45: *********************************************************************************************** 78475200
[M::ha_hist_line]    46: *************************************************************************************************** 81434859
[M::ha_hist_line]    47: **************************************************************************************************** 82495904
[M::ha_hist_line]    48: **************************************************************************************************** 82574295
[M::ha_hist_line]    49: *************************************************************************************************** 81497494
[M::ha_hist_line]    50: ************************************************************************************************ 78983528
[M::ha_hist_line]    51: ******************************************************************************************* 75491507
[M::ha_hist_line]    52: ************************************************************************************** 70856273
[M::ha_hist_line]    53: ******************************************************************************** 65754188
[M::ha_hist_line]    54: ************************************************************************* 59970792
[M::ha_hist_line]    55: ***************************************************************** 53942106
[M::ha_hist_line]    56: ********************************************************** 48118946
[M::ha_hist_line]    57: *************************************************** 42089855
[M::ha_hist_line]    58: ******************************************** 36134978
[M::ha_hist_line]    59: ************************************* 30564067
[M::ha_hist_line]    60: ******************************* 25716209
[M::ha_hist_line]    61: ************************** 21234157
[M::ha_hist_line]    62: ********************* 17357108
[M::ha_hist_line]    63: ***************** 13964536
[M::ha_hist_line]    64: ************** 11156835
[M::ha_hist_line]    65: *********** 8741363
[M::ha_hist_line]    66: ******** 6817123
[M::ha_hist_line]    67: ****** 5312997
[M::ha_hist_line]    68: ***** 4216606
[M::ha_hist_line]    69: **** 3262675
[M::ha_hist_line]    70: *** 2519520
[M::ha_hist_line]    71: ** 1964355
[M::ha_hist_line]    72: ** 1574208
[M::ha_hist_line]    73: ** 1277102
[M::ha_hist_line]    74: * 1038392
[M::ha_hist_line]    75: * 875781
[M::ha_hist_line]    76: * 756140
[M::ha_hist_line]    77: * 680092
[M::ha_hist_line]    78: * 624944
[M::ha_hist_line]    79: * 574378
[M::ha_hist_line]    80: * 546606
[M::ha_hist_line]    81: * 530547
[M::ha_hist_line]    82: * 515809
[M::ha_hist_line]    83: * 514652
[M::ha_hist_line]    84: * 506343
[M::ha_hist_line]    85: * 505083
[M::ha_hist_line]    86: * 499401
[M::ha_hist_line]    87: * 500704
[M::ha_hist_line]    88: * 492805
[M::ha_hist_line]    89: * 491172
[M::ha_hist_line]    90: * 490893
[M::ha_hist_line]    91: * 483662
[M::ha_hist_line]    92: * 479582
[M::ha_hist_line]    93: * 476058
[M::ha_hist_line]    94: * 474387
[M::ha_hist_line]    95: * 463877
[M::ha_hist_line]    96: * 451986
[M::ha_hist_line]    97: * 437201
[M::ha_hist_line]    98: * 427374
[M::ha_hist_line]    99: * 419575
[M::ha_hist_line]  rest: ********************** 17975295
[M::ha_analyze_count] left: count[22] = 24696654
[M::ha_analyze_count] right: none
[M::ha_ft_gen] peak_hom: 48; peak_het: 22
[M::ha_ft_gen::3572.163*8.46@49.663GB] ==> filtered out 3802165 k-mers occurring 240 or more times
[M::ha_opt_update_cov] updated max_n_chain to 240
[M::ha_pt_gen::6056.646*6.83] ==> counted 837887753 distinct minimizer k-mers
[M::ha_pt_gen] count[4095] = 0 (for sanity check)
[M::ha_analyze_count] lowest: count[12] = 555135
[M::ha_analyze_count] highest: count[47] = 3172259
[M::ha_hist_line]     1: ****************************************************************************************************> 720574074
[M::ha_hist_line]     2: ****************************************************************************************************> 18232359
[M::ha_hist_line]     3: ****************************************************************************************************> 6389010
[M::ha_hist_line]     4: ****************************************************************************************************> 3248017
[M::ha_hist_line]     5: *************************************************************** 1983272
[M::ha_hist_line]     6: ******************************************* 1359737
[M::ha_hist_line]     7: ******************************** 1013378
[M::ha_hist_line]     8: ************************* 805900
[M::ha_hist_line]     9: ********************* 680820
[M::ha_hist_line]    10: ******************* 605644
[M::ha_hist_line]    11: ****************** 566931
[M::ha_hist_line]    12: ***************** 555135
[M::ha_hist_line]    13: ****************** 571208
[M::ha_hist_line]    14: ******************* 611662
[M::ha_hist_line]    15: ********************* 673500
[M::ha_hist_line]    16: ************************ 752605
[M::ha_hist_line]    17: ************************** 839978
[M::ha_hist_line]    18: ***************************** 927115
[M::ha_hist_line]    19: ******************************** 1008450
[M::ha_hist_line]    20: ********************************** 1076761
[M::ha_hist_line]    21: *********************************** 1123232
[M::ha_hist_line]    22: ************************************ 1140327
[M::ha_hist_line]    23: ************************************ 1127548
[M::ha_hist_line]    24: *********************************** 1094856
[M::ha_hist_line]    25: ********************************* 1041454
[M::ha_hist_line]    26: ******************************* 977862
[M::ha_hist_line]    27: ***************************** 905876
[M::ha_hist_line]    28: *************************** 845695
[M::ha_hist_line]    29: ************************* 791784
[M::ha_hist_line]    30: ************************ 760002
[M::ha_hist_line]    31: ************************ 751380
[M::ha_hist_line]    32: ************************ 774771
[M::ha_hist_line]    33: ************************** 826029
[M::ha_hist_line]    34: ***************************** 914077
[M::ha_hist_line]    35: ********************************* 1036138
[M::ha_hist_line]    36: ************************************* 1185911
[M::ha_hist_line]    37: ******************************************* 1376094
[M::ha_hist_line]    38: ************************************************** 1579936
[M::ha_hist_line]    39: ********************************************************* 1813056
[M::ha_hist_line]    40: ***************************************************************** 2052085
[M::ha_hist_line]    41: ************************************************************************ 2286665
[M::ha_hist_line]    42: ******************************************************************************* 2515801
[M::ha_hist_line]    43: ************************************************************************************** 2721922
[M::ha_hist_line]    44: ******************************************************************************************** 2904246
[M::ha_hist_line]    45: ************************************************************************************************ 3039956
[M::ha_hist_line]    46: *************************************************************************************************** 3143469
[M::ha_hist_line]    47: **************************************************************************************************** 3172259
[M::ha_hist_line]    48: **************************************************************************************************** 3166429
[M::ha_hist_line]    49: ************************************************************************************************** 3116789
[M::ha_hist_line]    50: *********************************************************************************************** 3010920
[M::ha_hist_line]    51: ****************************************************************************************** 2869612
[M::ha_hist_line]    52: ************************************************************************************* 2687658
[M::ha_hist_line]    53: ****************************************************************************** 2488337
[M::ha_hist_line]    54: *********************************************************************** 2265873
[M::ha_hist_line]    55: **************************************************************** 2034707
[M::ha_hist_line]    56: ********************************************************* 1811504
[M::ha_hist_line]    57: ************************************************** 1579882
[M::ha_hist_line]    58: ******************************************* 1355393
[M::ha_hist_line]    59: ************************************ 1143865
[M::ha_hist_line]    60: ****************************** 963999
[M::ha_hist_line]    61: ************************* 793906
[M::ha_hist_line]    62: ******************** 648352
[M::ha_hist_line]    63: **************** 523415
[M::ha_hist_line]    64: ************* 418849
[M::ha_hist_line]    65: ********** 329048
[M::ha_hist_line]    66: ******** 257995
[M::ha_hist_line]    67: ****** 203208
[M::ha_hist_line]    68: ***** 161818
[M::ha_hist_line]    69: **** 127277
[M::ha_hist_line]    70: *** 99414
[M::ha_hist_line]    71: ** 79032
[M::ha_hist_line]    72: ** 64909
[M::ha_hist_line]    73: ** 53734
[M::ha_hist_line]    74: * 45327
[M::ha_hist_line]    75: * 38661
[M::ha_hist_line]    76: * 33905
[M::ha_hist_line]    77: * 31268
[M::ha_hist_line]    78: * 29156
[M::ha_hist_line]    79: * 26881
[M::ha_hist_line]    80: * 25597
[M::ha_hist_line]    81: * 25046
[M::ha_hist_line]    82: * 24372
[M::ha_hist_line]    83: * 24032
[M::ha_hist_line]    84: * 23556
[M::ha_hist_line]    85: * 23058
[M::ha_hist_line]    86: * 22933
[M::ha_hist_line]    87: * 22597
[M::ha_hist_line]    88: * 22443
[M::ha_hist_line]    89: * 22281
[M::ha_hist_line]    90: * 21841
[M::ha_hist_line]    91: * 21504
[M::ha_hist_line]    92: * 21471
[M::ha_hist_line]    93: * 20904
[M::ha_hist_line]    94: * 20760
[M::ha_hist_line]    95: * 20165
[M::ha_hist_line]    96: * 19381
[M::ha_hist_line]    97: * 19186
[M::ha_hist_line]    98: * 18593
[M::ha_hist_line]    99: * 18170
[M::ha_hist_line]   100: * 17752
[M::ha_hist_line]   101: * 17047
[M::ha_hist_line]   102: * 16309
[M::ha_hist_line]  rest: ****************** 559645
[M::ha_analyze_count] left: count[22] = 1140327
[M::ha_analyze_count] right: none
[M::ha_pt_gen] peak_hom: 47; peak_het: 22
[M::ha_pt_gen::6525.420*8.01] ==> indexed 3688045147 positions
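
For readers who want to inspect a histogram like the one above outside of hifiasm, here is a minimal Python sketch that parses `[M::ha_hist_line]` rows and reports simple local maxima. The function names and the `min_occ` cutoff are hypothetical, and the local-maximum scan is only an illustration, not hifiasm's own peak-detection logic. The sample rows are copied from the log above.

```python
import re

# Matches rows such as "[M::ha_hist_line]    22: ******* 1140327".
# The "rest:" row has no numeric occurrence and is skipped on purpose.
HIST_RE = re.compile(r"\[M::ha_hist_line\]\s+(\d+):\s+\*+>?\s+(\d+)")

SAMPLE = """\
[M::ha_hist_line]    21: *********************************** 1123232
[M::ha_hist_line]    22: ************************************ 1140327
[M::ha_hist_line]    23: ************************************ 1127548
[M::ha_hist_line]    46: *************************************************************************************************** 3143469
[M::ha_hist_line]    47: **************************************************************************************************** 3172259
[M::ha_hist_line]    48: **************************************************************************************************** 3166429
[M::ha_hist_line]  rest: ****************** 559645
"""

def parse_hist(log_text):
    """Return {occurrence: count} for every numeric histogram row."""
    return {int(m.group(1)): int(m.group(2)) for m in HIST_RE.finditer(log_text)}

def find_peaks(hist, min_occ=5):
    """Return occurrences whose count exceeds both direct neighbours.

    min_occ (an assumed cutoff) skips the low-occurrence error k-mers.
    """
    peaks = []
    for occ in sorted(hist):
        if occ < min_occ or occ - 1 not in hist or occ + 1 not in hist:
            continue
        if hist[occ - 1] < hist[occ] > hist[occ + 1]:
            peaks.append(occ)
    return peaks

hist = parse_hist(SAMPLE)
peaks = find_peaks(hist)
print(peaks)
```

On these sample rows the scan picks out occurrences 22 and 47, which agree with the `peak_het` and `peak_hom` values reported in the log.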

chhylp123 commented 3 years ago

I have no idea... Have you tested it on a small chromosome like chr22?

GuillaumeHolley commented 3 years ago

I just did. I had ~122,000 reads overlapping chr22, in which I replaced non-DNA characters with random DNA ones. Using 24 threads, hifiasm completed the assembly in just 14 min (wall clock time) using 40 GB of RAM. The primary assembly has 86 contigs (the longest is 553 kb) and no alternate assembly was produced.

hifiasm.chr22.log
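
The character replacement mentioned above could be done along these lines. This is a minimal sketch, assuming "non-DNA characters" means anything other than A, C, G, or T (for example `N` or IUPAC ambiguity codes). The function name `replace_non_dna` is hypothetical and this is not the script actually used for the chr22 test.

```python
import random

def replace_non_dna(seq, rng=None):
    """Replace every character that is not A/C/G/T with a random base.

    Assumption: lowercase a/c/g/t are treated as valid and left untouched.
    """
    rng = rng or random.Random()
    return "".join(
        base if base in "ACGTacgt" else rng.choice("ACGT")
        for base in seq
    )

# A fixed seed makes the replacement reproducible across runs.
cleaned = replace_non_dna("ACGTNNRYACGT", rng=random.Random(42))
print(cleaned)
```

Passing a seeded `random.Random` keeps the output reproducible, which helps when comparing assemblies of the same input.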

chhylp123 commented 3 years ago

The N50 does not look good. I wonder whether there are other issues affecting both running time and contiguity.

GuillaumeHolley commented 3 years ago

Sorry for my late reply.

Some regions are too complex for Ratatosk to correct, so it's not uncommon to see uncorrected stretches in my corrected reads, up to a couple hundred bp long. The error rate in such regions is around 8-10%. Could that be the cause of the memory issues?
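
To put the two error rates in perspective, here is a rough Python sketch of the fraction of k-mers expected to be error-free at each rate. It assumes errors are independent and uniformly distributed, and it uses k = 51, which to my knowledge is hifiasm's default k-mer length (treat that as an assumption). Whether this difference explains the memory usage is not established in this thread.

```python
def error_free_kmer_fraction(error_rate, k=51):
    """Probability that a k-mer contains no error, assuming independent errors."""
    return (1.0 - error_rate) ** k

def expected_errors(error_rate, length):
    """Expected number of errors in a stretch of the given length."""
    return error_rate * length

# Rates taken from the thread: ~1.3-1.6% overall, ~8-10% in uncorrected stretches.
corrected = error_free_kmer_fraction(0.015)
uncorrected = error_free_kmer_fraction(0.09)
print(f"error-free 51-mers at 1.5%: {corrected:.3f}")
print(f"error-free 51-mers at 9%:   {uncorrected:.4f}")
print(f"expected errors in 200 bp at 9%: {expected_errors(0.09, 200):.0f}")
```

Under these assumptions, a little under half of the 51-mers in well-corrected sequence are error-free, against fewer than 1 in 100 inside an uncorrected stretch.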

chhylp123 commented 3 years ago

Yeah, I guess so, but I haven't tested it carefully.

lh3 commented 3 years ago

Are these ultra-long reads? Hifiasm may use lots of memory in that case. Overall, hifiasm doesn't work well with Nanopore data.

GuillaumeHolley commented 3 years ago

These are not ultra-long ONT reads, though the N50 is around 50 kb. Anyway, thanks for looking into it, @chhylp123 and @lh3. I thought that because the error rate of the corrected reads was on the low side, I had a shot with hifiasm. Maybe one day :) Don't hesitate to ping me if you happen to find out what is going on with the memory. Closing the issue for now.

Guillaume