isovic / racon

Ultrafast consensus module for raw de novo genome assembly of long uncorrected reads. http://genome.cshlp.org/content/early/2017/01/18/gr.214270.116 Note: This was the original repository which will no longer be officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/racon
MIT License

racon generates empty fasta output #183

Closed carolhsb closed 3 years ago

carolhsb commented 3 years ago

Hi Robert,

Me again. Racon runs without errors, but it outputs an empty fasta file. I've tried twice and got the same result. I'm using racon version 1.4.21.

This is the command I'm using:

python3 /home/carol/racon/build/bin/racon_wrapper -t 30 --split 28048207 /home/carol/short_reads_mma/concat.clean.fastq.gz ./dbgAGAIN2.illumina.sam ./dbgAGAIN2.corrected.fasta > dbgAGAIN2.corrected2.fasta

This is the log:

[RaconWrapper::run] preparing data with rampler
[RaconWrapper::run] total number of splits: 42
[RaconWrapper::run] processing data with racon
[racon::Polisher::initialize] loaded target sequences 0.289602 s

Thank you,

Carol

rvaser commented 3 years ago

Hi Carol, is that the whole log?

Best regards, Robert

carolhsb commented 3 years ago

Yes...

rvaser commented 3 years ago

It might have been killed due to insufficient memory. How much RAM do you have, and how big is the concat.clean.fastq.gz file?

carolhsb commented 3 years ago

The concat.clean.fastq.gz file is 57 GB, and I have 250 GB of RAM.

rvaser commented 3 years ago

Can you please run gzip -dc concat.clean.fastq.gz | wc -c? It will print the uncompressed file size.
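If a shell pipeline is not convenient, the same byte count can be obtained by streaming the archive through Python's standard `gzip` module. This is only a minimal sketch: the function name `uncompressed_size` and the chunk size are illustrative choices, not part of Racon or its wrapper.

```python
import gzip


def uncompressed_size(path, chunk_size=1 << 20):
    """Count the bytes of a gzip file after decompression.

    The file is read in fixed-size chunks, so memory use stays small
    even when the decompressed data is far larger than available RAM.
    """
    total = 0
    with gzip.open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:  # an empty bytes object signals end of file
                break
            total += len(chunk)
    return total
```

Calling `uncompressed_size("concat.clean.fastq.gz")` should report the same number as `gzip -dc concat.clean.fastq.gz | wc -c`, although it will likely be slower than the command-line tool on a file of this size.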

carolhsb commented 3 years ago

I ran it, and the uncompressed file is 173943628896 bytes (about 174 GB).

rvaser commented 3 years ago

I think that this Illumina run will not fit into your RAM (or it will hit swap memory and take a long time to finish), either with or without the wrapper.

carolhsb commented 3 years ago

That's odd, because I've polished an assembly with PacBio subreads (166 GB) on the same server without any problems.

rvaser commented 3 years ago

For short reads, the overhead of storing them in memory (~1.5x FASTQ) is greater than for long reads (~1x FASTQ).
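Plugging the numbers from this thread into those rough factors shows why the two runs behave differently. The sketch below is a back-of-the-envelope estimate only. It assumes the PacBio "166 G" figure is the uncompressed size in decimal gigabytes, treats the 250 GB of RAM as decimal, and ignores the memory needed for the targets, overlaps, and alignment. The helper name `estimated_read_memory_gb` is made up for illustration.

```python
def estimated_read_memory_gb(fastq_bytes, overhead_factor):
    """Approximate RAM (decimal GB) needed to hold the reads in memory."""
    return fastq_bytes * overhead_factor / 1e9


RAM_GB = 250  # RAM reported in this thread

# Uncompressed Illumina FASTQ size from `gzip -dc ... | wc -c`, ~1.5x overhead
illumina_gb = estimated_read_memory_gb(173943628896, 1.5)

# Assumption: "166 G" of PacBio subreads is the uncompressed size, ~1x overhead
pacbio_gb = estimated_read_memory_gb(166e9, 1.0)

for name, need in (("Illumina", illumina_gb), ("PacBio", pacbio_gb)):
    verdict = "fits" if need < RAM_GB else "does not fit"
    print(f"{name}: ~{need:.1f} GB needed, {verdict} in {RAM_GB} GB")
```

Under these assumptions the Illumina reads alone need roughly 261 GB, which exceeds the available 250 GB, while the PacBio reads need about 166 GB and fit.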

carolhsb commented 3 years ago

Oh, I understand now. Thank you for your time.

Best regards

rvaser commented 3 years ago

Hopefully, we will update Racon in the near future to use less memory. :)