isovic / racon

Ultrafast consensus module for raw de novo genome assembly of long uncorrected reads. http://genome.cshlp.org/content/early/2017/01/18/gr.214270.116 Note: This was the original repository which will no longer be officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/racon
MIT License

Racon Fails without error #141

Closed nhartwic closed 2 years ago

nhartwic commented 4 years ago

Basically, the title. I'm currently trying to use racon to polish some mammal draft assemblies. I have about 60x coverage and am using racon_wrapper to reduce memory usage. Racon fails without any error being issued after running for approximately 2 days. I'm encountering this issue on both racon v1.3.3 and v1.4.3. The exit status for the process is 1. Max memory used is 234G, which is only about half of what I have available. I'm invoking racon using the following command:

racon_wrapper -t 32 --split 150000000 \
    MF-122017F.fastq.gz \
    MF-122017F.flye.paf \
    MF-122017F.flye.fasta \
    > MF-122017F.flye.racon1.fasta

Any advice or assistance is appreciated.
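One way to narrow down a silent failure like this is to capture stderr and the exit status explicitly rather than relying on the scheduler's report. A hedged sketch (not part of racon; the log file name is my own choice):

```python
# Diagnostic wrapper sketch: run racon_wrapper, keep the full stderr,
# and record the exit status in a log file for later inspection.
import subprocess

cmd = [
    "racon_wrapper", "-t", "32", "--split", "150000000",
    "MF-122017F.fastq.gz",
    "MF-122017F.flye.paf",
    "MF-122017F.flye.fasta",
]

try:
    with open("MF-122017F.flye.racon1.fasta", "wb") as out:
        result = subprocess.run(cmd, stdout=out, stderr=subprocess.PIPE)
    status = result.returncode
    stderr = result.stderr.decode()
except FileNotFoundError:
    # racon_wrapper is not on PATH in this environment
    status = 127
    stderr = "racon_wrapper: command not found\n"

with open("racon.stderr.log", "w") as log:
    log.write(stderr)
    log.write(f"[exit status] {status}\n")

print("racon_wrapper exit status:", status)
```

If the process is being killed externally (e.g. by the OOM killer), the captured status and any dmesg output around that time are usually more informative than racon's own log.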

rvaser commented 4 years ago

Hello, not sure what to tell you. How large are your files? How much disk space do you have available? Is there any log you can share?

Sorry for the wait. Best regards, Robert

nhartwic commented 4 years ago

Well, I ran on 8 out of 10 of my samples without issue. For the two remaining samples, the draft assemblies are 2.4G. The paf files vary in size. One of the samples is about 30x coverage and has a 1.1G paf, while the other is 60x coverage and has a 4.0G paf.

I'm currently trying these same commands again. I can verify that the work directory is filling nicely with intermediate output files in the range of 100-150M, which fits the parameters I'm using. This happened on previous attempts too, though.

Disk space shouldn't be an issue; I have several terabytes available. I'm limited to 512G of memory, though I only seem to be using about half of it on any given run. I'll make sure I keep the log files on this test. Previous logs haven't been helpful, though. Here is just a copy+paste of the captured stderr:

    [RaconWrapper::run] preparing data with rampler
    [RaconWrapper::run] processing data with racon
    [racon::Polisher::initialize] loaded target sequences 1.045 s

IDK, it's really weird. It seems like everything is working, but the final concatenation to stdout doesn't occur, and all the intermediate files get cleaned up by the process before it exits with code 1. (The only reason I know it exits with code 1 is that exit codes are captured by the job scheduler my cluster uses.)

I want to give more information. I just don't really have any and am out of ideas for things to check.

rvaser commented 4 years ago

You can replace this with `pass` (don't forget to run cmake and make again). This will disable deletion of the working directory. You can then manually rerun the problematic chunks and perhaps capture the error message that way.
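To make the suggestion above concrete: racon_wrapper removes its working directory when it finishes, and the idea is to stub that deletion out so the per-chunk files survive a failed run. A purely illustrative sketch (the function and variable names here are my own, not the wrapper's actual code):

```python
# Illustrative only: the real racon_wrapper deletes its working
# directory on exit with a call along these lines. Replacing the
# deletion with `pass` keeps the per-chunk files for inspection.
import os
import shutil
import tempfile

work_dir = tempfile.mkdtemp(prefix="racon_")

def cleanup(keep_intermediates=True):
    if keep_intermediates:
        # the suggested edit: do nothing instead of deleting
        pass
    else:
        shutil.rmtree(work_dir)

cleanup()
print("work directory preserved:", os.path.isdir(work_dir))
```

With the directory preserved, each chunk can be rerun through racon by hand to see which one dies and with what message.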

On the other hand, which Racon version are you using?

Sorry for the wait! Best regards, Robert

nhartwic commented 4 years ago

The initial runs used racon 1.4.3 as part of a Snakemake-based pipeline. After things errored out, I switched to one of my own older environments that is more convenient to access and has racon 1.3.3 installed. I'm seeing the same behavior on both.

It seems my latest runs were killed as a result of cluster maintenance. I'll start the jobs again and get back to you in a couple days. Thanks for the patience here.

nhartwic commented 4 years ago

Ok, my previous reports were mistaken; apologies for the confusion. It seems I was wrong about how much memory the node I was on had access to. This appears to just be a memory problem. The obvious ways to reduce memory consumption are to reduce the split size or add subsampling. I can't do the former because our assembly features 100 Mb contigs. Subsampling will work, though.

I'm also curious if the number of threads has a significant memory impact.

rvaser commented 4 years ago

The number of threads should not have an impact on memory. Do you know whether the polishing stopped on the largest contigs or on smaller ones?

nhartwic commented 4 years ago

I'm not sure how to figure that out.

rvaser commented 4 years ago

Well, if subsampling works, then you do not have to do anything. If not, you can modify the wrapper script as mentioned above (https://github.com/isovic/racon/issues/141#issuecomment-536338811) and see which chunk files are polished and which are not.
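With the working-directory cleanup disabled, finding the problematic chunks amounts to comparing the chunk inputs against the polished outputs. A sketch under assumptions (the file-naming pattern below is a guess; match it to the files actually present in your work directory):

```python
# Sketch: list chunk files that have no corresponding (non-empty)
# polished output -- those are the chunks to rerun through racon by
# hand. The "*_0.fasta" / "_polished.fasta" pattern is hypothetical.
import glob
import os

work_dir = "racon_work"
os.makedirs(work_dir, exist_ok=True)  # placeholder so the sketch runs

unpolished = []
for chunk in sorted(glob.glob(os.path.join(work_dir, "*_0.fasta"))):
    polished = chunk.replace("_0.fasta", "_polished.fasta")
    if not os.path.exists(polished) or os.path.getsize(polished) == 0:
        unpolished.append(chunk)

print("chunks to rerun:", unpolished)
```

If the unpolished chunks all correspond to the regions containing the 100 Mb contigs, that would point squarely at per-window memory use on the largest contigs.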