chanzuckerberg / shasta

[MOVED] Moved to paoloshasta/shasta. De novo assembly from Oxford Nanopore reads

Runtime error exception #255

Closed by Mondlii 3 years ago

Mondlii commented 3 years ago

Hi. I am trying to use Shasta to assemble a 680 Mb genome. My reads give 260x coverage, and I am running Shasta on a shared university cluster where I do not have root or sudo privileges. My current run stopped with a runtime error saying:

" Error 14 during mremap call for MemoryMapped::Vector: Bad address."

I am running with default settings; the command I used is below. I would appreciate some assistance with this.

"shasta --threads 24 --input /projects/PromethionFlonge_Porechopped.fasta_no_contaminants.fasta --assemblyDirectory /PromethION_project"

Thank you.

paoloczi commented 3 years ago

This error message is the result of insufficient memory for this assembly. I should make that message more explanatory.

260x coverage for a 680 Mb genome is 177 Gb of reads. This would require around 1 TB of memory to assemble. How much memory does your machine have?
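The arithmetic behind these figures is simply genome size times coverage. A minimal sketch in Python, where `total_bases` is an illustrative helper name and not part of Shasta:

```python
def total_bases(genome_size_bases: int, coverage: int) -> int:
    """Total number of read bases for a given genome size and coverage depth."""
    return genome_size_bases * coverage


GENOME_SIZE = 680_000_000  # 680 Mb, as stated in the thread

reads_at_260x = total_bases(GENOME_SIZE, 260)
reads_at_80x = total_bases(GENOME_SIZE, 80)

print(f"260x: {reads_at_260x / 1e9:.1f} Gb")  # 260x: 176.8 Gb
print(f"80x:  {reads_at_80x / 1e9:.1f} Gb")   # 80x:  54.4 Gb
```

The 260x figure rounds to the 177 Gb quoted above, and the 80x figure is roughly 55 Gb.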

However, we have never tested or calibrated Shasta at such high coverage. It may be possible to assemble successfully at high coverage, but that would certainly require some tweaking of assembly parameters. When we have more than about 80x coverage, we instead increase the read length cutoff to reduce coverage to around 80x. You can do this manually by setting a read length cutoff, expressed in bases, with a command line option such as --Reads.minReadLength 20000. Alternatively, you can do it automatically with --Reads.desiredCoverage, for example --Reads.desiredCoverage 55Gb for about 80x in your case. That option automatically selects a read length cutoff that results in the specified amount of coverage being used. The assembly would then require a machine with about 400 GB of memory.
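To illustrate what picking a read length cutoff for a target amount of coverage means, here is a rough sketch in Python. It is not Shasta's implementation, and `choose_min_read_length` is a hypothetical name: it keeps the longest reads first until the requested number of bases is reached and reports the length of the shortest read kept.

```python
def choose_min_read_length(read_lengths, desired_bases):
    """Return the largest cutoff such that reads with length >= cutoff
    total at least desired_bases. Returns 0 (keep every read) if all
    reads together fall short of desired_bases."""
    kept = 0
    # Walk the reads from longest to shortest, accumulating bases.
    for length in sorted(read_lengths, reverse=True):
        kept += length
        if kept >= desired_bases:
            return length
    return 0


# Toy data: six reads, and we want to keep at least 70,000 bases.
lengths = [30000, 25000, 20000, 12000, 8000, 5000]
cutoff = choose_min_read_length(lengths, 70_000)
print(cutoff)
```

With real data, the resulting value would play the role of the number passed to --Reads.minReadLength.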

If you don't have enough memory even for this reduced coverage, you can still attempt an assembly if your machine has that amount of space on fast storage (SSD or NVMe; a conventional hard disk will not do). If you want to try that, use the following Shasta memory options:

--memoryMode filesystem --memoryBacking disk

These options do not require root privileges. However, they will slow down the assembly, perhaps intolerably, depending on the amount of memory your machine has. Therefore, before attempting this, try the default memory options first, and use the options above only if you run out of memory again despite the reduced coverage.

Also, you should not use default options. Shasta default assembly options are not necessarily optimized for any application, so you should instead use a Shasta configuration file consistent with your data. If your reads were generated using recent ONT hardware and software, you should use --config Nanopore-Sep2020.conf, adjusting the path to reflect the location of the configuration file. You will find this and other configuration files in the conf directory in the Shasta repository, or in the tar file for a release.
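Putting the suggestions in this thread together, a revised invocation might be assembled as in the sketch below. Every flag comes from the comments above; the file and directory names (`reads.fasta`, `ShastaRun`) and the helper `build_shasta_command` are placeholders for illustration, and the config path has to be adjusted to where the file actually lives.

```python
import shlex


def build_shasta_command(input_fasta, assembly_dir, config_path,
                         desired_coverage="55Gb", threads=24, on_disk=False):
    """Build a Shasta argument list using only the options named in this thread."""
    args = [
        "shasta",
        "--input", input_fasta,
        "--assemblyDirectory", assembly_dir,
        "--config", config_path,
        "--Reads.desiredCoverage", desired_coverage,
        "--threads", str(threads),
    ]
    if on_disk:
        # Fallback for machines without enough RAM; expect a slower run.
        args += ["--memoryMode", "filesystem", "--memoryBacking", "disk"]
    return args


# Placeholder paths; replace them with your own before running.
command = build_shasta_command("reads.fasta", "ShastaRun", "Nanopore-Sep2020.conf")
print(shlex.join(command))
```

Passing `on_disk=True` appends the two memory options, which makes sense only after the default memory options have failed at reduced coverage.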

Mondlii commented 3 years ago

Thank you for your response and help. The solution worked; because of the long queues on the server, the run has only just finished. Thank you very much.