jlab-code / MethylStar

A fast and robust pre-processing pipeline for bulk or single-cell whole-genome bisulfite sequencing (WGBS) data.
GNU General Public License v3.0

MethImpute run error #12

Closed anupullanhi closed 3 years ago

anupullanhi commented 3 years ago

Hello

I am running the MethylStar pipeline with mm10 as the reference. I got the error below:

[1] "It's first time you are running Methimpute for this data-set!"
Scanning for ambiguous nucleotides ... 263.42s
Extracting cytosines from forward strand ...Error: cannot allocate vector of size 1.8 Gb
Execution halted
Warning message:
system call failed: Cannot allocate memory
sort: cannot read: /results/methimpute-out/file-processed.lst: No such file or directory
./src/bash/methimpute.sh: line 12: [: too many arguments

I am running MethylStar in docker.

Any help regarding this error is appreciated. Thanks in advance.

shahryary commented 3 years ago

Hi @anupullanhi

Thank you for using the pipeline.

Methimpute uses an HMM algorithm for methylation status calling, and it is memory-hungry. I suggest running Methimpute on a system with sufficient RAM (how much is needed depends on the size of the genome file and of the CX-report file).
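Since the failure in the log above is a 1.8 Gb allocation inside a Docker container, it can help to check how much memory the container can actually use before starting Methimpute. The following is a minimal sketch, not part of MethylStar: it assumes a Linux container, reads `/proc/meminfo`, and looks at the standard cgroup limit files. The function names are hypothetical.

```python
import re
from pathlib import Path


def parse_meminfo(text):
    """Return /proc/meminfo fields that are reported in kB, converted to bytes."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^(\w+):\s+(\d+)\s*kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2)) * 1024
    return fields


def cgroup_limit_bytes():
    """Return the container memory limit in bytes, or None if unlimited or unknown."""
    candidates = [
        Path("/sys/fs/cgroup/memory.max"),                    # cgroup v2
        Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),  # cgroup v1
    ]
    for path in candidates:
        try:
            raw = path.read_text().strip()
        except OSError:
            continue
        if raw.isdigit():  # cgroup v2 writes "max" when there is no limit
            return int(raw)
    return None


def available_bytes(meminfo_text, limit=None):
    """Estimate usable memory: MemAvailable, capped by the cgroup limit if one is set."""
    info = parse_meminfo(meminfo_text)
    avail = info.get("MemAvailable", info.get("MemFree", 0))
    return min(avail, limit) if limit is not None else avail


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        gib = available_bytes(meminfo.read_text(), cgroup_limit_bytes()) / 2**30
        print(f"Approximately {gib:.1f} GiB available to this container")
```

If the reported figure is close to or below the size of the failed allocation, the container (or the host) needs more memory before Methimpute can process a genome of this size.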

Additionally, Methimpute has an option to restrict the run to a specific cytosine context. The model then uses only cytosines in that context, which reduces memory consumption. You can find it in the menu under "configuration part", then "methimpute", as the option "Run Context: All/ CG| CHG| CHH". Try running with, e.g., "CG" first to see how it goes; hopefully that solves your problem.
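To get a feel for how much a single context shrinks the input, here is a small sketch that counts or filters rows of a CX report by context. It is an illustration only and not how MethylStar applies the "Run Context" option internally. It assumes a Bismark-style, tab-separated CX report whose sixth column holds the context (CG, CHG or CHH); the function name and the file name in the usage note are hypothetical.

```python
def filter_cx_report(lines, context="CG"):
    """Yield only the rows whose context column (6th field) equals `context`.

    Assumes tab-separated rows laid out as:
    chromosome, position, strand, methylated count, unmethylated count,
    context, trinucleotide.
    """
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 6 and cols[5] == context:
            yield line


# Tiny in-memory example with one row per context.
example_rows = [
    "chr1\t3000827\t+\t4\t1\tCG\tCGT\n",
    "chr1\t3000830\t+\t0\t5\tCHG\tCAG\n",
    "chr1\t3000835\t-\t0\t6\tCHH\tCAA\n",
]
cg_rows = list(filter_cx_report(example_rows, "CG"))
```

On a real file, the same generator can be used in streaming fashion, for example `sum(1 for _ in filter_cx_report(open("sample.CX_report.txt"), "CG"))`, so the whole report never has to be loaded into memory at once.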

Finally, the Methimpute module is optional in our pipeline; after pre-processing the data, you can use any other methylation caller, or work with the CX reports directly.

Please let me know if you have any questions.

anupullanhi commented 3 years ago

Hello, I changed the Methimpute configuration to "CG" and got the error below.

[image]

Any help regarding this error is appreciated. Thanks in advance.

shahryary commented 3 years ago

Unfortunately, as I mentioned before, Methimpute is resource-hungry when the genome is large. Increasing the system RAM could resolve the problem, or you could use an alternative methylation caller instead of Methimpute. We will update this module in the next version of the pipeline and keep improving its performance.