Closed fmaumusINRA closed 9 months ago
Wow...that's intimidating. For larger runs (e.g., full genomes or large chromosomes) I would recommend breaking up the sequence (say, into 50 MB non-overlapping batches) and running them separately through RepeatMasker. The obvious disadvantage is that the result files then need to be merged and adjusted. I do have a Nextflow pipeline that I use for full-genome runs that does this automatically. If you're familiar with Nextflow and would like to give it a try, let me know.
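To make the batching idea concrete, here is a minimal sketch of how a long sequence could be tiled into non-overlapping 50 Mbp ranges, and how a position reported within one batch could be shifted back to chromosome coordinates when merging. This is only an illustration and not how RepeatMasker_Nextflow implements it; the names `batch_ranges` and `to_global` are hypothetical, and the 1,020,000,000 bp length is a made-up example.

```python
from typing import Iterator, Tuple

# Assumed batch size, taken from the "say 50MB" suggestion above.
BATCH_SIZE = 50_000_000


def batch_ranges(seq_len: int, batch_size: int = BATCH_SIZE) -> Iterator[Tuple[int, int]]:
    """Yield 0-based, half-open (start, end) ranges that tile a sequence without overlap."""
    for start in range(0, seq_len, batch_size):
        # The final batch is shortened so it never runs past the sequence end.
        yield start, min(start + batch_size, seq_len)


def to_global(batch_start: int, local_pos: int) -> int:
    """Shift a 1-based position within a batch back to 1-based chromosome coordinates.

    batch_start is the 0-based offset of the batch within the chromosome.
    """
    return batch_start + local_pos


# Hypothetical chromosome length of 1.02 Gbp, for illustration only.
ranges = list(batch_ranges(1_020_000_000))
print(len(ranges), ranges[0], ranges[-1])
```

Every annotation from a batch needs this offset applied to both its start and end before the per-batch outputs are concatenated, which is the "adjusting" part of the merge.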
Thanks a lot, Rob! Your nextflow pipeline would be wonderfully helpful! How can you share that?
Ok...I just made the GitHub project for it public: https://github.com/Dfam-consortium/RepeatMasker_Nextflow. Use the issue tracker on the RepeatMasker_Nextflow project if you have any questions.
Thank you so much, my PhD student should clone this today. All the best, Florian
Thank you very much, Robert. The Nextflow version allowed us to run on chromosomes over 1 Gbp.
For your information, in our hands, line 313 in script /RepeatMasker_Nextflow.nf:
    export PATH=${ucscToolsDir}/\$PATH
had to be changed to (note the added colon, which separates the tools directory from the existing PATH):
    export PATH=${ucscToolsDir}/:\$PATH
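As a small illustration of why the colon matters: PATH is a colon-separated list of directories on POSIX systems, so without the separator the tools directory and the first existing entry fuse into a single path that does not exist. The sketch below mimics the two string expansions in Python; `/opt/ucsc` and the starting PATH value are hypothetical placeholders, not values from the pipeline.

```python
from typing import List


def path_entries(path_value: str) -> List[str]:
    """Split a POSIX PATH string into its individual directory entries."""
    return path_value.split(":")


tools_dir = "/opt/ucsc"      # hypothetical stand-in for ${ucscToolsDir}
old_path = "/usr/bin:/bin"   # hypothetical stand-in for the existing $PATH

# Original line: directory and old PATH are concatenated with no separator.
broken = f"{tools_dir}/{old_path}"
# Corrected line: a colon separates the new directory from the old PATH.
fixed = f"{tools_dir}/:{old_path}"

print(path_entries(broken))
print(path_entries(fixed))
```

In the broken form the tools directory never appears as its own entry, and the first pre-existing entry is lost as well, so executables in either location would not be found.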
Kind regards, Florian
On Tue, 28 Nov 2023 at 02:28, Robert Hubley @.***> wrote:
Closed #226 https://github.com/rmhubley/RepeatMasker/issues/226 as completed.
-- Florian Maumus | INRAE http://www.inra.fr/en - URGI http://urgi.versailles.inra.fr/ | +33 1 30 83 31 74
Dear RepeatMasker friends,
We are having an issue where RepeatMasker never finishes the post-processing step. We are working with an assembly that contains chromosomes larger than 1 Gbp, and I am wondering whether there could be a size limit in the code that causes this issue.
Thanks a lot for your help, Florian