Dfam-consortium / RepeatModeler

De-Novo Repeat Discovery Tool

Docker and RepBase use #87

Open pedrobdfp opened 4 years ago

pedrobdfp commented 4 years ago

Hi!

I am trying to use RepBase as the default database for RepeatModeler / RepeatMasker, but I am running into an issue. I am currently using the Docker version of the software (dfam/tetools).

While following the instructions in step 3 here: , I run into this error:

cp: cannot create regular file '/opt/RepeatMasker/RepBaseRepeatMaskerEdition-20181026.tar.gz': Permission denied

"sudo" does not work either.

Is it impossible to use RepBase with a docker version of RepeatMasker/Modeler? Or is there another way to set up and have it use RepBase as default?

I would use Dfam but my previous run with it resulted in 100% of my repeats as Unclassified.

Thanks; Pedro

jebrosen commented 4 years ago

You are seeing this error because the contents of the docker container are read-only by default. There are some specific instructions for using RepBase RepeatMasker Edition with the Dfam TE Tools Container here, which should work for you: https://github.com/Dfam-consortium/TETools#using-repbase-repeatmasker-edition. Please let us know if those instructions work for you (or don't)!

pedrobdfp commented 4 years ago

I apologize, I completely missed this README. It worked fine with this!

Just one more question on running with RepBase. With this, RepeatModeler will NOT generate a -consensi.fa.classified file, and I will have to run RepeatClassifier, right?

How can I get RepeatClassifier to use RepBase in this scenario?

Thanks; Pedro

jebrosen commented 4 years ago

> With this, RepeatModeler will NOT generate a -consensi.fa.classified file, and I will have to run RepeatClassifier, right?

Ah, right! The instructions on that page won't help you as written, since RepeatModeler and RepeatClassifier do not accept a -libdir parameter that they would pass on to RepeatMasker.

But RepeatMasker also reads an environment variable for that configuration value. I have not tested it, but I think this should do what you want: LIBDIR=./Libraries RepeatModeler <parameters> or LIBDIR=./Libraries RepeatClassifier <parameters>

If you can confirm that the classification is actually different (hopefully, an improvement) when adding LIBDIR= to point to your custom library directory, that would be a big help. I will do some tests with this and add it to those instructions.
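For readers trying the same thing, the suggestion above can be wrapped in a small shell helper. This is only a sketch: `run_with_libdir` is a hypothetical name, not part of RepeatModeler, RepeatMasker, or the TE Tools container, and `./Libraries` is assumed to be the custom library directory prepared by following the TETools instructions. The helper checks that the directory exists and then sets `LIBDIR` for that one command only, which is the same effect as the `LIBDIR=./Libraries <command>` form.

```shell
#!/bin/sh
# Hypothetical helper (not part of RepeatModeler or TETools): run a command
# with LIBDIR pointing at a custom RepeatMasker library directory.
run_with_libdir() {
    libdir=$1
    shift
    # Check up front that the directory exists, so a typo in the path is
    # reported immediately instead of being discovered later.
    if [ ! -d "$libdir" ]; then
        echo "error: library directory '$libdir' not found" >&2
        return 1
    fi
    # env sets LIBDIR for this one command only; the calling shell's
    # environment is left unchanged.
    env LIBDIR="$libdir" "$@"
}

# Usage, with <parameters> standing in for your usual arguments:
#   run_with_libdir ./Libraries RepeatModeler <parameters>
#   run_with_libdir ./Libraries RepeatClassifier <parameters>
```

Because the variable is scoped to a single command, other tools started from the same shell keep using the container's default libraries unless they are launched through the helper as well.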

pedrobdfp commented 4 years ago

OK, thanks. I will let you know. My system has only 16 GB of RAM, so the RepeatModeler run will probably take a long time.

pedrobdfp commented 4 years ago

Update: The run finished and defining LIBDIR did work!