DerKevinRiehl / transposon_annotation_reasonaTE

Transposon annotation tool "reasonaTE" (part of TransposonUltimate)
GNU General Public License v3.0

How to get reasonaTE to use RepeatMasker? #19

Closed: soungalo closed this issue 1 year ago

soungalo commented 1 year ago

reasonaTE requires Python 2.7, whereas RepeatMasker (RM) requires Python 3. Is there a way to actually get reasonaTE to run RM? I couldn't figure out how they can live in the same conda env. This also matters because RepeatModeler lists RepeatMasker as a dependency.
Thanks!

DerKevinRiehl commented 1 year ago

Hello Lior Glick, thank you for your interest in TransposonUltimate.

Well, it depends on how you install RepeatMasker and RepeatModeler (e.g. as a conda package or plain).

Maybe you can install RepeatMasker & RepeatModeler just plain (so not as a conda package)?

Did you already find a solution?

Best, Kevin

soungalo commented 1 year ago

Yes, I found a solution, but it's rather hacky. Here's what I did:

  1. Created a conda env with all RepeatMasker and RepeatModeler dependencies, but without the packages themselves. Here's the env yml:
    name: repeatMaskerModeler
    channels:
    - conda-forge
    - bioconda
    dependencies:
    - python=3
    - perl=5
    - h5py
    - hmmer
    - trf
    - recon
    - repeatscout
    - ucsc-fatotwobit
    - ucsc-twobittofa
    - ucsc-twobitinfo
    - perl-json
    - perl-devel-size
    - perl-uri
    - perl-lwp-protocol-https
  2. Downloaded, extracted and configured RepeatMasker, pointing to the executables in the conda env.
  3. Same for RepeatModeler.
  4. Added RepeatMasker and RepeatModeler to $PATH.
  5. Modified the code of AnnotationCommander.py in the transposon_annotation_tools_env conda env (found under /share/TransposonAnnotator_reasonaTE):
    
    def runRepeatModeler(projectFolderPath, addCommand):
        # Stage the input next to the RepeatModeler working files.
        copyfile(os.path.join(projectFolderPath,"sequence.fasta"), os.path.join(projectFolderPath, "repeatmodel", "sequence.fasta"))
        env_path="~/miniconda3/envs/repeatMaskerModeler"
        # "conda run -p" executes the tool inside the separate Python 3 env.
        os.system("cd "+os.path.join(projectFolderPath,"repeatmodel")+" && conda run -p %s BuildDatabase -name sequence_index -engine ncbi sequence.fasta" % env_path)
        if(addCommand==""):
            os.system("cd "+os.path.join(projectFolderPath,"repeatmodel")+" && conda run -p %s RepeatModeler -engine ncbi -threads 10 -database sequence_index" % env_path)
        else:
            os.system("cd "+os.path.join(projectFolderPath,"repeatmodel")+" && conda run -p %s RepeatModeler -engine ncbi -database sequence_index " % env_path +addCommand)
        os.remove(os.path.join(projectFolderPath, "repeatmodel", "sequence.fasta"))

    def runRepeatMasker(projectFolderPath, addCommand):
        copyfile(os.path.join(projectFolderPath,"sequence.fasta"), os.path.join(projectFolderPath, "repMasker", "sequence.fasta"))
        env_path="~/miniconda3/envs/repeatMaskerModeler"
        if(addCommand==""):
            os.system("cd "+os.path.join(projectFolderPath,"repMasker")+" && conda run -p %s RepeatMasker -pa 10 sequence.fasta" % env_path)
        else:
            # Trailing space keeps addCommand from fusing with the file name.
            os.system("cd "+os.path.join(projectFolderPath,"repMasker")+" && conda run -p %s RepeatMasker sequence.fasta " % env_path + addCommand)
        os.remove(os.path.join(projectFolderPath, "repMasker", "sequence.fasta"))


I added `conda run` statements so the `repeatMaskerModeler` env can be used. Also, in `runRepeatModeler`, I had to change `-pa` to `-threads`, as the former is deprecated and causes an error with RepeatModeler v2.0.4.

Surprisingly enough, this actually worked (:  
But a more elegant solution would be welcome.
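
For anyone adapting this, here is a minimal sketch of a slightly tidier wrapper. It is not the reasonaTE implementation. It assumes the helper env lives at `~/miniconda3/envs/repeatMaskerModeler` as above, and the names `build_conda_command`, `run_in_env` and `run_repeat_masker` are hypothetical. It uses `subprocess.run` with `cwd=` instead of `os.system` with `cd ... &&`, so a failing tool raises an error instead of being silently ignored.

```python
import os
import shlex
import subprocess

# Assumption: location of the helper env from step 1; adjust to your install.
ENV_PATH = os.path.expanduser("~/miniconda3/envs/repeatMaskerModeler")


def build_conda_command(env_path, tool, tool_args, extra=""):
    """Return the argv list that runs `tool` inside the conda env at env_path."""
    # shlex.split turns a user-supplied string such as "-threads 4" into tokens;
    # an empty string yields an empty list.
    return ["conda", "run", "-p", env_path, tool] + list(tool_args) + shlex.split(extra)


def run_in_env(workdir, tool, tool_args, extra="", env_path=ENV_PATH):
    """Run the tool in workdir; raises CalledProcessError on a non-zero exit."""
    cmd = build_conda_command(env_path, tool, tool_args, extra)
    # cwd= replaces the "cd ... &&" prefix; check=True surfaces failures.
    subprocess.run(cmd, cwd=workdir, check=True)


def run_repeat_masker(project_folder, add_command=""):
    """Hypothetical stand-in for runRepeatMasker, without the copy/remove steps."""
    workdir = os.path.join(project_folder, "repMasker")
    if add_command == "":
        run_in_env(workdir, "RepeatMasker", ["-pa", "10", "sequence.fasta"])
    else:
        run_in_env(workdir, "RepeatMasker", ["sequence.fasta"], add_command)
```

Because no shell is involved, `~` is not expanded automatically, which is why the sketch calls `os.path.expanduser` on the env path. Building the command as a list also avoids the problem of an extra option running into the file name when strings are concatenated.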

DerKevinRiehl commented 1 year ago

Dear Lior Glick, happy to hear you found a solution.

Yes, we plan to introduce more elegant solutions in the next big update.

Thanks for letting us know. Best, Kevin