DerKevinRiehl / transposon_annotation_tools

A set of bioconda packages for transposon annotation. During my master's thesis I downloaded many of these tools, and I want to make it easier for people to install and run them.
GNU General Public License v3.0

mitetracker problem #1

Closed blavetn closed 3 years ago

blavetn commented 3 years ago

Hello, I had an issue with mitetracker. When it is installed from transposon_annotation_tools_env.yml, it is installed with Python 2.7, which is not correct according to the official repository.

Nevertheless, the problem does not occur when it is installed separately with conda install -c derkevinriehl transposon_annotation_tools_mitetracker (in that case Python 3.9 is installed).

But when I tried to run the command

mitetracker -g sequence.fasta -j job -w 3

I ran into the same problem in both of these files:

File "/opt/anaconda3/envs/mitetracker/bin/mitetrackerLIB/MITETracker.py"
File "/opt/anaconda3/envs/mitetracker/bin/mitetrackerLIB/findir.py"

import Queue
ModuleNotFoundError: No module named 'Queue'

Editing Queue to queue in the above scripts seems to have solved the problem, and mitetracker is now running. Also, after checking the official scripts, both contain: import queue

DerKevinRiehl commented 3 years ago

Dear blavetn, thank you very much for opening this issue.

You are right, "mitetracker" was initially written for Python 3 environments. However, I chose to package it as a Python 2 package so that it can run alongside all the other annotation tools in a Python 2 environment. You are also right that "import Queue" is Python 2 and needs to be "import queue" in Python 3.

Could you please explain exactly which issues you have when you use mitetracker installed from transposon_annotation_tools_env.yml? And what motivated you to choose a Python 3 environment?

Thank you very much, Best regards, Kevin

blavetn commented 3 years ago

Dear Kevin

When I run the reasonaTE command, I get this:

reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool mitetracker
Counting sequences: 2
Traceback (most recent call last):
  File "/opt/anaconda3/envs/transposon_annotation_tools_env/bin/mitetrackerLIB/MITETracker.py", line 105, in
    q = queue.Queue(maxsize=0)
NameError: name 'queue' is not defined
Counting sequences: 2
Traceback (most recent call last):
  File "/opt/anaconda3/envs/transposon_annotation_tools_env/bin/mitetrackerLIB/MITETracker.py", line 105, in
    q = queue.Queue(maxsize=0)
NameError: name 'queue' is not defined

This led me to check the GitHub repository of MITE Tracker and realize that it should run on Python 3. I then tried to install your mitetracker package on its own:

conda create -n mitetracker -c derkevinriehl transposon_annotation_tools_mitetracker

But when doing so, Python 3 is installed by default, so I got the import Queue problem, which I fixed by changing it to import queue. After the fix it works.

Best regards

Nicolas

DerKevinRiehl commented 3 years ago

Dear Nicolas, thank you very much for your observation :-). I just modified the Python scripts so that they can run on both Python 2 and Python 3, by changing the imports:

try:
    import Queue as queue 
except ImportError:
    import queue
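
For readers following along, here is a minimal sketch of how this fallback import behaves once it is in place. The drain function and the item names are hypothetical and are not taken from the MITE Tracker scripts; only the import pattern and the queue.Queue(maxsize=0) call come from this thread.

```python
# Python 2 names the module "Queue"; Python 3 renamed it to "queue".
# Binding either one to the name "queue" lets the rest of the script
# use queue.Queue(...) unchanged on both interpreters.
try:
    import Queue as queue  # Python 2
except ImportError:
    import queue  # Python 3


def drain(items):
    """Put items on an unbounded FIFO queue and read them back in order.

    Hypothetical helper for illustration only.
    """
    q = queue.Queue(maxsize=0)  # maxsize=0 means the queue has no size limit
    for item in items:
        q.put(item)
    out = []
    while not q.empty():
        out.append(q.get())
    return out


print(drain(["seq1", "seq2"]))
```

Because the alias is always the lowercase name, the NameError on queue.Queue seen under Python 2 and the ModuleNotFoundError on import Queue seen under Python 3 should both go away.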

Please do not hesitate to report any further issues if you find some. Thank you so much, hope reasonaTE will be a useful tool for you :-).

Best regards, Kevin

blavetn commented 3 years ago

I have a comment regarding RepeatMasker. With your command:

reasonaTE -mode annotate -projectFolder resonate -projectName chrB -tool repMasker

there is no way to choose a species or a custom database (the -species / -lib options of RepeatMasker), which is a problem because otherwise RepeatMasker assumes the default species is human.

I had other issues, but they probably come from the conda package of RepeatMasker: I had to configure the database myself. On this point, the fact that Python 2 was installed prevented me from doing it properly (a problem with the h5py module), so I had to install RepeatMasker independently (but I needed it anyway to configure the database).

As a general comment, I think it would be good to allow the user to add tool-specific parameters to complement the command that you provide. For example:

reasonaTE -mode annotate -projectFolder resonate -projectName chrB -tool repMasker ' -species "Caenorhabditis elegans" '

Regards

Nicolas

DerKevinRiehl commented 3 years ago

Dear Nicolas, thank you very much for your suggestion.

Yes, installing RepeatMasker and RepeatModeler using Conda is reported to cause complications for many users. Due to the complexity of these tools, we leave it to users to install them on their own systems.

My aim was to make the annotation tools easier to use, but I did not provide options for expert users to set specific parameters. I agree that setting specific parameters would be a very useful feature, so I implemented it based on your suggestion:

reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool tirvish xxxxx -mintsd 5

Users can now specify additional parameters after the "xxxxx" separator. In this example, the user sets the minimum length of the target site duplications considered by the annotation tool tirVish. Please find more details in the (just) updated documentation of reasonaTE here: https://github.com/DerKevinRiehl/transposon_annotation_reasonaTE. (See "How to use reasonaTE" > Step 2 > Option 3 + 4)
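
To illustrate the idea of a separator token, here is a minimal sketch of how an argument list could be split at "xxxxx". This is an assumption made for illustration and not reasonaTE's actual implementation; the function name split_tool_args is hypothetical.

```python
SEPARATOR = "xxxxx"  # separator token shown in the reasonaTE example above


def split_tool_args(argv):
    """Split argv into (wrapper_args, tool_args) at the first separator.

    Hypothetical helper: everything before the separator is treated as
    belonging to the wrapper, everything after it would be passed through
    to the underlying annotation tool.
    """
    if SEPARATOR in argv:
        idx = argv.index(SEPARATOR)
        return argv[:idx], argv[idx + 1:]
    return list(argv), []


wrapper_args, tool_args = split_tool_args(
    ["-mode", "annotate", "-tool", "tirvish", "xxxxx", "-mintsd", "5"]
)
print(wrapper_args)
print(tool_args)
```

A plain token that cannot collide with a real option name keeps the wrapper's own argument handling simple, since nothing after the separator needs to be interpreted by the wrapper.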

Please install the latest reasonaTE version from Anaconda and check this feature out :-).

Thank you very much for your useful suggestion. I look forward to hearing back from you soon with further feedback (sorry for closing this issue so quickly).

Best regards, Kevin

lry19990916 commented 1 year ago

Hello, the results of running MITE Tracker from this package and of running the original software are not the same. I ran both on the rice genome: the difference in the number of results is not significant, but the family clustering is very different. What is the reason for this? Li Ruiying