DerKevinRiehl / transposon_annotation_reasonaTE

Transposon annotation tool "reasonaTE" (part of TransposonUltimate)
GNU General Public License v3.0

NCBICDD1000 Hanging #28

Open kristina28 opened 5 months ago

kristina28 commented 5 months ago

Hello! First, thank you for this tool - it's been quite helpful to me in the past! However, as of about a month ago the NCBICDD1000 tool has stopped working for larger genomes, even genomes I've annotated successfully in the past (the test data still works). For the plant genomes I'm working on, the tool hangs indefinitely without completing any of the scan files, even with very high memory and core settings (far more than I needed in the past). I'm not sure what is wrong, as no error message is produced... Is there any environment setting or fasta file characteristic that the script requires? It's possible our computing admins have changed a setting on our server, so knowing what is required would be helpful for troubleshooting. Thanks again.

DerKevinRiehl commented 5 months ago

Dear Kristina, thanks for your interest in our tool and sorry to hear about your problem.

That is tricky to answer without further information, especially since the admins may have changed "something".

Are you using the exact conda environment setup you used before when it still worked?

Can you please show me the conda environment you are using now? To do so, open a terminal, activate the environment you are using, and type:

python --version
conda list

A general comment on this problem: some of the packages used may scale as O(n²) rather than O(n) with input size, so you might consider running the software on smaller fasta files (e.g. split the genome into two files, each holding about half of it).
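As a rough illustration of that splitting step, here is a minimal Python sketch. It is not part of reasonaTE; the function names (`read_fasta`, `split_records`, `write_fasta`, `split_fasta_file`) and the output file naming are hypothetical, and it assumes Python 3 and a plain multi-record fasta file. Whole records are kept intact and distributed so that each part holds a similar number of bases.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_records(records, parts=2):
    """Assign records to `parts` bins, largest first, balancing total length."""
    bins = [[] for _ in range(parts)]
    sizes = [0] * parts
    for header, seq in sorted(records, key=lambda r: len(r[1]), reverse=True):
        i = sizes.index(min(sizes))  # currently smallest bin
        bins[i].append((header, seq))
        sizes[i] += len(seq)
    return bins


def write_fasta(records, path, width=60):
    """Write records to `path`, wrapping sequence lines at `width` characters."""
    with open(path, "w") as out:
        for header, seq in records:
            out.write(">" + header + "\n")
            for i in range(0, len(seq), width):
                out.write(seq[i:i + width] + "\n")


def split_fasta_file(path, parts=2):
    """Split `path` into `parts` files named <path>.part1.fasta, ... (hypothetical naming)."""
    with open(path) as handle:
        records = list(read_fasta(handle))
    out_paths = []
    for n, chunk in enumerate(split_records(records, parts), start=1):
        out_path = "{}.part{}.fasta".format(path, n)
        write_fasta(chunk, out_path)
        out_paths.append(out_path)
    return out_paths

# Example call (file name is a placeholder): split_fasta_file("genome.fasta", parts=2)
```

Because whole records are kept together, a genome with one very large chromosome cannot be balanced evenly this way; in that case each chromosome could simply be written to its own file.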

Best, Kevin

kristina28 commented 4 months ago

Yeah, sorry for giving you so little information, I didn't have much myself which was why I was stuck. I don't believe the admins have changed any global default settings recently (I asked and they didn't mention anything) but I wasn't sure if there were certain minimums that needed to be met for things like Java heap size, memory, cores, or anything else along those lines, which I could double check were met.

I am using the exact conda environment as before.

The output of python --version:

Python 2.7.18 :: Anaconda, Inc

The output of conda list I've attached in a text file because it's pretty long.

reasonaTE-env-list.txt

I did try running the smallest chromosome as a separate fasta file to see if the smaller size would make a difference, but it also failed for NCBICDD1000 annotations.

Thanks for your help!

DerKevinRiehl commented 4 months ago

Dear Kristina, thanks for the information. At first glance, your environment looks pretty good. I am also wondering why you get stuck, even with a small genome.

Actually, NCBICDD1000 is a tool from https://github.com/DerKevinRiehl/transposon_annotation_tools. So, just to see what is going wrong, maybe you can run it directly, as shown in this example:

mkdir result
proteinNCBICDD1000 -fastaFile demo.fasta -resultFolder result 

Please use a simple demo.fasta, e.g. the one I provide or your own, and show me the console output and the contents of the result folder (in case it does not get stuck). How long do you wait for the tool to run, and how small is your "small" file in bp? Can you try the demo.fasta from https://github.com/DerKevinRiehl/transposon_annotation_tools ?
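To report the file size in bp, a short Python sketch along these lines could be used. It is only an illustration, not part of the tool; the function name `fasta_lengths` is hypothetical, and it assumes Python 3 and a plain-text (uncompressed) fasta file.

```python
def fasta_lengths(lines):
    """Return {sequence_name: length_in_bp} for an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0] if len(line) > 1 else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths

# Example call (file name is a placeholder):
# with open("demo.fasta") as handle:
#     lengths = fasta_lengths(handle)
# print(len(lengths), "sequences,", sum(lengths.values()), "bp in total")
```

The per-sequence lengths also show whether a single very long sequence dominates the file, which is worth mentioning alongside the total.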

Thanks for your reply, Best, Kevin