harta55 / EnTAP

Eukaryotic Non-Model Transcriptome Annotation Pipeline - Latest Release v1.4.0 - Revamped final graphics coming soon!
https://entap.readthedocs.io/en/latest/
GNU General Public License v3.0

EnTAP database locations in the Docker image #66

Open dence opened 10 months ago

dence commented 10 months ago

Hi, I have encountered an issue that might be related to another recently opened issue (#65). I saw the note in the documentation about the entap_config.ini included in the docker folder. According to that entap_config.ini, entap-db-bin should be located at /entap_outfiles/bin/entap_database.bin and the eggnog database should be located at /entap_outfiles/databases/eggnog.db. But when I do a fresh pull of the Docker image and check those locations, I get a "No such file or directory" error.

docker run plantgenomics/entap ls /entap_outfiles/bin/entap_database.bin
ls: cannot access '/entap_outfiles/bin/entap_database.bin': No such file or directory

Looking at the Dockerfile, I can't see where those files are created or copied over from somewhere else. Maybe I'm missing something?

Happy to contribute a PR if there's a preferred location or command to generate those files.
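For anyone who wants to repeat this check across several entries at once, here is a minimal sketch. It assumes the ini file uses plain `key=value` lines with `#` comments, which is an assumption about the file format rather than something confirmed in this thread. The helper names `read_ini_values` and `missing_paths` are hypothetical. The key `entap-db-bin` is the one mentioned above.

```python
import os


def read_ini_values(ini_text):
    """Parse simple key=value lines, skipping blanks and '#' comments.

    Assumption: entap_config.ini uses this flat layout.
    """
    values = {}
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_paths(ini_text, keys, exists=os.path.exists):
    """Return {key: path} for each requested key whose path is set but absent."""
    values = read_ini_values(ini_text)
    return {k: values[k] for k in keys if values.get(k) and not exists(values[k])}


if __name__ == "__main__":
    # Sample text only; read your own entap_config.ini instead.
    sample = "entap-db-bin=/entap_outfiles/bin/entap_database.bin\n"
    for key, path in missing_paths(sample, ["entap-db-bin"]).items():
        print(f"{key}: not found at {path}")
```

Run inside the container, this reports every configured path that does not exist there, which is the same check as the `ls` above.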

harta55 commented 10 months ago

Hey! You will need to run the EnTAP configuration (EnTAP --config). This will download the databases.

dence commented 10 months ago

Hi, that doesn't seem to work for me. I've done a clean install of EnTAP in a Docker image (following what I saw in your Dockerfile):

# Clone EnTAP repository and install requirements
RUN git clone https://github.com/harta55/EnTAP.git /tmp/entap && \
    cd /tmp/entap && \
    cmake CMakeLists.txt && \
    make && \
    make install && \
    mv src/entap_graphing.py /usr/local/bin/entap_graphing.py && \
    cd .. && \
    rm -rf /tmp/entap

And ran the "EnTAP --config" command in an interactive session in the docker container. This was the result:

root@25bea87e1040:/app# EnTAP --config
Error code: 14
Configuration file was not found and was generated for you, make sure to check the paths before continuing.
INI file was not found and is required for EnTAP execution, generated at: /app/entap_config.ini
root@25bea87e1040:/app# EnTAP --config
Error code: 10

EggNOG DIAMOND database was not found at: /usr/local/bin//bin/eggnog_proteins.dmnd
The DIAMOND test run failed.

Edit: I also tried the "EnTAP --config" on the plantgenomics/entap image and got the same result.

taprs commented 9 months ago

Bump! I have the same issue.

joeyjoe0111 commented 4 months ago

Bump! I have the same issue.

harta55 commented 4 months ago

Quick question: are you downloading the databases within the Docker image or outside of it? Depending on the location, you may have to modify the EnTAP configuration files accordingly.
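If the databases live outside the container, one common approach is to bind-mount the host directory so the paths in the ini file resolve inside the container. The sketch below only builds the `docker run` command line. The host directory `/data/entap_outfiles` is a hypothetical example, and the helper name `docker_run_with_mount` is made up for illustration. The `-v host:container` and `--rm` options are standard Docker flags.

```python
import shlex


def docker_run_with_mount(host_dir, container_dir, image, command):
    """Build a `docker run` argv that bind-mounts host_dir at container_dir."""
    return [
        "docker", "run", "--rm",
        "-v", f"{host_dir}:{container_dir}",
        image,
        *command,
    ]


if __name__ == "__main__":
    argv = docker_run_with_mount(
        "/data/entap_outfiles",   # hypothetical host directory holding the databases
        "/entap_outfiles",        # location referenced by the Docker entap_config.ini
        "plantgenomics/entap",
        ["EnTAP", "--config"],
    )
    # Print a copy-pasteable command instead of executing it.
    print(shlex.join(argv))
```

With a mount like this, files downloaded during configuration end up on the host and are still there the next time the container starts.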

kodai-kishino commented 3 months ago

I have the same issue.

I downloaded eggnog_proteins.dmnd and set its path in the entap_config.ini file, but EnTAP still does not recognise it.

Error code: 10

EggNOG DIAMOND database was not found at: /Folder/eggnog_proteins.dmnd
The DIAMOND test run failed.

harta55 commented 2 months ago

Which version are you using? I suggest updating to the latest release, as from what I'm seeing it should resolve your issue. Let me know if the issue persists.

Artifice120 commented 1 month ago

I think I figured it out, at least for the Singularity image.

First, open a shell in the container and run the initial configuration command:

singularity shell /path/to/entap.sif
EnTAP --config

You get an "error" message:

EnTAP config ini not found, generated at: /lustre/isaac/scratch/jtorre28/entap/entap_config.ini
EnTAP run parameter ini not found, generated at: /lustre/isaac/scratch/jtorre28/entap/entap_run.params
Error code: 14
Configuration file was not found and was generated for you, make sure to check the paths before continuing.

This first run actually generates the config files you need, so when you run the command again, just pass the paths to the files that were just created:

EnTAP --config --run-ini entap_run.params --entap-ini entap_config.ini

Hope this helps
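The two-step workaround above can be sketched as a small helper that reads the "generated at:" lines from the first run and builds the second command. The message wording is copied from the output quoted in this comment and may differ between EnTAP versions, so treat the regular expression as an assumption. The function name `second_run_command` is hypothetical.

```python
import re

# Matches the two "generated at:" lines printed by the first `EnTAP --config` run.
GENERATED = re.compile(
    r"^EnTAP (config ini|run parameter ini) not found, generated at: (.+)$"
)


def second_run_command(first_run_output):
    """Build the follow-up command from the first run's stdout."""
    paths = {}
    for line in first_run_output.splitlines():
        match = GENERATED.match(line.strip())
        if match:
            paths[match.group(1)] = match.group(2).strip()
    if len(paths) != 2:
        raise ValueError("expected both generated ini paths in the output")
    return [
        "EnTAP", "--config",
        "--run-ini", paths["run parameter ini"],
        "--entap-ini", paths["config ini"],
    ]
```

The returned list can be passed to `subprocess.run` from inside the container shell, or joined with spaces and pasted by hand.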

harta55 commented 1 month ago

Correct, you'll need to update the ini file with the paths of the downloaded databases. We'll look into updating the ini file automatically after configuration has run.

Artifice120 commented 1 month ago

To find the database and DIAMOND paths to add, I looked up the debug file, whose location is printed to stdout during the configuration run:

Parsing ini file at: entap_run.params
Parsing ini file at: entap_config.ini
ini files parsed, debug logging will continue at: /lustre/isaac/scratch/jtorre28/entap/entap_outfiles_solani2/debug_2024Y8M2D-10h20m22s.txt

In that example, the path to the debug file is /lustre/isaac/scratch/jtorre28/entap/entap_outfiles_solani2/debug_2024Y8M2D-10h20m22s.txt

At the bottom of this file there is a section listing all the required paths, as shown below:

rsem-sam-validator: /usr/local/bin//libs/RSEM-1.3.3//rsem-sam-validator
rsem-prepare-reference: /usr/local/bin//libs/RSEM-1.3.3//rsem-prepare-reference
convert-sam-for-rsem: /usr/local/bin//libs/RSEM-1.3.3//convert-sam-for-rsem
transdecoder-long-exe: TransDecoder.LongOrfs
transdecoder-predict-exe: TransDecoder.Predict
transdecoder-m: 100
transdecoder-no-refine-starts: false
diamond-exe: /usr/local/bin/diamond
taxon: Aulacorthum_solani
qcoverage: 50.000000
tcoverage: 50.000000
contam: bacteria,
e-value: 0.000010
uninformative: conserved,predicted,unknown,unnamed,hypothetical,putative,unidentified,uncharacterized,uncultured,uninformative,
ontology_source: 0,
eggnog-map-exe: emapper.py
eggnog-map-data: /lustre/isaac/scratch/jtorre28/entap/entap_outfiles/databases
eggnog-map-dmnd: /lustre/isaac/scratch/jtorre28/entap/entap_outfiles/databases/eggnog_proteins.dmnd
interproscan-exe: interproscan.sh
interproscan-db:
hgt-donor:
hgt-recipient:
hgt-gff:

------------------------------------------------------
EnTAP Database Configuration
------------------------------------------------------
Database written to: /lustre/isaac/scratch/jtorre28/entap/entap_outfiles_solani1//bin/entap_database.bin

------------------------------------------------------
DIAMOND Database Configuration
------------------------------------------------------
DIAMOND database skipped, exists at: /lustre/isaac/scratch/jtorre28/entap/entap_outfiles_solani1//bin/uniprot_trembl

------------------------------------------------------
EggNOG Database Configuration
------------------------------------------------------
EggNOG SQL Database skipped, exists at: /lustre/isaac/scratch/jtorre28/entap/entap_outfiles_solani1//databases/eggnog.db
EggNOG DIAMOND database skipped, exists at: /lustre/isaac/scratch/jtorre28/entap/entap_outfiles_solani1//bin/eggnog_proteins.dmnd

EnTAP has completed!
Total runtime (minutes): 3

For Singularity, the paths seem less intuitive, since the absolute paths don't seem to work.
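To avoid copying the paths by hand, the tail of the debug file can be parsed with a short script. This is a rough sketch that assumes the log keeps the `key: value` and `... written to:` / `... skipped, exists at:` layout shown above. The function name `parse_debug_tail` is hypothetical. Doubled slashes such as `solani1//bin` are collapsed with `posixpath.normpath`.

```python
import posixpath
import re

# "diamond-exe: /usr/local/bin/diamond" style lines; empty values are skipped.
KEY_VALUE = re.compile(r"^([A-Za-z][\w-]*):\s*(\S.*)$")
# "Database written to: ..." and "... skipped, exists at: ..." lines.
DB_LINE = re.compile(r"^(.*?)\s*(?:written to|skipped, exists at):\s*(\S+)$")


def parse_debug_tail(text):
    """Return (settings, databases) dicts extracted from debug log text."""
    settings, databases = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        db = DB_LINE.match(line)
        if db:
            # Label is the text before the verb, e.g. "EggNOG SQL Database".
            databases[db.group(1)] = posixpath.normpath(db.group(2))
            continue
        kv = KEY_VALUE.match(line)
        if kv:
            settings[kv.group(1)] = kv.group(2).strip()
    return settings, databases
```

The `databases` dict then holds the locations that need to be copied into entap_config.ini, keyed by the label used in the log.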