jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

Problem updating database location #54

Closed fconstancias closed 4 years ago

fconstancias commented 4 years ago

Dear Javier,

Thanks a ton for developing SqueezeMeta. I am trying to use SqueezeMeta to generate a gene catalog from wastewater metagenomes.

I followed the installation instructions, but when I ran the pipeline on a toy dataset I got the following error:

(squeezemeta) constancias@scelse:/datadrive05/Flo/EZ/NEW_assembly/tests_gene_catalog$ perl /datadrive05/Flo/tools/SqueezeMeta/scripts/SqueezeMeta.pl -m coassembly -p EZ-squeezemeta -s EZ_test_sqeezeMeta.samples -f sub --nobins --nomaxbin --nometabat -t 10 

SqueezeMeta v1.0.0 - (c) J. Tamames, F. Puente-Sánchez CNB-CSIC, Madrid, SPAIN

Please cite: Tamames & Puente-Sanchez, Frontiers in Microbiology 9, 3349 (2019). doi: https://doi.org/10.3389/fmicb.2018.03349

Run started Mon Jan 13 16:47:14 2020 in coassembly mode
Now creating directories
Reading configuration from /datadrive05/Flo/EZ/NEW_assembly/tests_gene_catalog/EZ-squeezemeta/SqueezeMeta_conf.pl
7 samples found: ESMetFM01 ESMetFM47 ESMetFM37 ESMetFM09 ESMetFM17 ESMetFM26 ESMetFM06
Now merging files
[0 seconds]: STEP1 -> RUNNING CO-ASSEMBLY: 01.run_assembly.pl (megahit)
  Running assembly with megahit

  Renaming contigs
  Counting length of contigs
  Contigs stored in /datadrive05/Flo/EZ/NEW_assembly/tests_gene_catalog/EZ-squeezemeta/results/01.EZ-squeezemeta.fasta
  Number of contigs: 86293
[16 minutes, 7 seconds]: STEP2 -> RNA PREDICTION: 02.rnas.pl
  Running barrnap (Seeman 2014, Bioinformatics 30, 2068-9) for predicting RNAs:  Bacteria[17:03:21] Can't find database: /datadrive04/db/sqeezeMeta/db/test_data/-h/db/bac.hmm
Error running command:    /datadrive05/Flo/tools/SqueezeMeta/bin/barrnap --quiet --threads 10 --kingdom bac --reject 0.1 /datadrive05/Flo/EZ/NEW_assembly/tests_gene_catalog/EZ-squeezemeta/intermediate/02.EZ-squeezemeta.maskedrna.fasta --dbdir /datadrive04/db/sqeezeMeta/db/test_data/-h/db > /datadrive05/Flo/EZ/NEW_assembly/tests_gene_catalog/EZ-squeezemeta/temp/bac.gff at /datadrive05/Flo/tools/SqueezeMeta/scripts/02.rnas.pl line 54.
Stopping in STEP2 -> 02.rnas.pl
Died at /datadrive05/Flo/tools/SqueezeMeta/scripts/SqueezeMeta.pl line 663.

Checking the error revealed that I had made a mistake when setting up the databases. I wanted to see the help for the download_databases.pl script, so I ran download_databases.pl -h, and it generated another database at the location /datadrive04/db/sqeezeMeta/db/test_data/-h/db.
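The path above suggests that the script treated `-h` as the destination directory rather than as a help flag. A minimal sketch of a guard that could be run before calling the script is shown below; `validate_db_dir` is a hypothetical helper written for this illustration, not part of SqueezeMeta.

```shell
# Hypothetical helper (not part of SqueezeMeta): refuse arguments that are
# empty or that look like command-line options, so that something like "-h"
# is never passed on as a database directory.
validate_db_dir() {
    case "$1" in
        "")
            echo "error: no database directory given" >&2
            return 1
            ;;
        -*)
            echo "error: '$1' looks like an option, not a directory" >&2
            return 1
            ;;
    esac
    return 0
}

# Example use (paths taken from this thread):
# validate_db_dir /datadrive04/db/sqeezeMeta/db/ && \
#     perl /datadrive05/Flo/tools/SqueezeMeta/scripts/preparing_databases/download_databases.pl /datadrive04/db/sqeezeMeta/db/
```

The guard only inspects the argument string; it does not check whether the directory exists or is writable.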

Then I removed this folder and tried to set up the databases again.

I got these errors:

perl /datadrive05/Flo/tools/SqueezeMeta/scripts/preparing_databases/download_databases.pl /datadrive04/db/sqeezeMeta/db/         
rm: cannot remove '/datadrive04/db/sqeezeMeta/db/test.tar.gz': No such file or directory
rm: cannot remove '/datadrive05/Flo/tools/SqueezeMeta/lib/classifier.tar.gz': No such file or directory
rm: cannot remove '/datadrive04/db/sqeezeMeta/db/SqueezeMetaDB.tar.gz': No such file or directory

Downloading and unpacking test data...

I then got the same error again when running the pipeline, because the script is still looking for the database in the previous location (/datadrive04/db/sqeezeMeta/db/test_data/-h/db).

(squeezemeta) constancias@scelse:/datadrive05/Flo/EZ/NEW_assembly/tests_gene_catalog$ perl /datadrive05/Flo/tools/SqueezeMeta/scripts/SqueezeMeta.pl -m coassembly -p EZ-squeezemeta -s EZ_test_sqeezeMeta.samples -f sub --nobins --nomaxbin --nometabat -t 10

Thanks in advance for your help.

fpusan commented 4 years ago

Hi!

Can you send us the /datadrive05/Flo/tools/SqueezeMeta/scripts/SqueezeMeta_conf.pl file?

fconstancias commented 4 years ago

I see ... $databasepath = "/datadrive04/db/sqeezeMeta/db/test_data/-h/db";

SqueezeMeta_conf.pl.txt

fconstancias commented 4 years ago

Actually, I get the following error when I try to run the download_databases.pl script.

db/DB_BUILD_DATE
db/selected_marker_sets.tsv
db/bacar_marker.hmm
db/hmms/
db/hmms/phylo.hmm
db/hmms/checkm.hmm.ssi
db/hmms/checkm.hmm
db/hmms/phylo.hmm.ssi
db/img/
db/img/img_metadata.tsv
db/euk.hmm
db/LCA_tax/
db/LCA_tax/parents.db
db/LCA_tax/taxid.db
db/LCA_tax/parents.txt
db/pfam/
db/pfam/Pfam-A.hmm.dat
db/pfam/tigrfam2pfam.tsv
db/taxdb.btd

Updating configuration... Died at /datadrive05/Flo/tools/SqueezeMeta/scripts/preparing_databases/download_databases.pl line 57.

I guess this happens when $databasepath is not updated.

fconstancias commented 4 years ago

It seems that manually updating the $databasepath solved my problem.
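For anyone hitting the same issue, the manual fix can be sketched roughly as follows. This assumes the configuration line has the form `$databasepath = "...";` as quoted above; `show_databasepath` and `set_databasepath` are hypothetical helper names used only for illustration, and the new path must not contain `|` or `&` because it is pasted into a sed expression.

```shell
# Hypothetical helper: print the current $databasepath line of a config file.
show_databasepath() {
    grep '^\$databasepath' "$1"
}

# Hypothetical helper: point $databasepath at a new directory.
# A copy of the original file is kept next to it as <file>.bak.
set_databasepath() {
    conf="$1"
    new_path="$2"
    cp "$conf" "$conf.bak" || return 1
    # Rewrite only the line that assigns $databasepath; other lines are untouched.
    sed 's|^\$databasepath[[:space:]]*=.*|$databasepath = "'"$new_path"'";|' \
        "$conf.bak" > "$conf"
}

# Example use (config location as mentioned earlier in this thread):
# show_databasepath /datadrive05/Flo/tools/SqueezeMeta/scripts/SqueezeMeta_conf.pl
# set_databasepath  /datadrive05/Flo/tools/SqueezeMeta/scripts/SqueezeMeta_conf.pl /datadrive04/db/sqeezeMeta/db/db
```

The directory in the last line is only a guess at where the databases ended up; check where the `db` folder was actually unpacked before editing the file. A project created before the change has its own SqueezeMeta_conf.pl in the project directory (see the "Reading configuration from" line in the log), which may also need the same update.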

jtamames commented 4 years ago

Very nice to hear that. Thanks a lot for your feedback!


fpusan commented 4 years ago

This is a nice step forward, but you will need a second file that is not being generated by the script because it dies at line 57. Without that file, SqueezeMeta should run fine until step 18, but die when trying to run checkm.

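Here are two quick questions to try to troubleshoot the issue:

Does the directory /datadrive05/Flo/tools/SqueezeMeta/lib/checkm/ exist?
If so, does your user have writing permissions on it?

If both are true, can you try the following?

And then send me the download_databases.log file?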

fconstancias commented 4 years ago

Does the directory /datadrive05/Flo/tools/SqueezeMeta/lib/checkm/ exist?

yes

If so, does your user have writing permissions on it?

yes
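Both checks can be done from the shell. The snippet below is a small sketch; `check_writable_dir` is a hypothetical helper name, and the directory in the example is the one asked about above.

```shell
# Hypothetical helper: report whether a directory exists and whether the
# current user can write to it.
# Returns 0 if both hold, 1 if the directory is missing, 2 if it is not writable.
check_writable_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 2
    fi
    echo "ok: $dir"
}

# Example use:
# check_writable_dir /datadrive05/Flo/tools/SqueezeMeta/lib/checkm/
```

Note that `[ -w ... ]` reflects the permissions of the user running the check, so it should be run as the same user that runs download_databases.pl.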

perl /datadrive05/Flo/tools/SqueezeMeta/scripts/preparing_databases/download_databases.pl /datadrive04/db/sqeezeMeta/test2 > download_databases.log 2>&1

Should I keep the script going?

Please find below the output from when I killed it with Ctrl+C.

download_databases.log

fpusan commented 4 years ago

Yes please, keep it going until it dies or finishes.

fpusan commented 4 years ago

Did it work in the end? If so, I'll close the issue.

fconstancias commented 4 years ago

Hi @fpusan. Sorry, but it stopped because my workspace on the cluster is full.
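A quick way to check free space on the target filesystem before restarting the download is sketched below. `avail_kb` and `has_space` are hypothetical helper names, and the required size in the example is a placeholder rather than the real size of the SqueezeMeta databases.

```shell
# Hypothetical helper: print the space available (in KiB) on the filesystem
# that holds the given directory, using the portable output format of df.
avail_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Hypothetical helper: succeed if at least REQUIRED_KB KiB are free.
# Usage: has_space DIR REQUIRED_KB
has_space() {
    [ "$(avail_kb "$1")" -ge "$2" ]
}

# Example use (REQUIRED_KB is a placeholder to be set by the user):
# REQUIRED_KB=1000000
# has_space /datadrive04/db/sqeezeMeta/test2 "$REQUIRED_KB" || echo "not enough free space"
```

This only looks at the filesystem itself; a per-user quota on a shared cluster would have to be checked with the cluster's own tools.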