flass / pantagruel

a pipeline for reconciliation of phylogenetic histories within a bacterial pangenome
GNU General Public License v3.0

InterproScan was not found on this machine: cannot run this (facultative) task of Pantagruel pipeline; exit now #33

Closed by mattbawn 4 years ago

mattbawn commented 4 years ago

Hi Florent,

To bypass the previous issue, I ran Pantagruel without providing annotation and got as far as task 04, where it stopped with the error:

InterproScan was not found on this machine: cannot run this (facultative) task of Pantagruel pipeline; exit now

I tried to bypass the InterproScan step by running:

pantagruel -i database/environ_pantagruel_database.sh TASK5 TASK6 TASK7 TASK8 TASK9

but the job failed without reporting any error.

We then installed InterProScan in the Pantagruel Singularity container, but I again got:

[2019-12-11 00:59:54] Pantagruel pipeline task 3: initiate SQL database and load genomic object relationships.
Create new task folder '/nbi/Research-Groups/IFR/Rob-Kingsley/R134_Pantagruel/Un_annotaed_go/database/03.database'
currently set variables:
database=/nbi/Research-Groups/IFR/Rob-Kingsley/R134_Pantagruel/Un_annotaed_go/database/03.database dbname=database metadata=/nbi/Research-Groups/IFR/Rob-Kingsley/R134_Pantagruel/Un_annotaed_go/database/00.input_data/genome_infos/assembly_metadata assemblyinfo=/nbi/Research-Groups/IFR/Rob-Kingsley/R134_Pantagruel/Un_annotaed_go/database/00.input_data/genome_infos/assembly_info protali=/nbi/Research-Groups/IFR/Rob-Kingsley/R134_Pantagruel/Un_annotaed_go/database/02.gene_alignments protfamseqtab=/nbi/Research-Groups/IFR/Rob-Kingsley/R134_Pantagruel/Un_annotaed_go/database/01.seqdb/protein_families/all_proteomes.nr.mmseqs_clusterdb_default_clusters_fasta.tab protorfanclust=PANTAGP000000 cdsorfanclust=PANTAGC000000 usergenomeinfo=/nbi/Research-Groups/IFR/Rob-Kingsley/R134_Pantagruel/Un_annotaed_go/genomes/strain_infos_database.txt usergenomefinalassdir=/nbi/Research-Groups/IFR/Rob-Kingsley/R134_Pantagruel/Un_annotaed_go/database/00.input_data/genbank-format_assemblies gp2ass=/nbi/Research-Groups/IFR/Rob-Kingsley/R134_Pantagruel/Un_annotaed_go/database/00.input_data/genomesource_assemblyid_assemblyname.txt
/nbi/Research-Groups/IFR/Rob-Kingsley/R134_Pantagruel/Un_annotaed_go/database/03.database
Pantagruel pipeline task 3: complete.

--2019-12-11 00:59:55--  http://www.uniprot.org/docs/speclist
Resolving www.uniprot.org (www.uniprot.org)... 128.175.245.185, 193.62.192.81
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 01:04:11--  (try: 2)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 01:08:27--  (try: 3)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 01:12:45--  (try: 4)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 01:17:03--  (try: 5)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 01:21:23--  (try: 6)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 01:25:43--  (try: 7)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 01:30:05--  (try: 8)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 01:34:27--  (try: 9)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 01:38:51--  (try:10)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 01:43:15--  (try:11)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 01:47:40--  (try:12)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 01:52:04--  (try:13)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 01:56:29--  (try:14)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 02:00:53--  (try:15)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 02:05:18--  (try:16)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 02:09:42--  (try:17)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 02:14:06--  (try:18)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 02:18:31--  (try:19)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Retrying.

--2019-12-11 02:22:55--  (try:20)  http://www.uniprot.org/docs/speclist
Connecting to www.uniprot.org (www.uniprot.org)|128.175.245.185|:80... failed: Connection timed out.
Connecting to www.uniprot.org (www.uniprot.org)|193.62.192.81|:80... failed: Connection timed out.
Giving up.

Traceback (most recent call last):
  File "/opt/software/pantagruel/scripts/pantagruel_sqlitedb_genome_populate.py", line 343, in <module>
    raise ValueError, "specified input file '%s' cannot be found"%nf
ValueError: specified input file '/nbi/Research-Groups/IFR/Rob-Kingsley/R134_Pantagruel/Un_annotaed_go/database/03.database/speclist' cannot be found
Error: no such column: code
Error: no such column: code
Error: no such column: cds_code
Traceback (most recent call last):
  File "/opt/software/pantagruel/scripts/genbank2code_fastaseqnames.py", line 43, in <module>
    pool.map(genbank2code, iter((nfinfa, transnames, dirout, queue) for nfinfa in lnfinfa))
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 253, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 572, in get
    raise self._value
KeyError: 'C37_04661'
Traceback (most recent call last):
  File "/opt/software/pantagruel/scripts/genbank2code_fastaseqnames.py", line 43, in <module>
    pool.map(genbank2code, iter((nfinfa, transnames, dirout, queue) for nfinfa in lnfinfa))
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 253, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 572, in get
    raise self._value
KeyError: 'C37_04661'
ERROR: Pantagruel pipeline task 4: failed.
[2019-12-11 02:27:17] Pantagruel pipeline task 4: use InterProScan to functionally annotate proteins in the database.
Create new task folder '/nbi/Research-Groups/IFR/Rob-Kingsley/R134_Pantagruel/Un_annotaed_go/database/04.functional'
InterproScan was not found on this machine: cannot run this (facultative) task of Pantagruel pipeline; exit now

Can you please tell me how Pantagruel calls InterproScan?

Thanks,

Matt

flass commented 4 years ago

Hi Matt,

There are several things to consider here:

I believe the later errors in the Python scripts follow from that.
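
To illustrate the download failure seen in the log above (wget retries www.uniprot.org 20 times over roughly an hour and a half before the population script fails on the missing speclist file), here is a minimal Python 3 sketch of a fail-fast fetch. It is not part of Pantagruel: the function name `fetch_or_report` and the 30-second timeout are assumptions made for illustration, and whether the pipeline would reuse a file fetched this way is not confirmed in this thread.

```python
import shutil
import urllib.error
import urllib.request


def fetch_or_report(url, dest, timeout=30):
    """Download `url` to `dest`; return True on success, False on failure.

    Hypothetical helper: makes a single attempt with a timeout
    instead of retrying for a long time.
    """
    try:
        # The destination file is only opened once the connection succeeded.
        with urllib.request.urlopen(url, timeout=timeout) as resp, \
                open(dest, "wb") as out:
            shutil.copyfileobj(resp, out)
    except (urllib.error.URLError, OSError) as err:
        print("could not fetch %s: %s" % (url, err))
        return False
    return True
```

Calling `fetch_or_report("http://www.uniprot.org/docs/speclist", "speclist")` from a node with the same network restrictions should report the problem within about a minute, which would point to a network issue (for instance a cluster node without outbound access) rather than to the Python scripts themselves.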

flass commented 4 years ago

By the way, is this error about InterProScan at all? After installing InterProScan, you seem to have run task 3; InterProScan annotation is task 4. Could you please update the thread regarding the InterProScan error once you have properly tested task 4? Note that it is not really an error, just a message saying that you can't use that (facultative!) task when you don't have the underlying software.
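
A quick way to check whether the underlying software is visible from inside the container is to look for its launcher on the search path. The sketch below assumes InterProScan's usual launcher script name, `interproscan.sh`; that Pantagruel finds it through `PATH` is an assumption here, not something stated in this thread, and `find_launcher` is a hypothetical helper name.

```python
import shutil


def find_launcher(name="interproscan.sh", path=None):
    """Return the full path of `name` if an executable with that name
    is found on the search path (default: the PATH variable), else None."""
    return shutil.which(name, path=path)


location = find_launcher()
if location is None:
    print("interproscan.sh is not on PATH in this environment")
else:
    print("interproscan.sh found at", location)
```

Running this inside the container (rather than on the host) matters, since the container may see a different `PATH` from the login shell.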

Also, you got me interested by saying you have a Singularity container set up for Pantagruel. Have you used the Dockerfile set up by @pveber here: https://github.com/flass/pantagruel/blob/master/etc/Dockerfile ?

Or did you set up your own? Is it a proper Singularity recipe? If the latter, could you please circulate the recipe? It would be very valuable to other users, and to me: I admit I could not find the time to get started on this, and I would really appreciate having a basis to work from for releasing a stable installation (finally!).

flass commented 4 years ago

The Dockerfile has now been fully tested, and the image generated from it properly supports running Pantagruel (except task 04, which needs InterProScan installed on the side). This is all documented in the INSTALL doc. A static Docker image will soon be released on Docker Hub.