sanger-pathogens / companion

This repository has been archived; the currently maintained version is at https://github.com/iii-companion/companion
http://companion.sanger.ac.uk
ISC License

error from update_references.lua #80

Open xinliu005 opened 6 years ago

xinliu005 commented 6 years ago

@satta While running /home/xin/.nextflow/assets/sanger-pathogens/companion/bin/update_references.lua I keep getting the following error message: "tool './bin/update_references.lua' not found; option -help lists possible tools". What is the problem?

satta commented 6 years ago

Just to debug: Have you tried calling the tool by using its full path? I.e.

/home/xin/.nextflow/assets/sanger-pathogens/companion/bin/update_references.lua

instead of however you called it before (so no relative paths)? Also, what GenomeTools (gt) version are you using? Can you try calling the tool as:

gt /home/xin/.nextflow/assets/sanger-pathogens/companion/bin/update_references.lua

and see what happens?

xinliu005 commented 6 years ago

Thanks for your prompt reply.

xin@compare-vm-1:/analysis/xin/parasite/bin$ gt /home/xin/.nextflow/assets/sanger-pathogens/companion/bin/update_references.lua
tool '/home/xin/.nextflow/assets/sanger-pathogens/companion/bin/update_references.lua' not found; option -help lists possible tools
xin@compare-vm-1:/analysis/xin/parasite/bin$ ls -l /home/xin/.nextflow/assets/sanger-pathogens/companion/bin/update_references.lua
-rwxrwxr-x 1 xin xin 14960 Jun 4 17:17 /home/xin/.nextflow/assets/sanger-pathogens/companion/bin/update_references.lua

gt -version
gt (GenomeTools) 0.6.5 (2018-06-04 10:46:00)
Copyright (c) 2003-2007 Gordon Gremme gremme@zbh.uni-hamburg.de
Copyright (c) 2003-2007 Center for Bioinformatics, University of Hamburg
See LICENSE file or http://genometools.org/license.html for license details.

Used compiler: gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
Compile flags: -Wall -Os -I/homes/xin/Downloads/genometools-0.6.5/src -I/homes/xin/Downloads/genometools-0.6.5/obj

satta commented 6 years ago

Thanks! First of all, please install a newer GenomeTools version. The one you seem to be using is quite old and does not support some of the features the Lua script needs to run. GenomeTools is currently at version 1.5.10 (http://genometools.org/pub/).
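To confirm which GenomeTools version is on the PATH before retrying, a small shell check may help. This is only a sketch under a few assumptions: the first line of the gt -version output has the form "gt (GenomeTools) X.Y.Z (date)" as in the output quoted in this thread, sort -V (a GNU coreutils extension) is available, and 1.5.10 is used as the comparison target only because it is the release mentioned above, not because it is a documented minimum. The helper names are made up for illustration.

```shell
# Hypothetical helpers, not part of Companion or GenomeTools.

# Extract the version number from the first line of `gt -version` output,
# assuming the layout "gt (GenomeTools) X.Y.Z (date)".
parse_gt_version() {
    awk 'NR == 1 { print $3 }'
}

# Succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), which is a GNU coreutils extension.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: compare the installed gt against the release mentioned above.
if command -v gt >/dev/null 2>&1; then
    installed=$(gt -version | parse_gt_version)
    if version_ge "$installed" "1.5.10"; then
        echo "gt $installed looks recent enough"
    else
        echo "gt $installed is older than 1.5.10; consider upgrading"
    fi
fi
```

If several gt binaries are installed, running "command -v gt" shows which one the shell actually picks up.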

xinliu005 commented 6 years ago

Many thanks for your support!

xinliu005 commented 6 years ago

@satta I am considering downloading the reference genomes from ftp://ftp.sanger.ac.uk/pub/project/pathogens/companion/CryptoDB.org/. There are references.json and references-in.json files, but no config file. Can they be used directly with the command "nextflow run sanger-pathogens/companion", or does update_references.lua still need to be run to import the reference data?

satta commented 6 years ago

update_references.lua is only needed to build new references from basic sequence+annotation files. The pre-compiled files should be usable directly by downloading them and using their location in the Companion workflow's config file as the value of the ref_dir variable. For example:

ref_dir = "/path/to/my/companion/CryptoDB.org"
ref_species = "Cryptosporidium_parvum_Iowa_II"

etc.
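Before pointing ref_dir at a downloaded directory, it may be worth checking that it looks like a reference directory. The sketch below assumes that Companion expects a references.json file directly inside ref_dir (the file mentioned earlier in this thread) and that the species name appears somewhere in that file. The species check is a plain-text search, not a JSON parse, and check_ref_dir is a made-up helper name.

```shell
# Hypothetical helper: rough sanity check of a downloaded reference directory.
# $1 = candidate ref_dir, $2 = ref_species name to look for.
check_ref_dir() {
    ref_dir=$1
    ref_species=$2
    if [ ! -d "$ref_dir" ]; then
        echo "not a directory: $ref_dir" >&2
        return 1
    fi
    if [ ! -f "$ref_dir/references.json" ]; then
        echo "references.json not found in $ref_dir" >&2
        return 1
    fi
    # Plain-text search only; this does not validate the JSON structure.
    if ! grep -qF "$ref_species" "$ref_dir/references.json"; then
        echo "$ref_species not mentioned in references.json" >&2
        return 1
    fi
    echo "ok: $ref_dir ($ref_species)"
}
```

For the example above, this would be called as: check_ref_dir /path/to/my/companion/CryptoDB.org Cryptosporidium_parvum_Iowa_II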

xinliu005 commented 6 years ago

There is WEIGHT_FILE = "" in the config file. Is this field required? If so, how can I get the weight file for, e.g., Cryptosporidium_parvum_Iowa_II?

satta commented 6 years ago

If you don't have a kinetoplastid genome (or anything else with weird polycistronic stuff) just use the plasmodium weight file.

xinliu005 commented 6 years ago

@satta Got the following error while running companion:

xin@compare-vm-1:/analysis/xin/parasite/bin$ ./nextflow run sanger-pathogens/companion -profile /analysis/xin/parasite/data/companion/CryptoDB.org/Cryptosporidium_parvum_Iowa_II.config.txt
N E X T F L O W  ~  version 0.29.1
Launching `sanger-pathogens/companion` [clever_dalembert] - revision: db91c7dc11 [master]
Unknown configuration profile: '/analysis/xin/parasite/data/companion/CryptoDB.org/Cryptosporidium_parvum_Iowa_II.config.txt'

Is there a problem with the config file? /analysis/xin/parasite/data/companion/CryptoDB.org/Cryptosporidium_parvum_Iowa_II.config.txt is attached: Cryptosporidium_parvum_Iowa_II.config.txt

satta commented 6 years ago

Without looking at the file, shouldn't it be -c and not -profile? Example:

nextflow run -c Cryptosporidium_parvum_Iowa_II.config.txt sanger-pathogens/companion -profile docker

xinliu005 commented 6 years ago

Thanks! But I got the following error:

.command.stub: line 45: ps: command not found
ZOE ERROR (from /usr/lib/snap/snap): error opening file (/usr/share/snap/Zoe/HMM/snap.hmm)

1) ps is available on my server:
xin@compare-vm-1:~/.nextflow/assets/sanger-pathogens/companion$ which ps
/bin/ps
2) There is NO /usr/share/snap/ on my server, and I am not root, so even if I install snap, I cannot put it into /usr/share/.

Full output follows:

nextflow run sanger-pathogens/companion -c /analysis/xin/parasite/data/Cryptosporidium_parvum_Iowa_II.config.txt -profile docker
N E X T F L O W ~ version 0.30.0
Launching sanger-pathogens/companion [loving_torricelli] - revision: db91c7dc11 [master]

C O M P A N I O N ~ version 1.0.2
query : /analysis/xin/parasite/working_dir/assembly/scaffolds.fasta
reference : Cryptosporidium_parvum_Iowa_II
reference directory : /analysis/xin/parasite/data/companion/CryptoDB.org
WARN: Access to undefined parameter dist_dir -- Initialise it to a default value eg. params.dist_dir = some_value

[warm up] executor > local
WARN: The operator first is useless when applied to a value channel which returns a single value by definition -- check channel ncrna_cmindex
WARN: Access to undefined parameter TRANSCRIPT_FILE -- Initialise it to a default value eg. params.TRANSCRIPT_FILE = some_value
[f3/bf5099] Submitted process > press_ncRNA_cms
WARN: The operator first is useless when applied to a value channel which returns a single value by definition -- check channel pseudochr_last_index
[84/96dc71] Submitted process > truncate_input_headers
[43/f15da4] Submitted process > exonerate_empty_hints
[8f/8cf2d2] Submitted process > ratt_make_ref_embl
[29/28e63b] Submitted process > transcript_empty_hints
[6c/084117] Submitted process > pseudogene_indexing
[3f/6b4fbf] Submitted process > make_ref_input_for_orthomcl
[df/a31d6e] Submitted process > sanitize_input
[11/47014d] Submitted process > merge_hints
[9c/5c455f] Submitted process > contiguate_pseudochromosomes
[d6/8c6913] Submitted process > predict_tRNA
[26/634d2a] Submitted process > make_distribution_seqs
[59/5e1a05] Submitted process > run_augustus_contigs
[f8/310cfd] Submitted process > run_snap
WARN: Access to undefined parameter print_paths -- Initialise it to a default value eg. params.print_paths = some_value
[33/a13d6b] Submitted process > run_ratt
ERROR ~ Error executing process > 'run_snap'

Caused by: Process run_snap terminated with an error exit status (2)

Command executed:

echo '##gff-version 3' > snap.gff3
snap -gff -quiet snap.hmm pseudo.pseudochr.fasta > snap.tmp
snap_gff_to_gff3.lua snap.tmp > snap.tmp.2
if [ -s 1 ]; then gt gff3 -sort -tidy -retainids snap.tmp.2 > snap.gff3; fi

Command exit status: 2

Command output: (empty)

Command error:
.command.stub: line 45: ps: command not found
ZOE ERROR (from /usr/lib/snap/snap): error opening file (/usr/share/snap/Zoe/HMM/snap.hmm)
ZOE library version 2013-02-16

Work dir: /home/xin/.nextflow/assets/sanger-pathogens/companion/work/f8/310cfde2451caac375658810e06b5f

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

-- Check '.nextflow.log' file for details
WARN: Killing pending tasks (2)

xinliu005 commented 6 years ago

Which of the following steps are required to annotate parasite genomes: run_exonerate, run_snap, run_ratt, do_contiguation, do_circos, do_pseudo, make_embl, use_reference, fix_polycistrons, truncate_input_headers?

satta commented 6 years ago

That depends on what you want as output and what features you want in your annotation:

Please keep in mind that some of the options only work when supported by the reference (e.g. do_contiguation) or by some of the tools used in the workflow (e.g. run_snap).

satta commented 6 years ago

As for the ps part: Nextflow jobs are run in a Docker container, so you don't need to have all the dependent software for Companion installed. No idea why SNAP won't run, but as I stated above, just disable it for now. There are no pre-generated gene models for Cryptosporidium in the container anyway; it's optimized for kinetoplastids at the moment. Using SNAP would probably need some extra development work and preparation of the correct models.

xinliu005 commented 6 years ago

@satta Many thanks for your reply. Unfortunately, another error appeared:

Command error:
.command.stub: line 45: ps: command not found
warning: line 1 in file "stdin" does not begin with "##gff-version" or "##gvf-version", create "##gff-version 3" line automatically
gt gff3: error: line 1 in file "stdin" does not contain 9 tab (\t) separated fields

Full output follows:

N E X T F L O W ~ version 0.30.0
Launching sanger-pathogens/companion [furious_khorana] - revision: db91c7dc11 [master]

C O M P A N I O N ~ version 1.0.2
query : /analysis/xin/parasite/working_dir/assembly/scaffolds.fasta
reference : Cryptosporidium_parvum_Iowa_II
reference directory : /analysis/xin/parasite/data/companion/CryptoDB.org
WARN: Access to undefined parameter dist_dir -- Initialise it to a default value eg. params.dist_dir = some_value

[warm up] executor > local
WARN: The operator first is useless when applied to a value channel which returns a single value by definition -- check channel ncrna_cmindex
WARN: Access to undefined parameter TRANSCRIPT_FILE -- Initialise it to a default value eg. params.TRANSCRIPT_FILE = some_value
[2a/16e905] Submitted process > truncate_input_headers
[6a/b6c1d4] Submitted process > press_ncRNA_cms
WARN: The into operator should be used to connect two or more target channels -- consider to replace it with .set { integrated_gff3_processed }
WARN: The operator first is useless when applied to a value channel which returns a single value by definition -- check channel pseudochr_last_index
WARN: The operator first is useless when applied to a value channel which returns a single value by definition -- check channel core_comp_circos_chr
WARN: The operator first is useless when applied to a value channel which returns a single value by definition -- check channel core_comp_circos_bin
[72/d923fb] Submitted process > exonerate_empty_hints
[b2/c73b4e] Submitted process > ratt_make_ref_embl
[d2/cc41d4] Submitted process > transcript_empty_hints
[18/0c296c] Submitted process > make_empty_snap
[fb/ecaf05] Submitted process > pseudogene_indexing
[da/bbc4fe] Submitted process > make_ref_input_for_orthomcl
[b3/b03478] Submitted process > make_empty_circos_clusters
[d6/97a69b] Submitted process > sanitize_input
[9b/3998bf] Submitted process > merge_hints
[5a/f383f6] Submitted process > contiguate_pseudochromosomes
[6b/2acfa2] Submitted process > predict_tRNA
[f9/673055] Submitted process > make_distribution_seqs
[42/f3900c] Submitted process > run_augustus_contigs
WARN: Access to undefined parameter print_paths -- Initialise it to a default value eg. params.print_paths = some_value
[1a/c7a5d1] Submitted process > run_ratt
[e7/1cb383] Submitted process > pseudogene_last (1)
[db/23430c] Submitted process > predict_ncRNA (1)
[f9/b71129] Submitted process > run_augustus_pseudo
[7a/7c4e7f] Submitted process > pseudogene_last (2)
[41/630edd] Submitted process > pseudogene_last (3)
[79/90216d] Submitted process > predict_ncRNA (2)
[6b/6e8ed1] Submitted process > blast_for_circos
[3f/1f0634] Submitted process > ratt_to_gff3
[06/6b3916] Submitted process > merge_genemodels
[07/69c9f8] Submitted process > integrate_genemodels
[32/219bc3] Submitted process > remove_exons
[a3/ff1c95] Submitted process > pseudogene_calling (1)
[d1/4c488b] Submitted process > merge_ncrnas (1)
[25/1057d1] Submitted process > merge_structural (1)
[57/68f989] Submitted process > add_gap_features (1)
[b0/ffb5db] Submitted process > split_splice_models_at_gaps (1)
[14/7891f7] Submitted process > add_polypeptides (1)
ERROR ~ Error executing process > 'add_polypeptides (1)'

Caused by: Process add_polypeptides (1) terminated with an error exit status (1)

Command executed:

create_polypeptides.lua input.gff3 "CM0004(%w+)" | gt gff3 -sort -retainids -tidy > output.gff3

Command exit status: 1

Command output: (empty)

Command error:
.command.stub: line 45: ps: command not found
warning: line 1 in file "stdin" does not begin with "##gff-version" or "##gvf-version", create "##gff-version 3" line automatically
gt gff3: error: line 1 in file "stdin" does not contain 9 tab (\t) separated fields

Work dir: /home/xin/work/14/7891f764130fb085fb9af9c0251d8f

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

-- Check '.nextflow.log' file for details
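The gt gff3 message above means that the first line arriving on its standard input was not a nine-column, tab-separated GFF3 feature line. One rough way to see which lines of a file or stream have that problem is an awk filter like the sketch below. It is a generic check, not part of Companion; check_gff3_fields is a made-up name, and the filter does not understand a trailing ##FASTA section, whose sequence lines it would also report.

```shell
# Hypothetical helper: report lines that are neither comments/directives nor
# empty and that do not have exactly 9 tab-separated fields.
# Reads the files given as arguments, or standard input if there are none.
# Exits with status 1 if at least one offending line was found.
check_gff3_fields() {
    awk -F '\t' '
        !/^#/ && NF > 0 && NF != 9 {
            printf "%d: %s\n", NR, $0
            bad = 1
        }
        END { exit bad }
    ' "$@"
}
```

In the failing work directory, the output of the first half of the pipeline could be piped into it, for example: create_polypeptides.lua input.gff3 "CM0004(%w+)" | check_gff3_fields. This assumes it is run somewhere the Lua script and gt are available, such as inside the container.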

satta commented 6 years ago

Looks like it got much further this time :) Difficult to debug without access to the data though... can you provide the contents of your work directory so I can understand what is the matter with the GFF3? If you want, you can give me only /home/xin/work/14/7891f764130fb085fb9af9c0251d8f, but please make sure all files symlinked from outside this directory are included.
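Nextflow work directories typically contain symlinks to inputs staged from other task directories, so a plain archive would only carry the links. One possible way to bundle a single work directory together with the files its symlinks point to is tar with the -h option, which stores the link targets instead of the links. This is a sketch; pack_workdir and the archive name are made up for illustration.

```shell
# Hypothetical helper: archive one work directory, following symlinks so the
# linked input files are stored as regular files inside the archive.
# $1 = work directory, $2 = output archive (.tar.gz)
pack_workdir() {
    workdir=$1
    archive=$2
    # -h makes tar dereference symlinks; -C keeps paths in the archive short.
    tar -czhf "$archive" -C "$(dirname "$workdir")" "$(basename "$workdir")"
}
```

For the directory above, this would be: pack_workdir /home/xin/work/14/7891f764130fb085fb9af9c0251d8f workdir.tar.gz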

xinliu005 commented 6 years ago

There are only 2 files under ~/work/14/7891f764130fb085fb9af9c0251d8f, and one of them is empty:

lrwxrwxrwx 1 xin xin 64 Jun 7 19:01 input.gff3 -> /home/xin/work/b0/ffb5db1c4a5a8d99ba967179826917/merged_out.gff3
-rw-r--r-- 1 xin xin 0 Jun 7 19:01 output.gff3

The symlink target /home/xin/work/b0/ffb5db1c4a5a8d99ba967179826917/merged_out.gff3 is attached: merged_out.gff3.txt

satta commented 6 years ago

OK there are indeed annotations in there. I'll need to redo the part of Companion that failed for you, which will take some time as I am not working full time on Companion anymore.

xinliu005 commented 6 years ago

@satta ">>I'll need to redo the part of Companion that failed for you" How is it going? We are trying to build a parasite analysis pipeline and hope to use Companion as the annotation tool. Thanks.

ybdong919 commented 4 years ago

lua5.3 ../../bin/update_references.lua
lua5.3: ../../bin/update_references.lua:20: attempt to index a nil value (global 'gt')
stack traceback:
../../bin/update_references.lua:20: in main chunk
[C]: in ?

ybdong919 commented 4 years ago

When I run "../../bin/update_references.lua", I get the following error: "gt: error: could not execute script ../../bin/update_references.lua:268: bad argument #1 to 'pairs' (table expected, got nil)"