mattb112885 / clusterDbAnalysis

ITEP - Integrated Toolkit for Exploration of microbial Pan-genomes

Unknown error running setup_step4.sh #63

Closed jcthrash closed 10 years ago

jcthrash commented 10 years ago

I've been able to get most of the database building steps completed (as far as I can tell) and began running the final step with setup_step4.sh. It successfully downloads and builds the CDD database I believe, but then errors out somewhere during the RPSBLAST. Here are the final lines of the output:

....
4200K .......... .......... .......... .......... .......... 94% 58.5M 0s
4250K .......... .......... .......... .......... .......... 96% 41.9M 0s
4300K .......... .......... .......... .......... .......... 97% 1.52M 0s
4350K .......... .......... .......... .......... .......... 98% 8.89M 0s
4400K .......... .......... .......... .......... .......... 99% 32.3M 0s
4450K .......... .......... .... 100% 73.1M=1.5s

2014-04-11 11:57:19 (2.86 MB/s) - “cddid.tbl.gz” saved [4581750]

Compiling the CDD RPSBLAST database...
rpsblast -query faa/684719.1.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/684719.1.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/54526.4.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/54526.4.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/54526.3.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/54526.3.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/939316.1.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/939316.1.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/939306.1.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/939306.1.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/309857.1.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/309857.1.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/54526.2.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/54526.2.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/980517.1.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/980517.1.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/54526.1.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/54526.1.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/335992.1.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/335992.1.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/744985.1.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/744985.1.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/314261.1.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/314261.1.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/198252.1.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/198252.1.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/439493.1.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/439493.1.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/939324.1.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/939324.1.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/28211.1.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/28211.1.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
rpsblast -query faa/939346.1.txt.faa -db cd_db/Cdd.pn -out rpsblast_res/939346.1.txt.faa_rpsout -outfmt 6 -evalue 1.000000e-05
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
[rpsblast] ERROR: Invalid argument: -query
Job = [faa/744985.1.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/54526.3.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/439493.1.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/198252.1.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/335992.1.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/28211.1.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/939346.1.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/939324.1.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/54526.2.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/314261.1.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/54526.4.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/939316.1.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/309857.1.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/684719.1.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/980517.1.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/54526.1.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Job = [faa/939306.1.txt.faa, cd_db/Cdd.pn, rpsblast_res/, 1e-05] completed
Completed Task = singleRpsBlast
cat: rpsblast_res/*: No such file or directory


At this stage, the rpsblast_res/ directory exists, but is empty.
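For anyone reading this later: the log above reports every job as "completed" even though each rpsblast call printed an error, so the failure only becomes visible at the final cat. A minimal Python sketch of a wrapper that stops on the first failed command is below. The helper name run_checked is hypothetical and is not part of ITEP, and the sketch assumes the failing rpsblast exits with a non-zero status, which this thread does not confirm.

```python
import subprocess


def run_checked(cmd):
    """Run cmd (a list of arguments) and raise if it exits non-zero.

    Hypothetical helper, not ITEP code. Assumes the wrapped tool signals
    failure through its exit status.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            "command failed (exit %d): %s\n%s"
            % (result.returncode, " ".join(cmd), result.stderr.strip())
        )
    return result.stdout


# Example call, mirroring one command from the log above:
# run_checked(["rpsblast", "-query", "faa/684719.1.txt.faa",
#              "-db", "cd_db/Cdd.pn",
#              "-out", "rpsblast_res/684719.1.txt.faa_rpsout",
#              "-outfmt", "6", "-evalue", "1e-05"])
```

With a check like this, the run would halt at the first "Invalid argument" error instead of continuing to an empty rpsblast_res/ directory.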

Can you help with this?

mattb112885 commented 10 years ago

Do you have an up to date RPSBLAST? (They changed the syntax relatively recently, I think as of 2.26, but didn't change the name of the program. The setup scripts are meant to be used with the newer version since the NCBI files on the FTP site are compatible with the new one and not the old one). My help text looks like this:

$ rpsblast -h
USAGE
  rpsblast [-h] [-help] [-import_search_strategy filename]
    [-export_search_strategy filename] [-db database_name]
    [-dbsize num_letters] [-gilist filename] [-seqidlist filename]
    [-negative_gilist filename] [-entrez_query entrez_query]
    [-query input_file] [-out output_file] [-evalue evalue]
    [-word_size int_value] [-xdrop_ungap float_value]
    [-xdrop_gap float_value] [-xdrop_gap_final float_value]
    [-searchsp int_value] [-max_hsps_per_subject int_value]
    [-seg SEG_options] [-soft_masking soft_masking]
    [-culling_limit int_value] [-best_hit_overhang float_value]
    [-best_hit_score_edge float_value] [-window_size int_value]
    [-lcase_masking] [-query_loc range] [-parse_deflines]
    [-outfmt format] [-show_gis] [-num_descriptions int_value]
    [-num_alignments int_value] [-html] [-max_target_seqs num_sequences]
    [-num_threads int_value] [-remote] [-comp_based_stats compo]
    [-use_sw_tback] [-version]

DESCRIPTION
  Reverse Position Specific BLAST 2.2.28+

mattb112885 commented 10 years ago

Also - note that you can do most analysis (just not looking at conserved domain architecture like PFams, COGs) without running the 4th step. So whether you want to take the time to run it depends on what you need to look at.

Matt

jcthrash commented 10 years ago

Sorry for the delay. I appreciate the suggestion with simply not using the downstream features. I'd like to get it all working, however, b/c I think the domain architecture comparisons will be key to understanding many poorly annotated proteins in the group of organisms I'm working on.

I believe you have caught the problem: I have rpsblast v. 2.2.22. I'll upgrade that, try again, and see how that works.
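As a side note, the two versions mentioned in this thread (2.2.22 installed, 2.2.28+ in the help text) can be compared programmatically. The Python sketch below pulls a three-part version number out of a string and compares it against 2.2.28. The function names and the KNOWN_GOOD constant are illustrative assumptions; the thread does not establish the exact release where the syntax changed, so 2.2.28 is used only because it is a version reported here as having the new syntax.

```python
import re

# Assumption: 2.2.28 is a version shown in this thread with the new syntax,
# not necessarily the earliest release that has it.
KNOWN_GOOD = (2, 2, 28)


def parse_blast_version(text):
    """Return the first X.Y.Z version found in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_at_least_known_good(text):
    """True if text contains a version that is >= KNOWN_GOOD."""
    version = parse_blast_version(text)
    return version is not None and version >= KNOWN_GOOD


print(is_at_least_known_good("Reverse Position Specific BLAST 2.2.28+"))
print(is_at_least_known_good("rpsblast v. 2.2.22"))
```

Comparing tuples of integers avoids the string-comparison trap where "2.2.9" would sort after "2.2.28".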

Thank you!!!

jcthrash commented 10 years ago

I've confirmed this did fix the problem: using 2.2.28+ allowed the setup_step4.sh script to run to completion.

mattb112885 commented 10 years ago

Thanks for the confirmation. I think I can add some code to warn people about this problem when checking for dependencies.
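One possible shape for such a warning, as a rough Python sketch rather than the check that actually went into ITEP: look for rpsblast on the PATH, ask it for help, and warn if the help text never mentions -query. The function names are hypothetical, and the sketch assumes an older rpsblast either fails on -h or prints help without -query; that behaviour is not verified in this thread.

```python
import shutil
import subprocess


def help_mentions_query(help_text):
    """True if the help text lists the -query option used by the setup scripts."""
    return "-query" in help_text


def check_rpsblast(executable="rpsblast"):
    """Return a warning string if rpsblast looks unusable, else None.

    Hypothetical dependency check, not ITEP code.
    """
    path = shutil.which(executable)
    if path is None:
        return "%s not found on PATH" % executable
    try:
        result = subprocess.run(
            [path, "-h"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return "could not run %s -h: %s" % (path, exc)
    if not help_mentions_query(result.stdout + result.stderr):
        return (
            "%s does not appear to accept -query; "
            "a newer rpsblast may be required" % path
        )
    return None


warning = check_rpsblast()
if warning is not None:
    print("WARNING: " + warning)
```

Checking for the option itself, rather than a version number, sidesteps the question of exactly which release changed the syntax.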