bg7 / BG7

bacterial genome annotation system
bg7.ohnosequences.com

Current master has issues #22

Open mscook opened 12 years ago

mscook commented 12 years ago

Hi Guys -

In bin/bg7 -

# MJSC - Typo here. /jars/ not jar
# MJSC - Typo - Should be BG7.jar
# ugly hack, executions.xml should be a param
cp $BG7_HOME/jars/BG7.jar $output_folder/

# run bg7!
# I will add memory conf through the script later on, no time for this now
echo "running bg7 now!"
java -d64 -Xmx6G -Xms1G -jar $output_folder/BG7.jar

# clean up
rm -f $output_folder/BG7.jar

After fixing those I get this (std.err and std.out are piped separately, sorry) -

java.io.FileNotFoundException: Annotation_test_PredictedGenes.xml (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at java.io.FileReader.<init>(FileReader.java:72)
    at com.era7.bioinfo.annotation.RemoveDuplicatedGenes.main(RemoveDuplicatedGenes.java:88)
    at com.era7.bioinfo.annotation.RemoveDuplicatedGenes.execute(RemoveDuplicatedGenes.java:47)
    at com.era7.lib.bioinfo.bioinfoutil.ExecuteFromFile.main(ExecuteFromFile.java:66)
    at com.era7.bioinfo.annotation.BG7.main(BG7.java:32)
java.io.FileNotFoundException: Annotation_test_NoDuplicates.xml (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at java.io.FileReader.<init>(FileReader.java:72)
    at com.era7.bioinfo.annotation.SolveOverlappings.main(SolveOverlappings.java:90)
    at com.era7.bioinfo.annotation.SolveOverlappings.execute(SolveOverlappings.java:49)
    at com.era7.lib.bioinfo.bioinfoutil.ExecuteFromFile.main(ExecuteFromFile.java:66)
    at com.era7.bioinfo.annotation.BG7.main(BG7.java:32)
Jan 13, 2012 5:36:30 PM com.era7.bioinfo.annotation.GenerateFastaFiles main
SEVERE: null
java.io.FileNotFoundException: Annotation_test_SolvedOverlaps.xml (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at java.io.FileReader.<init>(FileReader.java:72)
    at com.era7.bioinfo.annotation.GenerateFastaFiles.main(GenerateFastaFiles.java:86)
    at com.era7.bioinfo.annotation.GenerateFastaFiles.execute(GenerateFastaFiles.java:58)
    at com.era7.lib.bioinfo.bioinfoutil.ExecuteFromFile.main(ExecuteFromFile.java:66)
    at com.era7.bioinfo.annotation.BG7.main(BG7.java:32)

java.io.FileNotFoundException: Annotation_test_SolvedOverlaps.xml (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at java.io.FileReader.<init>(FileReader.java:72)
    at com.era7.bioinfo.annotation.FillDataFromUniprot.main(FillDataFromUniprot.java:67)
    at com.era7.bioinfo.annotation.FillDataFromUniprot.execute(FillDataFromUniprot.java:48)
    at com.era7.lib.bioinfo.bioinfoutil.ExecuteFromFile.main(ExecuteFromFile.java:66)
    at com.era7.bioinfo.annotation.BG7.main(BG7.java:32)

logging your params to /home/uqmstan1/bg7_test/bg7_example_input_files/out/params.log
it looks like this machine has 16 cores, will use that for blast settings
creating RNAs blast db
RNA blast db created: /home/uqmstan1/bg7_test/bg7_example_input_files/out/Annotation_test.rnas.db
if you want, you can check makeblastdb log: /home/uqmstan1/bg7_test/bg7_example_input_files/out/Annotation_test.rnas.db.log
running blastn: RNAs vs genome sequence
done! will do a basic results check now
checking if blast xml output file looks ok...
done! everything looks ok
creating blast db with genome sequences
genome blast db created: /home/uqmstan1/bg7_test/bg7_example_input_files/out/Annotation_test.genome.db
you can check makeblastdb log: /home/uqmstan1/bg7_test/bg7_example_input_files/out/Annotation_test.genome.db.log
running tblastn: proteins vs genome sequence
done! will do a basic results check now
checking if blast xml output file looks ok...
done! everything looks ok
Now I will create the executions.xml file that will drive bg7 execution
you can take a look at it, if you want: /home/uqmstan1/bg7_test/bg7_example_input_files/out/executions.xml
running bg7 now!
This program expects six parameters:

  1. BlastOutput XML filename
  2. Contigs FNA filename
  3. Output results XML filename
  4. Maximum gene length (integer)
  5. Flag (boolean) indicating if this genome corresponds to a virus (true/false)
  6. Dif span (integer)
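The six-parameter usage message above can be turned into a quick pre-flight check. What follows is only a minimal sketch: `check_predictgenes_args` is a hypothetical helper, not part of bin/bg7 or BG7.jar, and it assumes the two integer parameters are non-negative whole numbers, which the usage message does not actually state.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of bin/bg7): checks a parameter list against
# the six-parameter usage message before it is handed to the Java step.
check_predictgenes_args() {
    if [ "$#" -ne 6 ]; then
        echo "expected 6 parameters, got $#" >&2
        return 1
    fi
    # Parameter 4: maximum gene length (assumed to be a non-negative integer)
    case "$4" in
        ''|*[!0-9]*) echo "maximum gene length must be an integer: '$4'" >&2; return 1 ;;
    esac
    # Parameter 5: virus flag, must be the literal string true or false
    case "$5" in
        true|false) ;;
        *) echo "virus flag must be true or false: '$5'" >&2; return 1 ;;
    esac
    # Parameter 6: dif span (assumed to be a non-negative integer)
    case "$6" in
        ''|*[!0-9]*) echo "dif span must be an integer: '$6'" >&2; return 1 ;;
    esac
    return 0
}
```

Calling it with placeholder file names, for example `check_predictgenes_args blast.xml contigs.fna out.xml 400 false 30`, returns 0, while dropping the last value makes it print "expected 6 parameters, got 5" and return 1. The numeric values here are illustrative only, not recommended settings.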

Contents of my output directory -

Annotation_test_GenBankExternalData.xml
Annotation_test.genome.db.log
Annotation_test.genome.db.nhr
Annotation_test.genome.db.nin
Annotation_test.genome.db.nsq
Annotation_test_proteins_tBLASTn.xml
Annotation_test_ReferenceProteins.fasta
Annotation_test_RNA_blastn.xml
Annotation_test.rnas.db.log
Annotation_test.rnas.db.nhr
Annotation_test.rnas.db.nin
Annotation_test.rnas.db.nsq
Annotation_test_sequences.fna
BG7.jar
consoleSolapamientos.txt
console.txt
executions.xml
genetic_code.txt
params.log
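Comparing this listing with the stack traces shows that none of the intermediate XML files the later steps look for were ever written. A small sketch of how that comparison could be automated; `report_missing` is a hypothetical helper, not something bin/bg7 provides:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of bin/bg7): prints every path that is not a
# regular file and returns non-zero if at least one is missing.
report_missing() {
    local missing=0
    local f
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}
```

Run from the output directory, `report_missing Annotation_test_PredictedGenes.xml Annotation_test_NoDuplicates.xml Annotation_test_SolvedOverlaps.xml` would print one "missing:" line for each file that is absent. The three file names are the ones named in the FileNotFoundException messages; since the first step's output is already absent, the first step is the place to look.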

It looks like BG7.jar should be given arguments, but the shell script (bg7 in bin/) does not provide them.

Any ideas?

Cheers

Mitch

eparejatobes commented 12 years ago

Hi Mitch,

Sorry for the delay, I'll take a look at this ASAP.

mscook commented 12 years ago

Any updates on this?

Cheers

Mitch

lf3045 commented 11 years ago

Any updates on this?

larsius commented 11 years ago

This is an old issue; is it still not solved? I ran into the same trouble but got a step further. The bg7 script creates executions.xml with instructions for the next steps. There, 5 arguments are listed for PredictGenes.jar, while PredictGenes.jar actually needs 6. The "dif span" is missing; I could not find out what it actually is and would really like to know. Also, the arguments for maximum gene length and the virus flag are set arbitrarily (400, true), which is not ideal when annotating a bacterial genome. I changed all this and still got an error from PredictGenes.jar, including when running it separately after FixFastaHeaders and the bg7 script for BLAST. Here is the error message:

Reading fna file...
java.lang.StringIndexOutOfBoundsException: String index out of range: 0
    at java.lang.String.charAt(String.java:694)
    at com.era7.bioinfo.annotation.PredictGenes.main(PredictGenes.java:147)
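In Java, `String.charAt(0)` throws exactly this exception when the string is empty, so one possible cause (an assumption; nothing in this thread confirms it) is an empty line in the .fna input. A minimal sketch for checking that, where `empty_fasta_lines` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the line number of every completely empty line
# in the given FASTA file, one number per line. Prints nothing if none exist.
empty_fasta_lines() {
    # grep -n prefixes each match with "N:", cut keeps only the number
    grep -n '^$' "$1" | cut -d: -f1
}
```

If `empty_fasta_lines your_contigs.fna` prints any line numbers, removing those lines and rerunning would show whether this guess applies. Empty output only rules out this one cause.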

larsius commented 11 years ago

Even with the EHEC test dataset I ran into trouble. I ran FixFastaHeaders and the script for BLASTing, and corrected executions.xml for max gene length, virus flag and dif span (30), but it still gets stuck in the PredictGenes part. Here's the on-screen output:

java -d64 -Xmx6G -Xms1G -jar bg7.jar
Reading fna file...
Done!! :)
Calculating complementary inverted sequences....
Done!
Parsing blastoutput XML file
Done!
Iterations size: 146293
Iteration sp has 1hits
java.lang.ArrayIndexOutOfBoundsException: 1
    at com.era7.bioinfo.annotation.PredictGenes.main(PredictGenes.java:208)
    at com.era7.bioinfo.annotation.PredictGenes.execute(PredictGenes.java:46)
    at com.era7.lib.bioinfo.bioinfoutil.ExecuteFromFile.main(ExecuteFromFile.java:66)
    at com.era7.bioinfo.annotation.BG7.main(BG7.java:32)
java.io.FileNotFoundException: ehec_PredictedGenes.xml (No such file or directory)
[...and going on with missing files errors in the next steps ]

dnatag commented 11 years ago

I got the same ArrayIndexOutOfBoundsException as @larsius

Any update?