Open tania-k opened 6 months ago
It seems that tbl2asn failed in the update step (which creates the genbank file from the tbl annotation file). From what you have here, I can't see why it would have failed, especially if it worked in the predict step as you suggested.
When you passed this to annotate (-i $OUTDIR/$BASE), is that this folder (annotate/Histoplasma_capsulatum_1371NJ)?
The script is indeed dying because there is no genbank file, although perhaps this shouldn't be a hard stop. The scripts could generate the necessary output in a different way (and maybe they should).
Ideally you'd figure out why tbl2asn failed; you could run that step manually and see if you get an error. An alternative workaround is to run annotate with the FASTA + GFF3 files from update and provide a new output directory; this will regenerate the files it needs and will not rely on the genbank input.
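A minimal sketch of that workaround, written as a dry run that only prints the command for review. The directory names (update_results, annotate_from_gff) are assumptions based on the file listing in this thread, and the flags (--fasta, --gff, -s, --strain, -o) are the ones I understand funannotate annotate to accept in v1.8; please confirm against `funannotate annotate --help` before running.

```shell
# Hypothetical locations; adjust to your own layout.
UPDATE_DIR="update_results"
BASE="Histoplasma_capsulatum_1371NJ"
NEW_OUT="annotate_from_gff"   # hypothetical new output directory

# Build the command as positional parameters so quoted values stay intact.
set -- funannotate annotate \
  --fasta "$UPDATE_DIR/$BASE.scaffolds.fa" \
  --gff "$UPDATE_DIR/$BASE.gff3" \
  -s "Histoplasma capsulatum" --strain 1371NJ \
  -o "$NEW_OUT"

# Dry run: print one argument per line for review.
# Replace this line with "$@" to actually execute the command.
printf '%s\n' "$@"
```

Any additional inputs from earlier steps (for example antismash or iprscan results) would still need to be passed explicitly, since a fresh output directory will not contain them.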
Hi John, thanks for taking a look. I noticed that the log file for update contained the full command used to run tbl2asn. I took that, ran it interactively, and received a couple of errors.
I think what happened between predict (where this command worked and created a gbk file) and the update step was an update to Python. I was receiving some errors that looked like Python errors, and online suggestions were to upgrade or downgrade my Python, which allowed me to progress with the update step but with some key issues.
Since adding quotation marks around "Histoplasma capsulatum" and the -t "-l paired-ends" value gets the command to run, I think what I am seeing is probably just a syntax issue that arose as I was adjusting the Python version. Either way, I figured it out and will be re-running my analysis.
Thank you!!
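For anyone hitting the same errors, the quoting behaviour can be illustrated without funannotate at all. This is a small sketch; count_args is a hypothetical helper defined here only to show how many arguments the shell hands to a program.

```shell
# Helper (hypothetical, for illustration only): report how many arguments it received.
count_args() { echo "$#"; }

# Unquoted: the shell splits on spaces, so "capsulatum" and "paired-ends"
# arrive as separate arguments.
unquoted=$(count_args -s Histoplasma capsulatum --tbl2asn -l paired-ends)

# Quoted: each multi-word value arrives as a single argument.
quoted=$(count_args -s "Histoplasma capsulatum" --tbl2asn "-l paired-ends")

echo "unquoted: $unquoted arguments"
echo "quoted:   $quoted arguments"
```

The unquoted call passes six arguments and the quoted call passes four. That matches the two argparse errors: without quotes, -s consumes only "Histoplasma" and leaves "capsulatum" unrecognized, and --tbl2asn is followed by "-l", which looks like another option rather than its value.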
(funannotate) [taniak@longleaf-login5 update_results]$ funannotate util tbl2gbk -i Histoplasma_capsulatum_1371NJ.tbl -f Histoplasma_capsulatum_1371NJ.scaffolds.fa -s Histoplasma capsulatum --strain 1371NJ --tbl2asn -l paired-ends -o Histoplasma_capsulatum_1371NJ
usage: gbk2parts.py [-h] -i TBL -f FASTA -s SPECIES [--isolate ISOLATE] [--strain STRAIN] [-t TBL2ASN] [--sbt SBT] [-o OUTPUT]
gbk2parts.py: error: argument -t/--tbl2asn: expected one argument
(funannotate) [taniak@longleaf-login5 update_results]$ funannotate util tbl2gbk -i Histoplasma_capsulatum_1371NJ.tbl -f Histoplasma_capsulatum_1371NJ.scaffolds.fa -s Histoplasma capsulatum --strain 1371NJ --tbl2asn "-l paired-ends" -o Histoplasma_capsulatum_1371NJ
usage: gbk2parts.py [-h] -i TBL -f FASTA -s SPECIES [--isolate ISOLATE] [--strain STRAIN] [-t TBL2ASN] [--sbt SBT] [-o OUTPUT]
gbk2parts.py: error: unrecognized arguments: capsulatum
(funannotate) [taniak@longleaf-login5 update_results]$ funannotate util tbl2gbk -i Histoplasma_capsulatum_1371NJ.tbl -f Histoplasma_capsulatum_1371NJ.scaffolds.fa -s "Histoplasma capsulatum" --strain 1371NJ --tbl2asn "-l paired-ends" -o Histoplasma_capsulatum_1371NJ
There are 2 gene models that need to be fixed.
-------------------------------------------------------
1371NJ_002402 2 internal stops. Genetic code [1]
(funannotate) [taniak@longleaf-login5 update_results]$ ll
total 86860
-rw-r--r-- 1 taniak rc_matutelb_psx 3230200 Nov 24 16:41 Histoplasma_capsulatum_1371NJ.cds-transcripts.fa
-rw-r--r-- 1 taniak rc_matutelb_psx 0 Nov 24 16:40 Histoplasma_capsulatum_1371NJ.discrepency.report.txt
-rw-r--r-- 1 taniak rc_matutelb_psx 1078529 Dec 12 11:30 Histoplasma_capsulatum_1371NJ.discrepency.txt
-rw-r--r-- 1 taniak rc_matutelb_psx 0 Nov 24 16:40 Histoplasma_capsulatum_1371NJ.error.summary.txt
-rw-r--r-- 1 taniak rc_matutelb_psx 24470243 Dec 12 11:30 Histoplasma_capsulatum_1371NJ.gbk
-rw-r--r-- 1 taniak rc_matutelb_psx 2241607 Nov 24 16:40 Histoplasma_capsulatum_1371NJ.gff3
-rw-r--r-- 1 taniak rc_matutelb_psx 3869464 Nov 24 16:41 Histoplasma_capsulatum_1371NJ.mrna-transcripts.fa
-rw-r--r-- 1 taniak rc_matutelb_psx 372076 Nov 24 16:41 Histoplasma_capsulatum_1371NJ.pasa-reannotation.changes.txt
-rw-r--r-- 1 taniak rc_matutelb_psx 1151423 Nov 24 16:41 Histoplasma_capsulatum_1371NJ.proteins.fa
-rw-r--r-- 1 taniak rc_matutelb_psx 15305820 Nov 24 16:41 Histoplasma_capsulatum_1371NJ.scaffolds.fa
-rw-r--r-- 1 taniak rc_matutelb_psx 35851321 Dec 12 11:30 Histoplasma_capsulatum_1371NJ.sqn
-rw-r--r-- 1 taniak rc_matutelb_psx 1614 Nov 24 16:41 Histoplasma_capsulatum_1371NJ.stats.json
-rw-r--r-- 1 taniak rc_matutelb_psx 1368571 Nov 24 16:40 Histoplasma_capsulatum_1371NJ.tbl
drwxr-sr-x 2 taniak rc_matutelb_psx 4096 Dec 12 11:30 Histoplasma_capsulatum_1371NJ_tmp
-rw-r--r-- 1 taniak rc_matutelb_psx 0 Nov 24 16:40 Histoplasma_capsulatum_1371NJ.validation.txt
-rw-r--r-- 1 taniak rc_matutelb_psx 5 Nov 24 16:41 WGS_accession.txt
I usually just delete gene models with these issues manually. Did you find a workaround?
Funannotate version = funannotate v1.8.16
Hello Funannotate folks, thanks for producing a wonderful program! I am currently experiencing an issue that I am unsure how to solve. I have progressed through the steps in Funannotate, using mask, train, predict, update, antismash, and iprscan, and am now at the final annotate step, but I am running into an error when I run it.
The way I am running the script is: With a lot of variables.
The error appears as Histoplasma_capsulatum_1371NJ
As I look through my other runs, I noticed that my update step ran just fine in the log files, but it did not produce all the files it usually does (specifically gbk file). I understand that to run this step it would be pulling from the update_results folder. I had the antismash step run pull from the predict_results folder instead as the gbk file was not produced, but I want to understand why, and make sure the 5' and 3' UTRs are detected here by re-running my analysis.
The GBK file is empty while the rest of my files are there. Looking at my update log file.
The more verbose logfile looks like.
Any suggestions of where else to look, or if I am missing something, please let me know.
Thank you for your time.