qussai96 closed 2 years ago
It seems the problem is with the script "proteins_from_gtf.pl", as the seed_proteins.faa file is empty. After checking my seeds file "augustus.hints.gtf", it appears to have an unexpected format. Comparing it to your example/input/genemark.gtf file in [https://github.com/gatech-genemark/ProtHint/tree/master/example/input], I found that my seeds file contains gene and transcript predictions, not only exons, introns, and CDS. Any ideas on how to solve this issue?
Thanks,
ProtHint should still work with the BRAKER seeds (anything other than the CDS lines is ignored).
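To illustrate what "only the CDS lines matter" means in practice, here is a minimal sketch that keeps just the CDS features from a GTF file. It is not ProtHint's actual code: the function name `cds_lines` and the example records are hypothetical, and the sketch only assumes the standard tab-separated GTF layout in which the third column holds the feature type.

```python
def cds_lines(gtf_lines):
    """Yield only the CDS feature lines from an iterable of GTF lines.

    Hypothetical helper for illustration; ProtHint's own parsing may differ.
    """
    for line in gtf_lines:
        line = line.rstrip("\n")
        # Skip comments and blank lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A GTF record has 9 tab-separated columns; column 3 is the feature type.
        if len(fields) >= 9 and fields[2] == "CDS":
            yield line


# Made-up records mimicking a file with gene, transcript, and CDS entries.
example = [
    "chr1\tAUGUSTUS\tgene\t100\t500\t.\t+\t.\tg1",
    "chr1\tAUGUSTUS\ttranscript\t100\t500\t.\t+\t.\tg1.t1",
    'chr1\tAUGUSTUS\tCDS\t100\t300\t.\t+\t0\ttranscript_id "g1.t1"; gene_id "g1";',
]
kept = list(cds_lines(example))
print(len(kept), "CDS line(s) kept out of", len(example))
```

If a seeds file yields no CDS lines at all under a filter like this, the file itself is the likely problem; if it does, the cause is probably elsewhere.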
Can you share your input files (by email to bruna.tomas@gmail.com if you don't want to share here)? I will try to run ProtHint myself and reproduce this error.
Best, Tomas
Thank you @tomasbruna for your reply. I have sent the input files by email.
best,
Hi Qussai,
thanks for sending the files. The error is caused by a mismatch in contig names between augustus.hints.gtf and GCF_000001735.4_TAIR10.1_genomic.fna. One of them has spaces between words:
NC_003070.9 Arabidopsis thaliana chromosome 1 sequence
and the other one uses underscores:
NC_003070.9_Arabidopsis_thaliana_chromosome_1_sequence
It is possible that the underscores were added automatically during a ProtHint run. In any case, the error can be fixed by renaming the contigs so that they match.
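One way to check for and repair such a mismatch is sketched below. It assumes the fix is to replace whitespace with underscores on whichever side still has spaces; the helper names (`normalize`, `normalize_fasta`, `unmatched_contigs`) and the short example records are hypothetical and are not part of ProtHint.

```python
import re


def normalize(name):
    """Replace each run of whitespace in a contig name with one underscore."""
    return re.sub(r"\s+", "_", name.strip())


def normalize_fasta(lines):
    """Rewrite FASTA header lines; sequence lines pass through unchanged."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield ">" + normalize(line[1:])
        else:
            yield line


def unmatched_contigs(gtf_lines, fasta_lines):
    """Return contig names used in the GTF that have no identical FASTA header."""
    fasta_names = {l[1:].rstrip("\n") for l in fasta_lines if l.startswith(">")}
    gtf_names = {
        l.split("\t")[0]
        for l in gtf_lines
        if l.strip() and not l.startswith("#")
    }
    return gtf_names - fasta_names


# Example using the two contig names quoted above (sequence and GTF fields are made up).
fasta = [">NC_003070.9 Arabidopsis thaliana chromosome 1 sequence", "ACGT"]
gtf = [
    "NC_003070.9_Arabidopsis_thaliana_chromosome_1_sequence"
    "\tAUGUSTUS\tCDS\t1\t3\t.\t+\t0\ttranscript_id \"g1.t1\";"
]

before = unmatched_contigs(gtf, fasta)
fixed_fasta = list(normalize_fasta(fasta))
after = unmatched_contigs(gtf, fixed_fasta)
print("unmatched before:", before)
print("unmatched after:", after)
```

For real files, the same functions can be applied to lines read from the GTF and FASTA, with the normalized FASTA written to a new file rather than overwriting the original.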
Best, Tomas
Thank you, Tomas! I renamed the contigs and everything is working perfectly now.
cheers, Qussai
Hi Tomas, I am trying to run ProtHint with seeds generated by BRAKER. I am getting the following error:
I tried replacing the DIAMOND version in the dependencies folder with the latest version, but this didn't help, and I am still getting the same error.
The command I am running is:
prothint.py --threads=16 --workdir=ProtHint_dir --geneSeeds=./braker_seeds/augustus.hints.gtf GCF_000001735.4_TAIR10.1_genomic.fna.masked orthodb_without_AT.fasta
Thanks,