Closed by marco91sol 12 months ago
Hi Marco,
For the --gtf option, TSEBRA takes files that contain gene structures with annotated CDS regions (like the braker.gtf file). For the --hintfiles option, it only takes intron and start-/stop-codon positions (like the hintsfile.gff in your BRAKER working directory).
It can't use your protein and transcript alignments directly. You could use the hintsfile.gff instead, which is created from this data if you used it for your BRAKER run. However, TSEBRA would then only filter your braker.gtf gene set, since TSEBRA's result is a subset of the input gene sets and you only have one input gene set.
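To see whether a file looks like a hints file or like raw alignments, it can help to count the feature types in its third column and compare them with those in the hintsfile.gff from the BRAKER working directory. The following is only a minimal sketch: the function name and the example lines are made up for illustration and are not part of TSEBRA.

```python
from collections import Counter


def feature_type_counts(lines):
    """Count the values in column 3 (feature type) of tab-separated GFF/GTF lines."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip blank lines, comment lines, and lines with fewer than 3 columns.
        if len(fields) < 3 or fields[0].startswith("#"):
            continue
        counts[fields[2]] += 1
    return counts


# Made-up example lines (not taken from any real file in this thread).
example = [
    "chr1\tProtHint\tintron\t100\t200\t.\t+\t.\tsrc=P\n",
    "chr1\texonerate\tmatch\t50\t400\t.\t+\t.\tID=m1\n",
    "\n",
]
print(feature_type_counts(example))
```

For a real file, pass an open file handle instead of the example list, for instance `feature_type_counts(open(path))` with your own path.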
I would recommend that you make two BRAKER runs, one with only your RNA-Seq data, and the other with only the protein data. Then, you could combine the results with TSEBRA using the two BRAKER gene sets and the hints from the working directories.
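The combination step could be sketched as below. This only assembles the command line and does not run it. The --gtf and --hintfiles options are the ones named above; the comma-separated lists and the --cfg and --out options are assumptions that should be checked against the output of tsebra.py --help. All directory and file names are hypothetical placeholders.

```python
def build_tsebra_command(gene_sets, hint_files, cfg, out):
    """Assemble an argument list for tsebra.py (assumed option names, see lead-in)."""
    return [
        "tsebra.py",
        "--gtf", ",".join(gene_sets),         # one braker.gtf per BRAKER run
        "--hintfiles", ",".join(hint_files),  # one hintsfile.gff per BRAKER run
        "--cfg", cfg,                         # assumed option: TSEBRA configuration file
        "--out", out,                         # assumed option: combined gene set
    ]


# Hypothetical working directories for the RNA-Seq run and the protein run.
cmd = build_tsebra_command(
    gene_sets=["rnaseq_run/braker.gtf", "protein_run/braker.gtf"],
    hint_files=["rnaseq_run/hintsfile.gff", "protein_run/hintsfile.gff"],
    cfg="default.cfg",
    out="tsebra_combined.gtf",
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the paths and option names have been verified.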
I hope that helps. Best, Lars
Hi,
I have different GFF files, derived from protein and transcript alignments on my genome using exonerate and stringtie. I also have the result of a BRAKER analysis in GTF format. I would like to merge all these files with TSEBRA, but it gives some issues. Attached you will find a picture of the head of the GFF files and of braker.gtf.
Below, the braker.gtf:
Below the error:
READING GENE PREDICTION: [braker.gtf]
READING EXTRINSIC EVIDENCE: [final_complete_protein_TSEBRA.gff3]
Traceback (most recent call last):
File "~/Software/TSEBRA/bin/tsebra.py", line 174, in
main()
File "~/Software/TSEBRA/bin/tsebra.py", line 67, in main
evi.add_hintfile(h)
File "~/Software/TSEBRA/bin/evidence.py", line 115, in add_hintfile
hintfile = Hintfile(path_to_hintfile)
File "~/Software/TSEBRA/bin/evidence.py", line 81, in __init__
self.read_file(path)
File "~/Software/TSEBRA/bin/evidence.py", line 92, in read_file
if line[0][0] == '#':
IndexError: list index out of range
How can I solve that? Thanks, Marco
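One possible cause, not confirmed in this thread: the failing statement `line[0][0]` reports a list index error, which suggests that a line of the hints file was parsed into an empty list, as can happen with an empty line. The sketch below locates and drops such lines. The helper names and the example content are hypothetical, and even a file cleaned this way would still need to contain only the hint types that --hintfiles accepts.

```python
def blank_line_numbers(lines):
    """Return the 1-based numbers of lines that are empty or whitespace only."""
    return [i for i, line in enumerate(lines, start=1) if not line.strip()]


def drop_blank_lines(lines):
    """Return the lines with empty and whitespace-only lines removed."""
    return [line for line in lines if line.strip()]


# Made-up file content with two blank lines.
example = [
    "##gff-version 3\n",
    "\n",
    "chr1\texonerate\tmatch\t50\t400\t.\t+\t.\tID=m1\n",
    "   \n",
]
print(blank_line_numbers(example))
print(len(drop_blank_lines(example)))
```

For a real file, read it with `open(path).readlines()`, check the reported line numbers, and write the cleaned lines to a new file rather than overwriting the original.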