Gaius-Augustus / TSEBRA

TSEBRA: Transcript Selector for BRAKER

TSEBRA for merging several evidences #18

Closed marco91sol closed 12 months ago

marco91sol commented 2 years ago

Hi,

I have several GFF files derived from aligning proteins and transcripts to my genome with exonerate and StringTie. I also have the result of a BRAKER analysis in GTF format. I would like to merge all these files with TSEBRA, but it fails with an error. Attached you will find pictures of the head of the GFF files and of braker.gtf.

[image: head of the GFF files]

Below, the braker.gtf:

[image: head of braker.gtf]

Below the error:

READING GENE PREDICTION: [braker.gtf]

READING EXTRINSIC EVIDENCE: [final_complete_protein_TSEBRA.gff3]

Traceback (most recent call last):
  File "~/Software/TSEBRA/bin/tsebra.py", line 174, in <module>
    main()
  File "~/Software/TSEBRA/bin/tsebra.py", line 67, in main
    evi.add_hintfile(h)
  File "~/Software/TSEBRA/bin/evidence.py", line 115, in add_hintfile
    hintfile = Hintfile(path_to_hintfile)
  File "~/Software/TSEBRA/bin/evidence.py", line 81, in __init__
    self.read_file(path)
  File "~/Software/TSEBRA/bin/evidence.py", line 92, in read_file
    if line[0][0] == '#':
IndexError: list index out of range

How can I solve that? Thanks, Marco

LarsGab commented 2 years ago

Hi Marco,

The --gtf option of TSEBRA takes files that contain gene structures with annotated CDS regions (like the braker.gtf file). The --hintfiles option only takes intron and start-/stop-codon positions (like the hintsfile.gff in your BRAKER working directory), so TSEBRA can't use your protein and transcript alignments directly. You could use the hintsfile.gff instead, which is created from this data if you used it for your BRAKER run. However, TSEBRA would then only filter your braker.gtf gene set, since TSEBRA's result is a subset of the input gene sets and you only have one input gene set.

I would recommend that you make two BRAKER runs, one with only your RNA-Seq data and the other with only the protein data. You could then combine the results with TSEBRA, using the two BRAKER gene sets and the hints from the two working directories.
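Before passing a file to --hintfiles, it may help to check what it actually contains. The sketch below is not part of TSEBRA; it is a small standalone check that counts blank lines, lines with fewer than nine tab-separated columns, and the feature types found in column 3. The feature names `intron`, `start` and `stop` are an assumption about how the hint types named above are labeled in a BRAKER hintsfile.gff, and the example records are made up. A blank line is worth looking for because the `IndexError: list index out of range` at `line[0][0]` means the parsed line was an empty list, which is what a blank line would produce if the file is read with something like Python's `csv.reader` (an assumption about the parser, not confirmed in this thread).

```python
from collections import Counter

# Assumed labels for the hint types named above (intron, start codon,
# stop codon); adjust if your hintsfile.gff uses different names.
ASSUMED_HINT_TYPES = {"intron", "start", "stop"}


def summarize_hintfile(lines):
    """Count blank lines, short lines, and feature types in GFF-like lines."""
    summary = {"blank": 0, "too_few_columns": 0, "features": Counter()}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line yields no columns at all for a column-based parser.
            summary["blank"] += 1
            continue
        if line.startswith("#"):
            continue  # comment or directive line
        cols = line.split("\t")
        if len(cols) < 9:
            summary["too_few_columns"] += 1
            continue
        summary["features"][cols[2]] += 1  # column 3 holds the feature type
    return summary


def unsupported_features(summary):
    """Return feature types that are not in the assumed hint types."""
    return sorted(set(summary["features"]) - ASSUMED_HINT_TYPES)


# Made-up example records: one alignment-style gene line, one blank line,
# and one intron hint.
example = [
    "##gff-version 3\n",
    "chr1\texonerate\tgene\t100\t900\t.\t+\t.\tID=g1\n",
    "\n",
    "chr1\tProtHint\tintron\t200\t300\t5\t+\t.\tsrc=P;mult=5\n",
]
report = summarize_hintfile(example)
print(report["blank"], unsupported_features(report))
```

To check a real file, pass an open file handle instead of the list, for example `summarize_hintfile(open("final_complete_protein_TSEBRA.gff3"))`. A non-zero blank count or a list of unexpected feature types suggests the file is an alignment GFF rather than a hints file.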

I hope that helps. Best, Lars
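For the two-run setup recommended above, the TSEBRA call could be assembled roughly as in the following sketch. The --gtf and --hintfiles options are the ones named in this thread; passing several files as a comma-separated list and the --out option are assumptions here and should be checked against `tsebra.py --help`. The directory names `braker_rnaseq` and `braker_protein` are hypothetical. The function only builds and prints the command, it does not run anything.

```python
import shlex


def build_tsebra_command(gene_sets, hint_files, out_path, tsebra="tsebra.py"):
    """Build an argument list for a TSEBRA run (nothing is executed).

    Assumptions: multiple files are given as one comma-separated value,
    and the output option is called --out.
    """
    return [
        tsebra,
        "--gtf", ",".join(gene_sets),
        "--hintfiles", ",".join(hint_files),
        "--out", out_path,
    ]


# Hypothetical working directories of the two BRAKER runs.
cmd = build_tsebra_command(
    gene_sets=["braker_rnaseq/braker.gtf", "braker_protein/braker.gtf"],
    hint_files=["braker_rnaseq/hintsfile.gff", "braker_protein/hintsfile.gff"],
    out_path="tsebra_combined.gtf",
)
print(shlex.join(cmd))
```

Once the options are confirmed, the list can be handed to `subprocess.run(cmd, check=True)`.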