wu-lab-egio / EGIO

Exon Group Ideogram based detection of Orthologous exons and Orthologous isoforms

How can I perform the analysis using custom gtf files, mRNA and cds sequences? #4

Closed zebrafish-507 closed 4 months ago

wu-lab-egio commented 2 years ago

Hi, to run EGIO on your own files, the mRNA and CDS sequences should be in FASTA format. The GTF files should follow the standard GTF format and include the necessary gene and transcript annotations, for example "gene_id", "gene_name", and "gene_biotype" for genes, and "transcript_id" and "transcript_biotype" for transcripts. The test files in https://github.com/wu-lab-egio/EGIO_example_source are themselves custom files, so you can use them as a template for organizing your own.
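As a quick sanity check before running EGIO, something like the sketch below could report which of the attribute names listed above are absent from a custom GTF. It is not part of EGIO; the function names are hypothetical, and it assumes that gene rows carry the gene keys and transcript rows carry the transcript keys.

```python
import re
from collections import defaultdict

# Attribute keys named in the comment above. Which feature type must carry
# which keys is an assumption made for this sketch.
REQUIRED = {
    "gene": {"gene_id", "gene_name", "gene_biotype"},
    "transcript": {"transcript_id", "transcript_biotype"},
}

# Matches one `key "value"` pair in the GTF attribute column.
ATTR_RE = re.compile(r'(\w+)\s+"([^"]*)"')


def parse_attributes(field):
    """Return the key/value pairs found in column 9 of a GTF line."""
    return dict(ATTR_RE.findall(field))


def missing_gtf_keys(lines):
    """Map feature type -> keys that are absent from at least one record."""
    missing = defaultdict(set)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete 9-column GTF record
        feature = cols[2]
        if feature not in REQUIRED:
            continue  # only gene and transcript rows are checked
        absent = REQUIRED[feature] - parse_attributes(cols[8]).keys()
        if absent:
            missing[feature] |= absent
    return dict(missing)
```

Calling `missing_gtf_keys(open("my_annotation.gtf"))` (the file name is a placeholder) returns an empty dict when every gene and transcript row carries the expected keys.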

If your GTF files use different attribute names for the gene or transcript annotations, please change the corresponding names in the __prepare_egio_extra.py script, lines 240-243 and lines 277-278.
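An alternative to editing the script, which is not the approach described above, might be to rename the attribute keys in a copy of the GTF so that they match the expected names. The sketch below assumes a GENCODE-style file that uses "gene_type" and "transcript_type"; the mapping and the function name are illustrative only and should be adapted to your own annotation.

```python
import re

# Example mapping (an assumption): GENCODE-style keys -> the names listed above.
RENAMES = {
    "gene_type": "gene_biotype",
    "transcript_type": "transcript_biotype",
}

# An attribute key sits at the start of column 9 or after a ';',
# and is followed by whitespace and an opening quote.
KEY_RE = re.compile(r'(^|;\s*)(\w+)(?=\s+")')


def rename_attribute_keys(line, renames=RENAMES):
    """Return the GTF line with attribute keys renamed; values are untouched."""
    if line.startswith("#"):
        return line  # header comments pass through unchanged
    newline = "\n" if line.endswith("\n") else ""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 9:
        return line  # not a complete 9-column GTF record

    def swap(match):
        key = match.group(2)
        return match.group(1) + renames.get(key, key)

    cols[8] = KEY_RE.sub(swap, cols[8])
    return "\t".join(cols) + newline
```

Applying `rename_attribute_keys` to each line of the input and writing the results to a new file leaves the original annotation untouched.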