mhammell-laboratory / TEtranscripts

A package for including transposable elements in differential enrichment analysis of sequencing datasets.
http://hammelllab.labsites.cshl.edu/software/#TEtranscripts
GNU General Public License v3.0

Input file #165

Closed: LIUXING-bio closed this issue 5 months ago

LIUXING-bio commented 6 months ago

Hi, what kind of input file does TEtranscripts need? Can it be used with mRNA-seq data?

olivertam commented 6 months ago

Hi,

Thank you for your interest in the software. TEtranscripts uses BAM files as input, and it is designed for bulk RNA-seq (total or mRNA). It is not designed for single-cell or single-nucleus RNA-seq. You will also need a gene annotation GTF (typically obtained from RefSeq, GENCODE or Ensembl) and a TE GTF (which we have generated for many genome builds). Please let us know if this does not address your question.
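
To make the three inputs concrete, here is a minimal Python sketch that assembles a TEtranscripts command line from BAM files, a gene GTF and a TE GTF. The file names are hypothetical placeholders, and the helper `build_tetranscripts_command` is invented for illustration. The `-t`, `-c`, `--GTF` and `--TE` options follow the usage line in the TEtranscripts README, but please confirm them with `TEtranscripts --help` for your installed version. The sketch only builds and prints the command; it does not run it.

```python
import shlex


def build_tetranscripts_command(treatment_bams, control_bams, gene_gtf, te_gtf):
    """Assemble an argument list for TEtranscripts (hypothetical helper).

    treatment_bams / control_bams: lists of BAM paths from bulk RNA-seq.
    gene_gtf: gene annotation GTF (e.g. from RefSeq, GENCODE or Ensembl).
    te_gtf: TE annotation GTF matching the same genome build and naming.
    """
    if not treatment_bams or not control_bams:
        raise ValueError("need at least one treatment and one control BAM")
    return [
        "TEtranscripts",
        "-t", *treatment_bams,   # treatment sample BAM files
        "-c", *control_bams,     # control sample BAM files
        "--GTF", gene_gtf,       # gene annotation
        "--TE", te_gtf,          # TE annotation
    ]


# Placeholder file names, assumed for illustration only.
cmd = build_tetranscripts_command(
    ["treated_rep1.bam", "treated_rep2.bam"],
    ["control_rep1.bam", "control_rep2.bam"],
    "genes.gtf",
    "TE.gtf",
)
print(shlex.join(cmd))
```

Keeping the command as a list makes it straightforward to pass to `subprocess.run(cmd, check=True)` once the paths point at real files.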

Thanks

github-actions[bot] commented 5 months ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

666lixiaona commented 4 months ago

Hi,

Regarding the TE GTFs that your platform provides ("which we have generated for many genome builds"): are there differences between them? Also, where can I find a BAM file to use as a reference?

Thanks.

olivertam commented 4 months ago

Hi,

Could you clarify your question? Do you mean "What are the differences between the genome builds?"

Thanks.

666lixiaona commented 4 months ago

Hi,

[Image: Screenshot 2024-02-14 201003]

I want to know whether it makes any difference which of these files I use.

Thanks

olivertam commented 4 months ago

Hi,

If you are referring to the GTFs for the same species (e.g. hg38, GRCh38_Ensembl and GRCh38_GENCODE), the difference is in the chromosome names. UCSC (hg38) puts chr in front of all chromosome names (including scaffolds), while Ensembl (GRCh38_Ensembl) does not (e.g. chr1 in hg38 is 1 in GRCh38_Ensembl). Since our program (and most other genome interval tools) matches chromosomes by name, you need to use the TE GTF that has the corresponding chromosome names/IDs. GENCODE is mixed: canonical chromosomes have chr in front (e.g. chr1), but the scaffolds do not (they follow the Ensembl nomenclature). Thus, if you're using the GENCODE FASTA and gene GTF, you need the GENCODE TE GTF for the quantification to work correctly.
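
One way to check this before running a quantification is to compare the sequence names (column 1) of the gene GTF and the TE GTF. The following is a rough Python sketch, not part of TEtranscripts: the helpers `gtf_seqnames` and `naming_report` are invented for illustration, and the two in-memory GTF records are toy data standing in for real files opened with `open(path)`.

```python
import io


def gtf_seqnames(handle):
    """Collect the distinct sequence names (column 1) from a GTF stream."""
    names = set()
    for line in handle:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def naming_report(gene_names, te_names):
    """Summarise which sequence names the two annotations share."""
    return {
        "shared": sorted(gene_names & te_names),
        "gene_only": sorted(gene_names - te_names),
        "te_only": sorted(te_names - gene_names),
    }


# Toy records: a UCSC-style gene GTF line and an Ensembl-style TE GTF line.
ucsc_gene = io.StringIO('chr1\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "g1";\n')
ensembl_te = io.StringIO('1\tsrc\texon\t50\t150\t.\t+\t.\tgene_id "L1";\n')

report = naming_report(gtf_seqnames(ucsc_gene), gtf_seqnames(ensembl_te))
print(report)
```

With these toy inputs the `shared` list is empty, which is the symptom of mixing a UCSC-style annotation with an Ensembl-style one. On real files, a small `shared` list relative to `gene_only` and `te_only` suggests the wrong TE GTF variant was picked.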

Thanks.

666lixiaona commented 4 months ago

Thank you very much.