4ureliek / TEanalysis

Analysis of TE contribution to features (transcripts or simple features). Includes utils to test enrichment.
MIT License

Issue with the Exst generation table #9

Closed: GwenBontonou closed this issue 2 weeks ago

GwenBontonou commented 8 months ago

Hello, I'm trying to run the TE-analysis_Coverage.pl script and then use the shuffling script to see whether specific regions of my genome are TE-enriched. I'm using data available for a published genome, meaning that my input corresponds to a Refseq.gff3 file with only exons/genes/CDS info. I also have the RM.out and RMparsed.tab files. However, it seems that the script is not able to process my input properly or to generate the ExST.tab needed for subsequent analysis.

Errors start here:

--- Getting sequences lengths for X_genomic.fasta
-> lengths have been previously calculated (DtakHiC1_genomic.fasta.lengths exists) => extracting
Use of uninitialized value $name in concatenation (.) or string at TE-analysis_pipeline.pl line 533.
Use of uninitialized value $name in concatenation (.) or string at TE-analysis_pipeline.pl line 1181.

And at the end: --- Printing summary file and concatenating other outputs

All the tables are empty.

Is this linked to the input and its lack of information on specific features?

Thanks a lot

4ureliek commented 8 months ago

Hi, TE-analysis_Coverage.pl cannot take gff3 as input; it was written for RepeatMasker outputs (I am guessing you used that gff3 file as input to -in? What did you use for -type?). Also, it sounds like your input does not actually contain TE annotations? If you want to annotate transposable elements, that is definitely not the purpose of this script!
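
To check which kind of file is being passed to -in before running anything, a quick format sniff can help. The following is only a minimal sketch and not part of TEanalysis: the function name `guess_annotation_format` is hypothetical, and the heuristics assume the usual layouts (GFF3 as 9 tab-separated columns, optionally starting with a `##gff-version` line; RepeatMasker .out as whitespace-separated rows of 14 to 16 fields with the strand, `+` or `C`, in the ninth field).

```python
def guess_annotation_format(lines):
    """Guess whether lines come from a GFF3 file or a RepeatMasker .out file.

    Hypothetical helper, not part of TEanalysis. Returns "gff3",
    "repeatmasker_out", or "unknown" based on the first informative line.
    """
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines (RepeatMasker .out has one after its header)
        if line.startswith("##gff-version"):
            return "gff3"
        if line.startswith("#"):
            continue  # other comment or directive lines carry no format signal here
        # GFF3 data rows: exactly 9 tab-separated columns, start/end in columns 4 and 5
        tab_cols = line.split("\t")
        if len(tab_cols) == 9 and tab_cols[3].isdigit() and tab_cols[4].isdigit():
            return "gff3"
        fields = line.split()
        # The two RepeatMasker header rows begin with "SW" and "score"
        if fields[0] in ("SW", "score"):
            continue
        # RepeatMasker data rows: integer score first, strand ("+" or "C") in field 9
        if 14 <= len(fields) <= 16 and fields[0].isdigit() and fields[8] in ("+", "C"):
            return "repeatmasker_out"
        return "unknown"
    return "unknown"
```

Calling it on the first few lines of a file, for example `guess_annotation_format(open(path))` with `path` being whatever file you intend to give to -in, should return "repeatmasker_out" for a RepeatMasker .out file and "gff3" for a gene annotation file like the one described above.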