databio / GenomicDistributions

Calculate and plot distributions of genomic ranges
http://code.databio.org/GenomicDistributions

getTssFromGTF #209

Closed Shukla04 closed 2 months ago

Shukla04 commented 8 months ago

Hello, I want to eventually use the package for my dataset; however, I cannot get past step 1. I have an issue while using a custom reference. I apologise for not being very familiar with bioinformatics. Any help would really be appreciated.

The command and the error are listed below:

AtTSSs = getTssFromGTF("../annotation.gff", convertEnsemblUCSC=TRUE)

Got local file: ../annotation.gff
Error in dplyr::filter():
ℹ In argument: gene_biotype == "protein_coding".
Caused by error:
! object 'gene_biotype' not found
Run rlang::last_trace() to see where the error occurred.

kkupkova commented 8 months ago

Hi!

It seems like your GTF file does not have information on "gene_biotype". If that is the case, you should try the following command with the protein-coding filter off:

AtTSSs = getTssFromGTF("../annotation.gff", convertEnsemblUCSC=TRUE, filterProteinCoding=TRUE)

If that is not the issue, could you please point me to your GTF file?
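One quick way to check whether an annotation file carries a gene_biotype attribute is to scan its ninth (attributes) column. The following is a minimal sketch in base R, assuming a tab-separated GTF/GFF file with `#` comment lines; the helper name `hasGtfAttribute` is hypothetical and not part of GenomicDistributions.

```r
# Hypothetical helper (not part of GenomicDistributions): report whether any
# record in a GTF/GFF file mentions a given attribute key in column 9.
hasGtfAttribute <- function(path, key = "gene_biotype") {
  records <- read.delim(path, header = FALSE, comment.char = "#",
                        quote = "", stringsAsFactors = FALSE)
  if (ncol(records) < 9) {
    stop("Expected at least 9 tab-separated columns in ", path)
  }
  # The key must start an attribute and be followed by a space (GTF) or "=" (GFF3).
  pattern <- paste0("(^|;)[[:space:]]*", key, "[ =]")
  any(grepl(pattern, records[[9]]))
}

# Small temporary file standing in for a real annotation.
tmp <- tempfile(fileext = ".gtf")
writeLines(c(
  "1\tsrc\tgene\t100\t200\t.\t+\t.\tgene_id \"g1\"; gene_biotype \"protein_coding\";",
  "1\tsrc\tgene\t300\t400\t.\t-\t.\tgene_id \"g2\";"
), tmp)

hasGtfAttribute(tmp)                        # TRUE
hasGtfAttribute(tmp, "transcript_biotype")  # FALSE
```

If the check returns FALSE for the real file, the protein-coding filter has nothing to match against, which is consistent with the "object 'gene_biotype' not found" error above.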

Shukla04 commented 8 months ago

@kkupkova Thank you for the reply. I tried again with your suggestion, but again in vain. I tried to run getTssFromGTF using the GFF file as well as a GTF file. The GTF file was generated from the GFF file in R using the package "rtracklayer". All the commands are as follows:

Loading the required packages:

library(rtracklayer)
library(GenomicDistributions)

Importing gff and exporting gtf:

AtTair <- import("annotation.gff")
export(AtTair, "AtTair.gtf", "gtf")

Extracting TSS sites from GTF:

AtTaitTSSs = getTssFromGTF("AtTair.gtf", convertEnsemblUCSC=TRUE, filterProteinCoding=TRUE)

The error that follows is:

Got local file: AtTair.gtf
Error in dplyr::filter():
ℹ In argument: gene_biotype == "protein_coding".
Caused by error:
! object 'gene_biotype' not found
Run rlang::last_trace() to see where the error occurred.

For more details about the file, I am attaching screenshots of the head and tail of the file. I do not know how to point you towards my GTF file. The original GFF file was downloaded from the TAIR10 database.

[Screenshot: GTF_AtTAIR]

Any help would really be appreciated. Thanks

kkupkova commented 8 months ago

Oh, I am so sorry. I meant to recommend setting the argument to FALSE to overcome this issue, and somehow I kept the original setting. Please try the following:

AtTSSs = getTssFromGTF("../annotation.gff", convertEnsemblUCSC=TRUE, filterProteinCoding=FALSE)
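For scripts that process several annotation files, the same fix can be applied automatically. The sketch below is a hypothetical wrapper, not part of GenomicDistributions: it calls the extraction function with the filter on and retries with filterProteinCoding=FALSE only when the error message mentions gene_biotype. It assumes the error text contains that string, as in the message quoted earlier in this thread, and that the function accepts a filterProteinCoding argument.

```r
# Hypothetical wrapper: try the extraction with the protein-coding filter on,
# and retry with it off if the error mentions the missing gene_biotype column.
# `tssFun` is the extraction function in use, e.g. getTssFromGTF.
extractTssWithFallback <- function(tssFun, source, ...) {
  tryCatch(
    tssFun(source, ..., filterProteinCoding = TRUE),
    error = function(e) {
      if (!grepl("gene_biotype", conditionMessage(e), fixed = TRUE)) {
        stop(e)  # unrelated error: re-raise it unchanged
      }
      message("No gene_biotype found; retrying with filterProteinCoding = FALSE")
      tssFun(source, ..., filterProteinCoding = FALSE)
    }
  )
}
```

With GenomicDistributions loaded, a call might look like `extractTssWithFallback(getTssFromGTF, "AtTair.gtf", convertEnsemblUCSC = TRUE)`. Passing the function as an argument keeps the wrapper independent of the package, so it can be tried with a stub first.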