TobiTekath / DTUrtle

Perform differential transcript usage (DTU) analysis of bulk or single-cell RNA-seq data. See documentation at:
https://tobitekath.github.io/DTUrtle
GNU General Public License v3.0

plot_transcripts_view() error: subscript contains invalid names #10

Closed skudashev closed 1 year ago

skudashev commented 1 year ago

Describe the bug

Hello, thank you for the amazing tool. I am running DTUrtle with my own transcriptome. All other analysis and visualisation commands worked, apart from plot_transcripts_view(), which fails with:

Error: subscript contains invalid names

The gene IDs in the .gtf match the gene IDs in the dturtle object, but there are no gene or transcript names in the file.

To Reproduce

gtf <- import_gtf("GTF_PATH", feature_type = NULL, out_df = FALSE)
plot_transcripts_view(dturtle = dturtle, genes = "ENSG00000151067", gtf = gtf, genome = NULL, one_to_one = TRUE)


TobiTekath commented 1 year ago

Hi @kudasonya,

thank you for reaching out. I am sorry for the problems you experienced.

My best guess is that the error occurs because your gtf file does not contain a "gene_id" or "gene_name" field in its metadata.

You can confirm this by quickly checking if the following columns in the GTF-object exist:

gtf@elementMetadata$gene_id
gtf@elementMetadata$gene_name
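The same check can be done in one step for both columns. This is only a minimal sketch: the toy GRanges object stands in for an imported GTF, its values are hypothetical, and it assumes the GenomicRanges package is installed. The `required` vector lists just the two columns mentioned above, not an official list of DTUrtle requirements.

```r
library(GenomicRanges)  # provides GRanges(), IRanges() and mcols()

# Toy stand-in for an imported GTF (hypothetical values, no name columns).
gtf <- GRanges(
  seqnames = "chr6",
  ranges = IRanges(start = 41072615, end = 41099961),
  strand = "+",
  gene_id = "ENSG00000001167.15",
  transcript_id = "G69879.4"
)

# Columns to look for (assumption: only the two named in this comment).
required <- c("gene_id", "gene_name")

# Names in `required` that are absent from the GTF metadata.
missing_cols <- setdiff(required, colnames(mcols(gtf)))
print(missing_cols)
```

An empty result would mean both columns are present.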

If not, I am afraid you have to add at least some info into these two columns (DTUrtle requires both columns to be present - in hindsight not a good design decision 😅)

Somewhere in your gtf's metadata, the ENSEMBL-IDs have to be stored. Simply copy the IDs to a new column called "gene_id" and create a dummy "gene_name" column - or vice versa. Just make sure both columns do exist in the gtf metadata and one of them contains the identifiers used in the DTU analysis.

gtf@elementMetadata$gene_id <- gtf@elementMetadata$my_actual_gene_ids

# if gtf@elementMetadata$gene_name does not exist, set a dummy
gtf@elementMetadata$gene_name <- ""

I will patch this behaviour with the next update. Thank you for spotting this.

I hope this helps to resolve your problems.

Best, Tobias

skudashev commented 1 year ago

Hello Tobias,

Thank you for a quick response. So I already had a gene_id column, but no gene_name. I tried creating a dummy column with gtf@elementMetadata$gene_name <- "" and gtf@elementMetadata$gene_name <- make.unique(gtf@elementMetadata$gene_name) but am still getting the same error. Could it be because the dturtle object doesn't contain gene_names?

head(dturtle$meta_table_gene)
                                 gene    exp_in exp_in_Adult exp_in_Fetal seqnames strand source       type
ENSG00000001167.15 ENSG00000001167.15 1.0000000    1.0000000            1     chr6      + PacBio transcript

head(dturtle$meta_table_tx)
                       gene       tx    exp_in exp_in_Adult exp_in_Fetal seqnames    start      end width strand source       type
G69879.4 ENSG00000001167.15 G69879.4 0.4705882   0.30769231         1.00     chr6 41072615 41099961 27347      + PacBio transcript

Best wishes, Sofia

skudashev commented 1 year ago

Hello Tobias,

Apologies, the issue was simple: I just had to add a dummy transcript_name column to the metadata as well.
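For anyone hitting the same error, the workaround from this thread could be sketched roughly as follows. `add_dummy_columns` is a hypothetical helper, not part of DTUrtle, and filling the columns with empty strings is an assumption that follows the dummy gene_name suggestion above. The toy GRanges object (hypothetical values) stands in for the imported GTF.

```r
library(GenomicRanges)  # provides GRanges(), IRanges() and mcols()

# Hypothetical helper (not part of DTUrtle): add an empty-string column for
# every name in `cols` that is absent from the GTF metadata. Existing columns
# are left untouched.
add_dummy_columns <- function(gtf, cols = c("gene_name", "transcript_name")) {
  for (col in setdiff(cols, colnames(mcols(gtf)))) {
    mcols(gtf)[[col]] <- rep("", length(gtf))
  }
  gtf
}

# Toy stand-in for an imported GTF that has IDs but no name columns.
gtf <- GRanges(
  seqnames = "chr6",
  ranges = IRanges(start = 41072615, end = 41099961),
  strand = "+",
  gene_id = "ENSG00000001167.15",
  transcript_id = "G69879.4"
)

gtf <- add_dummy_columns(gtf)
print(colnames(mcols(gtf)))
```

The patched object would then be passed as the gtf argument of plot_transcripts_view(), as in the original call.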

Thank you! Sofia