Open jkniehaus opened 7 months ago
It looks like NAs were being created in the 'genes' and 'overlapping_genes' columns. The function worked after I removed them from the 'overlapping_gene_list' object.
Removing the NAs resolved the first problem, but another error persists in the OverlapResolutions function. Perhaps ReferenceEnhancer requires the GTF to be filtered to some degree; using one downloaded directly from Ensembl does not seem to work.
library(ReferenceEnhancer)
library(dplyr)
genome_annotation <- LoadGtf(unoptimized_annotation_path = "Mus_musculus.GRCm39.111.gtf")
genome_annotation <- genome_annotation %>%
  mutate(gene_name = coalesce(gene_name, gene_id))  # replace NA gene names with the gene_id
gene_overlaps <- IdentifyOverlappers(genome_annotation = genome_annotation)
OverlapResolutions(genome_annotation = genome_annotation, overlap_data = gene_overlaps, gene_pattern = c("Rik$", "^Gm"))
Error in seq.default(from = gene_A_exons[row_exonA, 1], to = gene_A_exons[row_exonA, :
wrong sign in 'by' argument
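For context, here is a minimal base-R sketch of how this message can arise. It assumes, without having confirmed it in the ReferenceEnhancer source, that the failing call builds a sequence from an exon start to an exon end, so a row whose start is greater than its end would trigger it. In base R an NA endpoint normally produces a different message about a non-finite 'from' or 'to', so NA may not be the cause here. The `exons` table and its column names are hypothetical toy data, not the package's internal format.

```r
# Base R only: seq() raises this error when 'by' points away from 'to'.
msg <- tryCatch(
  seq(from = 10, to = 5, by = 1),
  error = function(e) conditionMessage(e)
)
print(msg)

# Toy exon table (hypothetical columns, for illustration only).
exons <- data.frame(
  gene  = c("geneA", "geneA", "geneB"),
  start = c(100, 300, 900),
  end   = c(200, 250, 950)
)

# Flag rows with a missing coordinate or with start greater than end.
suspect <- exons[
  is.na(exons$start) | is.na(exons$end) | exons$start > exons$end,
]
print(suspect)
```

Running a similar check on the exon coordinates of the loaded annotation might show whether any rows have inverted or missing coordinates before OverlapResolutions is called.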
Hello,
Thanks for the tool. I successfully went through your test files.
I'm trying to generate an optimized annotation for Ensembl's latest mm39 annotation and am running into an error in the OverlapResolutions function:
Do GTF files need to be processed or formatted in any way first? I'm guessing this error might arise from an NA or something similar. Any guidance is appreciated (or, if you have an optimized mm39 GTF readily available, that would be great too).
Thanks! Jesse
Code below:
sessionInfo():
R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux 8.8 (Ootpa)

Matrix products: default
BLAS/LAPACK: /nas/longleaf/rhel8/apps/r/4.3.1/lib/libopenblas_zenp-r0.3.23.so; LAPACK version 3.11.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: America/New_York
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.3.1