erhard-lab / grandR

R package for nucleotide conversion sequencing data analysis

Error with subsetting: Error in m[, columns] : subscript out of bounds #22

Closed iamnicogomez closed 1 year ago

iamnicogomez commented 1 year ago

Florian,

I’ve recently been running into this error when I try to subset my grandR object in R:

> data_4sU_only <-subset(data_all, has.4sU)
Error in m[, columns] : subscript out of bounds

I've been executing the same script to load up my grandR object for the past several weeks and this is the first time I've experienced this issue. I have attached my design object and for completeness, here is the command I am using to load my grandR object:

data_all <- ReadGRAND("/Volumes/SpyFox/grandslam_combined.tsv.gz", design = sample_order, read.percent.conv = TRUE, verbose = TRUE, rename.sample = 'basename')

I’ve tried: 1) coercing the condition back to characters; 2) simplifying the “Names” column to the base name of the sample (and using “basename” as the rename.sample argument in ReadGRAND); 3) starting a completely fresh session with no unnecessary libraries attached; 4) running subset.grandR() line by line. In doing so I've localized the error to: data.apply(x,function(m) m[,columns],fun.coldata = function(t){ dr(t[columns,])

I have attached my design object and my grandR object filtered with FilterGenes(data_all, use = 1:1000) (I changed the extension to .txt for uploading here): data_all_1k.txt, sample_order.csv

Thank you, Nico

R version 4.2.2 (2022-10-31) Platform: x86_64-apple-darwin17.0 (64-bit) Running under: macOS Monterey 12.6, RStudio 2022.12.0.353

Locale: en_US.UTF-8 / en_US.UTF-8 / en_US.UTF-8 / C / en_US.UTF-8 / en_US.UTF-8

Package version: bitops_1.0.7 cli_3.6.0 colorspace_2.1.0 compiler_4.2.2 cowplot_1.1.1 fansi_1.0.4 farver_2.1.1 ggplot2_3.4.1 glue_1.6.2 grandR_0.2.1 graphics_4.2.2
grDevices_4.2.2 grid_4.2.2 gtable_0.3.1 isoband_0.2.7 labeling_0.4.2 lattice_0.20.45 lfc_0.2.2 lifecycle_1.0.3 magrittr_2.0.3 MASS_7.3.58.3 Matrix_1.5.3
methods_4.2.2 mgcv_1.8.42 minpack.lm_1.2.3 munsell_0.5.0 nlme_3.1.162 numDeriv_2016.8.1.1 parallel_4.2.2 patchwork_1.1.2 pillar_1.8.1 pkgconfig_2.0.3 plyr_1.8.8
R6_2.5.1 RColorBrewer_1.1.3 Rcpp_1.0.10 RCurl_1.98.1.10 reshape2_1.4.4 rlang_1.1.0 scales_1.2.1 splines_4.2.2 stats_4.2.2 stringi_1.7.12 stringr_1.5.0
tibble_3.2.0 tools_4.2.2 utf8_1.2.3 utils_4.2.2 vctrs_0.6.0 viridisLite_0.4.1 withr_2.5.0

iamnicogomez commented 1 year ago

Ah forgot to mention that "Filterkinetics" works as expected. Truly only an issue with subset. Nico

florianerhard commented 1 year ago

Hi Nico, the issue is that you apparently loaded a grand-slam output file that was generated without the -full parameter, but you forced grandR to load raw conversion counts using read.percent.conv = TRUE. Unfortunately, the checks for that did not work properly. This is now corrected.

If you don't need the percent conversions, just omit this from the call to ReadGRAND(). Otherwise rerun gedi -e Slam but include the -full parameter!

If you want to test the fix, get the development version from github:

require("devtools")
devtools::install_github("erhard-lab/grandR")

and then try to load the grand-slam output (which should now give you a reasonable error message)! Best, Florian

iamnicogomez commented 1 year ago

I have removed the read.percent.conv = TRUE argument and get the same error when I try to subset.

Nico

florianerhard commented 1 year ago

Are you sure? If I remove the empty slot from the data structure you provided, it works just fine:

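# Load the attached grandR object (an RDS file uploaded with a .txt extension)
data_all=readRDS("data_all_1k.txt")
# Remove the empty percent_conv slot
data_all$data$percent_conv=NULL
# Subset the object that was just loaded (data_all)
data_4sU_only <-subset(data_all, has.4sU)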

Best, Florian
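
For readers who hit the same message, the failure can be mimicked in base R without grandR. This is only a sketch, and it assumes that the empty percent_conv slot behaves like a matrix with zero columns, which the thread does not confirm. The names slots, counts, empty and select_columns are hypothetical stand-ins and not part of the grandR API.

```r
# Hypothetical stand-in for a populated data slot: 3 genes x 4 samples
counts <- matrix(1:12, nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("s", 1:4)))

# Hypothetical stand-in for an empty slot (ASSUMPTION: zero sample columns)
empty <- matrix(numeric(0), nrow = 3, ncol = 0)

slots <- list(count = counts, percent_conv = empty)
columns <- c(2, 4)  # sample columns to keep

# Apply the same column selection to every slot, as m[, columns] does in the
# error message; return the error text instead of stopping if a slot fails
select_columns <- function(slot_list, cols) {
  tryCatch(
    lapply(slot_list, function(m) m[, cols, drop = FALSE]),
    error = function(e) conditionMessage(e)
  )
}

# Fails because the empty slot has no columns 2 and 4
with_empty <- select_columns(slots, columns)
print(with_empty)

# Dropping the empty slot first lets the selection go through
slots$percent_conv <- NULL
without_empty <- select_columns(slots, columns)
print(dim(without_empty$count))
```

Under that assumption, a single empty slot is enough to make the whole subsetting step fail, which would be consistent with removing the slot (or reloading without read.percent.conv = TRUE) resolving the error.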

iamnicogomez commented 1 year ago

Ah, never mind you were right! Thank you so much Florian. Nico