ludvigla / semla


Feature request - barcode level metadata during loading #5

Closed williamsdrake closed 10 months ago

williamsdrake commented 1 year ago

Hi,

Really like the package so far. I'm currently analyzing 10x Visium data. It would be nice to have an option to specify a two-column CSV file with the spot barcodes in the first column and the pathology annotation each spot belongs to in the second column (the output of the steps outlined here). The CSV location could be included in the infoTable, since each sample would have its own CSV file. Currently, the ReadVisiumData function only supports sample-level metadata.
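
To illustrate the kind of file described above, here is a minimal sketch in base R. The column names `barcode` and `annotation`, the barcodes, and the labels are placeholders for illustration only; they are not taken from semla or from the data in this thread.

```r
# Hypothetical per-spot annotations: one row per spot barcode.
# Barcodes, labels, and column names are placeholders.
annotations <- data.frame(
  barcode = c("AAACAAGTATCTCCCA-1", "AAACACCAATAACTGC-1", "AAACAGAGCGACTCCT-1"),
  annotation = c("tumor", "stroma", "tumor"),
  stringsAsFactors = FALSE
)

# Write one CSV per sample; its path would then go into a column of the infoTable.
annotation_csv <- file.path(tempdir(), "Path_annotations.csv")
write.csv(annotations, annotation_csv, row.names = FALSE)

# Read the file back to confirm it has the two-column layout.
roundtrip <- read.csv(annotation_csv, stringsAsFactors = FALSE)
```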

ludvigla commented 1 year ago

Hi,

I have included a new feature that can be used to load annotations with ReadVisiumData. You will need to install the dev version of semla to use this feature. An example is provided in the function reference, but I'll include it here too:

samples <-
  Sys.glob(paths = paste0(system.file("extdata", package = "semla"),
                          "/*/filtered_feature_bc_matrix.h5"))
imgs <-
  Sys.glob(paths = paste0(system.file("extdata", package = "semla"),
                          "/*/spatial/tissue_lowres_image.jpg"))
spotfiles <-
  Sys.glob(paths = paste0(system.file("extdata", package = "semla"),
                          "/*/spatial/tissue_positions_list.csv"))
json <-
  Sys.glob(paths = paste0(system.file("extdata", package = "semla"),
                          "/*/spatial/scalefactors_json.json"))
annotation_file <- 
  Sys.glob(paths = paste0(system.file("extdata", package = "semla"),
                          "/*/galt_spots.csv"))
annotation_files <- c(NA_character_, annotation_file)

# Create a tibble/data.frame with file paths
library(tibble)
infoTable <- tibble(samples, imgs, spotfiles, json, sample_id = c("mousebrain", "mousecolon"), annotation_files)

se <- ReadVisiumData(infoTable)

Let me know what you think.

/Ludvig

williamsdrake commented 1 year ago

I've updated to the dev version, but now I'm getting an error message. It occurs whether I include annotation metadata, sample-level metadata, or no metadata at all. Here's the code, the error, and the first row of my table. Previously, the ReadVisiumData function worked with this table (without the last column).

> combined <- ReadVisiumData(infoTable)

── Reading 10x Visium data ──

ℹ Loading matrices:
→ Finished loading expression matrix 1
→ Finished loading expression matrix 2
→ Finished loading expression matrix 3
→ Finished loading expression matrix 4
→ Finished loading expression matrix 5
→ Finished loading expression matrix 6
→ Finished loading expression matrix 7
→ Finished loading expression matrix 8
→ Finished loading expression matrix 9
→ Finished loading expression matrix 10
→ Finished loading expression matrix 11

ℹ Merging matrices:
✔ There are 36601 features and 3161 spots in the merged matrix.
ℹ Loading coordinates:
Error in type.convert.default(data[[i]], as.is = as.is[i], dec = dec, : invalid multibyte string at '<89>PNG

1 | /A1/filtered_feature_bc_matrix.h5 | /A1/spatial/tissue_positions_list.csv | /A1/spatial/tissue_hires_image.png | /A1/spatial/scalefactors_json.json | A1 | A | P | 1 | /A1/Path_annotations.csv

williamsdrake commented 1 year ago

I also found a function on Stack Overflow to check for multibyte characters, but it doesn't seem to find any in my table:

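# Walk through a string one character at a time. The approach relies on
# substr() stopping with an error at an invalid multibyte character; if no
# error occurs, the loop finishes and the function returns NULL.
find_offending_character <- function(x, maxStringLength = 256) {
  for (c in 1:maxStringLength) {
    offendingChar <- substr(x, c, c)
    # If substr() fails here, the next character is the offending multibyte character
  }
}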

lapply(infoTable, find_offending_character)
$samples
NULL

$imgs
NULL

$spotfiles
NULL

$json
NULL

$orig.ident
NULL

$section
NULL

$status
NULL

$run
NULL

$annotation_file
NULL
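
A possible reading of the error, offered as a guess rather than a confirmed diagnosis: `<89>PNG` is the start of the PNG file signature, which would be consistent with a PNG image being parsed as the coordinates CSV. The character check above inspects the path strings stored in the infoTable, not the contents of the files they point to, so it would not detect that. The sketch below tests whether a file begins with the PNG signature. `is_png` and the toy table are hypothetical and not part of semla; with real data the check would be run on `infoTable$spotfiles`.

```r
# Every PNG file starts with these eight bytes (the PNG file signature).
png_signature <- as.raw(c(0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A))

# Hypothetical helper: TRUE if the file at `path` starts with the PNG signature,
# NA if the path is missing or the file does not exist.
is_png <- function(path) {
  if (is.na(path) || !file.exists(path)) return(NA)
  header <- readBin(path, what = "raw", n = length(png_signature))
  identical(header, png_signature)
}

# Toy files standing in for real Space Ranger output.
csv_path <- tempfile(fileext = ".csv")
writeLines("AAACAAGTATCTCCCA-1,1,50,102,7474,8500", csv_path)
png_path <- tempfile(fileext = ".png")
writeBin(png_signature, png_path)

# Toy table in which the image path has ended up in the spotfiles column.
toy_table <- data.frame(imgs = csv_path, spotfiles = png_path,
                        stringsAsFactors = FALSE)

# Any TRUE in the spotfiles column means an image would be read as coordinates.
png_in_spotfiles <- vapply(toy_table$spotfiles, is_png, logical(1), USE.NAMES = FALSE)
png_in_imgs <- vapply(toy_table$imgs, is_png, logical(1), USE.NAMES = FALSE)
```

If the check returns TRUE for any entry in the spotfiles column, comparing the order of the values in each row with the column names (samples, imgs, spotfiles, json) would be a reasonable next step.
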
lfranzen commented 1 year ago

Hi @williamsdrake, apologies for the slow response! Do you still have an issue loading your data and metadata? If so, it would be helpful if you could share more information so that we can take a closer look at your data and infoTable object.

/Lovisa