satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat
Other
2.26k stars 904 forks source link

LoadAkoya function from QuPath table #7792

Open JessicaP94 opened 1 year ago

JessicaP94 commented 1 year ago

Hello, I am trying to import a table from QuPath but keeping receiving the following error: Error in r[i1] - r[-length(r):-(length(r) - lag + 1L)] : non-numeric argument to binary operator My script: codex.obj <- LoadAkoya(filename = "~/JP/230913_for_seurat.csv", type = "qupath")

Can you help me to solve this problem?

Thank you

alikhuseynov commented 1 year ago

Could you please show the whole error output and traceback() What does the ReadAkoya() outputs in this case?

JessicaP94 commented 1 year ago

Could you please show the whole error output and traceback() What does the ReadAkoya() outputs in this case?

codex.obj_JP <- LoadAkoya(filename = "~/Jessica-KP/230913_for_seurat.csv", type = "qupath") Error in r[i1] - r[-length(r):-(length(r) - lag + 1L)] : non-numeric argument to binary operator traceback() 13: diff.default(newX[, i], ...) 12: FUN(newX[, i], ...) 11: apply(X = apply(X = coords, MARGIN = 2L, FUN = range), MARGIN = 2L, FUN = diff) 10: mean(x = apply(X = apply(X = coords, MARGIN = 2L, FUN = range), MARGIN = 2L, FUN = diff)) 9: .AutoRadius(coords = coords) 8: radius %||% .AutoRadius(coords = coords) 7: CreateCentroids.default(coords = coords, nsides = nsides, radius = radius, theta = theta) 6: CreateCentroids(coords = coords, nsides = nsides, radius = radius, theta = theta) 5: CreateFOV.data.frame(coords = data$centroids, type = "centroids", key = "fov", assay = assay) 4: CreateFOV(coords = data$centroids, type = "centroids", key = "fov", assay = assay) 3: withCallingHandlers(expr, warning = function(w) if (inherits(w, classes)) tryInvokeRestart("muffleWarning")) 2: suppressWarnings(expr = CreateFOV(coords = data$centroids, type = "centroids", key = "fov", assay = assay)) 1: LoadAkoya(filename = "~/Jessica-KP/230913_for_seurat.csv", type = "qupath")

I cannot show the ReadAkoya() output because it's too long.

Thank you for your help

alikhuseynov commented 1 year ago

I'm not from the dev team though, it's bit difficult to debug. What versions of Seurat and SeuratObject are you using? Does the ReadAkoya() work without an error? if yes, you can try to make the object manually after data is read, and see where is the error occurs.

data <- ReadAkoya(filename = "~/Jessica-KP/230913_for_seurat.csv",
type = "qupath")
data %>% str
coords <- CreateFOV(coords = data$centroids, 
                                            type = "centroids", key = "fov", assay = assay)
colnames(x = data$metadata) <- make.names(names = colnames(x = data$metadata))
obj <- CreateSeuratObject(counts = data$matrix, assay = assay, 
                          meta.data = data$metadata)
coords <- subset(x = coords, cells = Cells(x = obj))
obj[[fov]] <- coords
for (i in setdiff(x = names(x = data), y = c("matrix", "centroids", 
                                             "metadata"))) {
  obj[[i]] <- CreateAssayObject(counts = data[[i]])
}
JessicaP94 commented 1 year ago

I'm not from the dev team though, it's bit difficult to debug. What versions of Seurat and SeuratObject are you using? Does the ReadAkoya() work without an error? if yes, you can try to make the object manually after data is read, and see where is the error occurs.

data <- ReadAkoya(filename = "~/Jessica-KP/230913_for_seurat.csv",
type = "qupath")
data %>% str
coords <- CreateFOV(coords = data$centroids, 
                                            type = "centroids", key = "fov", assay = assay)
colnames(x = data$metadata) <- make.names(names = colnames(x = data$metadata))
obj <- CreateSeuratObject(counts = data$matrix, assay = assay, 
                          meta.data = data$metadata)
coords <- subset(x = coords, cells = Cells(x = obj))
obj[[fov]] <- coords
for (i in setdiff(x = names(x = data), y = c("matrix", "centroids", 
                                             "metadata"))) {
  obj[[i]] <- CreateAssayObject(counts = data[[i]])
}

Look like there is something wrong with the coordinates: data <- ReadAkoya(filename = "~/Jessica-KP/230913_for_seurat.csv", type = "qupath") data %>% str List of 3 $ matrix : chr [1, 1:355693] "9c768238-9eab-4a0b-b1b2-803c89fb6726;PathAnnotationObject;Polygon;2828.6;1064.1;16.1286;14.9206;0.9104;0.9942;5"| truncated "99344927-b48a-4df2-857d-f7d60e2a9117;PathAnnotationObject;Polygon;2814.6;1065.9;59.6967;30.7903;0.7913;0.9471;1"| truncated "af9f6822-187d-4779-a510-b523a09d5324;PathAnnotationObject;Polygon;2823.5;1066.9;23.0493;18.4;0.8555;0.9962;7.22"| truncated "f543833a-afca-4fff-a3c9-17b8f75e515f;PathAnnotationObject;Polygon;2831.3;1068.4;31.3536;20.9569;0.8971;0.9759;7"| truncated ... ..- attr(*, "dimnames")=List of 2 .. ..$ : chr "Object ID;Parent;ROI;Centroid X um;Centroid Y um;Nucleus" .. ..$ : chr [1:355693] "1" "2" "3" "4" ... $ centroids:'data.frame': 355693 obs. of 3 variables: ..$ x : chr [1:355693] "9c768238-9eab-4a0b-b1b2-803c89fb6726;PathAnnotationObject;Polygon;2828.6;1064.1;16.1286;14.9206;0.9104;0.9942;5"| truncated "99344927-b48a-4df2-857d-f7d60e2a9117;PathAnnotationObject;Polygon;2814.6;1065.9;59.6967;30.7903;0.7913;0.9471;1"| truncated "af9f6822-187d-4779-a510-b523a09d5324;PathAnnotationObject;Polygon;2823.5;1066.9;23.0493;18.4;0.8555;0.9962;7.22"| truncated "f543833a-afca-4fff-a3c9-17b8f75e515f;PathAnnotationObject;Polygon;2831.3;1068.4;31.3536;20.9569;0.8971;0.9759;7"| truncated ... ..$ y : chr [1:355693] "9c768238-9eab-4a0b-b1b2-803c89fb6726;PathAnnotationObject;Polygon;2828.6;1064.1;16.1286;14.9206;0.9104;0.9942;5"| truncated "99344927-b48a-4df2-857d-f7d60e2a9117;PathAnnotationObject;Polygon;2814.6;1065.9;59.6967;30.7903;0.7913;0.9471;1"| truncated "af9f6822-187d-4779-a510-b523a09d5324;PathAnnotationObject;Polygon;2823.5;1066.9;23.0493;18.4;0.8555;0.9962;7.22"| truncated "f543833a-afca-4fff-a3c9-17b8f75e515f;PathAnnotationObject;Polygon;2831.3;1068.4;31.3536;20.9569;0.8971;0.9759;7"| truncated ... ..$ cell: chr [1:355693] "1" "2" "3" "4" ... $ metadata :'data.frame': 355693 obs. of 0 variables coords <- CreateFOV(coords = data$centroids, type = "centroids", key = "fov", assay = assay) Error in r[i1] - r[-length(r):-(length(r) - lag + 1L)] : non-numeric argument to binary operator colnames(x = data$metadata) <- make.names(names = colnames(x = data$metadata)) obj <- CreateSeuratObject(counts = data$matrix, assay = assay, meta.data = data$metadata) Error in as.vector(x, "character") : cannot coerce type 'closure' to vector of type 'character' In addition: Warning message: In matrix(data = as.numeric(x = x), ncol = nc) : NAs introduced by coercion

alikhuseynov commented 1 year ago

Probably it has something to do with your input file 230913_for_seurat.csv. Did you specify assay variable, like assay = "qupath.test" or some relevant name? I never worked with QuPath, not sure if the data$centroids are in correct format. Also data$metadata is empty or has no columns? Try to create obj first without coords and see if it works: obj <- CreateSeuratObject(counts = data$matrix, assay = "qupath.test", meta.data = data$metadata)

JessicaP94 commented 1 year ago

I did specify assay like assay=qupath.test". However, data$metada gives data frame with 0 columns and 355693 rows. I checked the data and it takes the Object.ID from QuPath table as X and Y coordinates.

alikhuseynov commented 1 year ago

data$metada with 0 cols seems weird. Also data$matrix looks strange, it should be a sparse matrix with expression data I think this kind of coordinate for a single cell centroid "9c768238-9eab-4a0b-b1b2-803c89fb6726;PathAnnotationObject;Polygon;2828.6;1064.1;16.1286;14.9206;0.9104;0.9942;5" is not correct. But again, I never dealt with QuPath data. If the coords in the data$centroids are correct then you should be able to create the FOV. Try to make centroids only, if that doesn't work, then coords are wrong as well. cents <- CreateCentroids(data$centroids) My suggestion is to go back and check on QuPath side how the "230913_for_seurat.csv" was created.

dagiles2 commented 4 months ago

I encountered a similar problem with the LoadAkoya function as it relates to QuPath and want to share my experience/solutions. I welcome any better ways to approach this if others have ideas (or know of an update that fixed this that I am missing). I am surprised more people are not encountering/commenting on this issue.

1) The LoadAkoya function uses ReadAkoya to input data, but there is an error in ReadAkoya where it misinterprets the QuPath output. ReadAkoya is expecting columns with "Cell: Mean ", where is the antigen/fluorophore. QuPath actually outputs a files with columns labelled "Cell: *** mean". I changed the code for the ReadAkoya function to look for these names instead. My solution may not be the most elegent, but it works. I also noted that it drops the first antigen/fluorophore (in this case DAPI), but I don't need to use this data. For those interested in DAPI, the relevant data is still contained in the nucleus metadata ("Nucleus..DAPI.mean").

2) ReadAkoya also renames the columns without any reference to the fluorophore. For example, "Cell: CD45 mean" may become "Cell.5". To overcome this, I initially just skipped the renaming step, but this caused problems with later use of ImageDimPlot which cannot handle names with atypical characters such as semicolons, forward slash, or spaces. As an alternative, I replaced these with periods which it can handle ("Cell..CD45.mean"). This matches the naming scheme the function already uses for the metadata. Of note, this is also important if your antigens contain these characters.

3) The last thing to note is the ImageDimPlot output is rotated 90 degrees counterclockwise from the original QuPath input. I can't figure out why, and it doesn't impact the overall interpretation. Just something to note.

**I should also note that I only made changes the qupath options, not anything to inform or processor if that is of interest to you.

Here are the updated functions I am using:

LoadAkoyaUpdate

LoadAkoyaUpdate <- function (filename, type = c("inform", "processor", "qupath"), 
          fov, assay = "Akoya", ...) 
{
  data <- ReadAkoyaUpdate(filename = filename, type = type)
  coords <- suppressWarnings(expr = CreateFOV(coords = data$centroids, 
                                              type = "centroids", key = "fov", assay = assay))
  colnames(x = data$metadata) <- suppressWarnings(expr = make.names(names = colnames(x = data$metadata)))
  obj <- CreateSeuratObject(counts = data$matrix, assay = assay, 
                            meta.data = data$metadata)
  coords <- subset(x = coords, cells = Cells(x = obj))
  suppressWarnings(expr = obj[[fov]] <- coords)
  for (i in setdiff(x = names(x = data), y = c("matrix", "centroids", 
                                               "metadata"))) {
    suppressWarnings(expr = obj[[i]] <- CreateAssayObject(counts = data[[i]]))
  }
  return(obj)
}

ReadAkoyaUpdate

ReadAkoyaUpdate <- function (filename, type = c("inform", "processor", "qupath"), 
          filter = "DAPI|Blank|Empty", inform.quant = c("mean", "total", 
                                                        "min", "max", "std")) 
{
  if (!requireNamespace("data.table", quietly = TRUE)) {
    stop("Please install 'data.table' for this function")
  }
  if (!file.exists(filename)) {
    stop(paste("Can't file file:", filename))
  }
  type <- tolower(x = type[1L])
  type <- match.arg(arg = type)
  ratio <- getOption(x = "Seurat.input.sparse_ratio", default = 0.4)
  p <- progressr::progressor()
  p(message = "Preloading Akoya matrix", class = "sticky", 
    amount = 0)
  sep <- switch(EXPR = type, inform = "\t", ",")
  mtx <- data.table::fread(file = filename, sep = sep, data.table = FALSE, 
                           verbose = FALSE)
  p(message = paste0("Parsing matrix in '", type, "' format"), 
    class = "sticky", amount = 0)
  outs <- switch(EXPR = type, processor = {
    p(message = "Creating centroids coordinates", class = "sticky", 
      amount = 0)
    centroids <- data.frame(x = mtx[["x:x"]], y = mtx[["y:y"]], 
                            cell = as.character(x = mtx[["cell_id:cell_id"]]), 
                            stringsAsFactors = FALSE)
    rownames(x = mtx) <- as.character(x = mtx[["cell_id:cell_id"]])
    p(message = "Creating meta data", class = "sticky", 
      amount = 0)
    md <- mtx[, !grepl(pattern = "^cyc", x = colnames(x = mtx)), 
              drop = FALSE]
    colnames(x = md) <- vapply(X = strsplit(x = colnames(x = md), 
                                            split = ":"), FUN = "[[", FUN.VALUE = character(length = 1L), 
                               2L)
    p(message = "Creating expression matrix", class = "sticky", 
      amount = 0)
    mtx <- mtx[, grepl(pattern = "^cyc", x = colnames(x = mtx)), 
               drop = FALSE]
    colnames(x = mtx) <- vapply(X = strsplit(x = colnames(x = mtx), 
                                             split = ":"), FUN = "[[", FUN.VALUE = character(length = 1L), 
                                2L)
    if (!is.na(x = filter)) {
      p(message = paste0("Filtering features with pattern '", 
                         filter, "'"), class = "sticky", amount = 0)
      mtx <- mtx[, !grepl(pattern = filter, x = colnames(x = mtx)), 
                 drop = FALSE]
    }
    mtx <- t(x = mtx)
    if ((sum(mtx == 0)/length(x = mtx)) > ratio) {
      p(message = "Converting expression to sparse matrix", 
        class = "sticky", amount = 0)
      mtx <- as.sparse(x = mtx)
    }
    list(matrix = mtx, centroids = centroids, metadata = md)
  }, inform = {
    inform.quant <- tolower(x = inform.quant[1L])
    inform.quant <- match.arg(arg = inform.quant)
    expr.key <- c(mean = "Mean", total = "Total", min = "Min", 
                  max = "Max", std = "Std Dev")[inform.quant]
    expr.pattern <- "\\(Normalized Counts, Total Weighting\\)"
    rownames(x = mtx) <- mtx[["Cell ID"]]
    mtx <- mtx[, setdiff(x = colnames(x = mtx), y = "Cell ID"), 
               drop = FALSE]
    p(message = "Creating centroids coordinates", class = "sticky", 
      amount = 0)
    centroids <- data.frame(x = mtx[["Cell X Position"]], 
                            y = mtx[["Cell Y Position"]], cell = rownames(x = mtx), 
                            stringsAsFactors = FALSE)
    p(message = "Creating meta data", class = "sticky", 
      amount = 0)
    cols <- setdiff(x = grep(pattern = expr.pattern, x = colnames(x = mtx), 
                             value = TRUE, invert = TRUE), y = paste("Cell", 
                                                                     c("X", "Y"), "Position"))
    md <- mtx[, cols, drop = FALSE]
    exprs <- data.frame(cols = grep(pattern = paste(expr.key, 
                                                    expr.pattern), x = colnames(x = mtx), value = TRUE))
    exprs$feature <- vapply(X = trimws(x = gsub(pattern = paste(expr.key, 
                                                                expr.pattern), replacement = "", x = exprs$cols)), 
                            FUN = function(x) {
                              x <- unlist(x = strsplit(x = x, split = " "))
                              x <- x[length(x = x)]
                              return(gsub(pattern = "\\(|\\)", replacement = "", 
                                          x = x))
                            }, FUN.VALUE = character(length = 1L))
    exprs$class <- tolower(x = vapply(X = strsplit(x = exprs$cols, 
                                                   split = " "), FUN = "[[", FUN.VALUE = character(length = 1L), 
                                      1L))
    classes <- unique(x = exprs$class)
    outs <- vector(mode = "list", length = length(x = classes) + 
                     2L)
    names(x = outs) <- c("matrix", "centroids", "metadata", 
                         setdiff(x = classes, y = "entire"))
    outs$centroids <- centroids
    outs$metadata <- md
    for (i in classes) {
      p(message = paste("Creating", switch(EXPR = i, entire = "entire cell", 
                                           i), "expression matrix"), class = "sticky", 
        amount = 0)
      df <- exprs[exprs$class == i, , drop = FALSE]
      expr <- mtx[, df$cols]
      colnames(x = expr) <- df$feature
      if (!is.na(x = filter)) {
        p(message = paste0("Filtering features with pattern '", 
                           filter, "'"), class = "sticky", amount = 0)
        expr <- expr[, !grepl(pattern = filter, x = colnames(x = expr)), 
                     drop = FALSE]
      }
      expr <- t(x = expr)
      if ((sum(expr == 0, na.rm = TRUE)/length(x = expr)) > 
          ratio) {
        p(message = paste("Converting", switch(EXPR = i, 
                                               entire = "entire cell", i), "expression to sparse matrix"), 
          class = "sticky", amount = 0)
        expr <- as.sparse(x = expr)
      }
      outs[[switch(EXPR = i, entire = "matrix", i)]] <- expr
    }
    outs
  }, qupath = {
    rownames(x = mtx) <- as.character(x = seq_len(length.out = nrow(x = mtx)))
    p(message = "Creating centroids coordinates", class = "sticky", 
      amount = 0)
    xpos <- sort(x = grep(pattern = "Centroid X", x = colnames(x = mtx), 
                          value = TRUE), decreasing = TRUE)[1L]
    ypos <- sort(x = grep(pattern = "Centroid Y", x = colnames(x = mtx), 
                          value = TRUE), decreasing = TRUE)[1L]
    centroids <- data.frame(x = mtx[[xpos]], y = mtx[[ypos]], 
                            cell = rownames(x = mtx), stringsAsFactors = FALSE)
    p(message = "Creating meta data", class = "sticky", 
      amount = 0)
    cols <- setdiff(x = grep(pattern = "Cell:.*mean", x = colnames(x = mtx),
                             ignore.case = TRUE, value = TRUE, invert = TRUE),
                    y = c(xpos, ypos))
    md <- mtx[, cols, drop = FALSE]
    p(message = "Creating expression matrix", class = "sticky", 
      amount = 0)
    idx <- which(x = grepl(pattern = "Cell:.*mean", x = colnames(x = mtx), 
                           ignore.case = TRUE))
    mtx <- mtx[, idx, drop = FALSE]
    colnames(x = mtx) <- gsub(":| |/", ".", colnames(x = mtx))
    if (!is.na(x = filter)) {
      p(message = paste0("Filtering features with pattern '", 
                         filter, "'"), class = "sticky", amount = 0)
      mtx <- mtx[, !grepl(pattern = filter, x = colnames(x = mtx)), 
                 drop = FALSE]
    }
    mtx <- t(x = mtx)
    if ((sum(mtx == 0)/length(x = mtx)) > ratio) {
      p(message = "Converting expression to sparse matrix", 
        class = "sticky", amount = 0)
      mtx <- as.sparse(x = mtx)
    }
    list(matrix = mtx, centroids = centroids, metadata = md)
  }, stop("Unknown matrix type: ", type))
  return(outs)
}
denvercal1234GitHub commented 2 months ago

@dagiles2 - Is it possible you could share the csv exported from QuPath that work with your script above? I was hoping to manually modify the Qupath csv column names accordingly. Thank you!!

Screenshot 2024-07-10 at 17 44 46
dagiles2 commented 2 months ago

Here is an example of the QuPath csv output you can use to test. Test.csv

Two things to note: