SydneyBioX / spicyR

https://sydneybiox.github.io/spicyR/

Using spicy() on data.frames #40

Closed sarahsamorodnitsky closed 3 months ago

sarahsamorodnitsky commented 7 months ago

Hi there,

I am trying to implement spicy() on some simulated data. The documentation suggests spicy() can be applied to a data.frame. However, when I try to run the function on a data.frame I get the following error:

Error in spicy(images, subject = "PID", condition = "out", from = "a", : cells needs to be a SegmentedCells object

I am also having trouble converting my data.frame (called images) to a SegmentedCells object. It seems like there is a problem with the column naming. Would it be possible to provide an example of converting a data.frame to a SegmentedCells object to prepare it for running spicy()?

I've copied a minimal reproducible example below. Thank you for your help!

All the best,

Sarah

library(spicyR)

# Empty data.frame that defines the column layout
images <- data.frame(PID = numeric(),        # subject-level ID
                     image.id = character(), # image-level ID in subject
                     cell.id = numeric(),    # cell ID within image
                     x = numeric(),          # x coordinates for cell
                     y = numeric(),          # y coordinates for cell
                     type = character(),     # cell type
                     out = numeric())        # binary sample-level outcome

for (i in 1:100) {

      # Simulate number of cells
      n.cells <- sample(50:100, 1)

      # Simulate locations
      x <- runif(n.cells, 0, 1000)
      y <- runif(n.cells, 0, 1000)

      # Simulate sample-level outcome
      out <- rbinom(1, 1, 0.5)

      # Combine into data.frame
      image.i <- data.frame(PID = i,
                            image.id = paste0("sample.", i, ".image.1"),
                            cell.id = 1:n.cells, x = x, y = y,
                            type = "a", # all cells have the same type
                            out = out)

      # Add to full data.frame
      images <- rbind.data.frame(images, image.i)
}

# Convert to a SegmentedCells object
images.sc <- SegmentedCells(
  cellData = images,
  spatialCoords = c("x", "y"),
  cellTypeString = "type",
  phenotypeString = "out",
  cellIDString = "cell.id",
  imageCellIDString = "cell.id",
  imageIDString = "image.id"
)

# Run spicy
spicy(
  images.sc,
  subject = "PID",
  condition = "out",
  from = "a",
  to = "a",
  imageID = "image.id"
)

edridgedsouza commented 3 months ago

Seconding this. I would wager that most people will use the data.frame version of the function rather than the SegmentedCells version by default, so a guide on how to safely convert a data.frame into the required SegmentedCells object would be of great value. From digging around, some errors appear to stem from differences between the Bioconductor/conda and development versions in whether data.frames are fully allowed as inputs. @sarahsamorodnitsky's reprex, followed by manually adding pheno metadata using this guide, was what allowed me to create a valid object that could run spicy() without immediate errors. However, I still have some NA values in the test output that I'm trying to troubleshoot, and it's unclear whether the issue is related to the input format or to a more fundamental problem with the data and/or model.
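
One way to narrow down whether NA values come from the input format is to sanity-check the data.frame before handing it to any conversion step. The following is a minimal base-R sketch, assuming the column names from the reprex above (PID, image.id, x, y, type, out). The helper check_cells is hypothetical and is not part of spicyR.

```r
# Hypothetical helper (not part of spicyR): basic sanity checks on a
# cell-level data.frame laid out like the reprex above.
check_cells <- function(cells,
                        required = c("PID", "image.id", "x", "y", "type", "out")) {
  missing_cols <- setdiff(required, names(cells))
  if (length(missing_cols) > 0) {
    stop("missing columns: ", paste(missing_cols, collapse = ", "))
  }
  # Each image should map to exactly one subject and one outcome value,
  # so the image ID must be unique among (image, subject, outcome) rows.
  per_image <- unique(cells[, c("image.id", "PID", "out")])
  list(
    n_images = length(unique(cells$image.id)),
    n_subjects = length(unique(cells$PID)),
    any_na = any(is.na(cells[, required])),
    consistent_labels = !any(duplicated(per_image$image.id)),
    cells_per_type = table(cells$type)
  )
}

# Tiny example: two images, one per subject, a single cell type.
toy <- data.frame(
  PID = c(1, 1, 2, 2),
  image.id = c("sample.1.image.1", "sample.1.image.1",
               "sample.2.image.1", "sample.2.image.1"),
  x = c(10, 20, 30, 40),
  y = c(5, 15, 25, 35),
  type = "a",
  out = c(0, 0, 1, 1)
)
checks <- check_cells(toy)
```

If these checks pass and NA values still appear, the input layout is less likely to be the cause, though this sketch does not test anything specific to the spicyR model itself.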

nick-robo commented 3 months ago

SegmentedCells has been deprecated in the latest version of spicyR; spicy() now accepts SingleCellExperiment objects and data.frames.
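
With a version that accepts data.frames, the conversion step can presumably be dropped. The sketch below rebuilds the simulated data from the reprex and passes it to spicy() directly. It assumes the arguments used in the reprex (subject, condition, from, to, imageID) carry over unchanged to data.frame input; how the cell-type and coordinate columns are identified is not stated in this thread, so check ?spicy for the installed version. The helper simulate_image is hypothetical.

```r
set.seed(1)  # make the simulation reproducible

# Hypothetical helper: simulate one image for subject i, as in the reprex,
# but without growing a data.frame inside a loop.
simulate_image <- function(i) {
  n.cells <- sample(50:100, 1)
  data.frame(
    PID = i,
    image.id = paste0("sample.", i, ".image.1"),
    cell.id = seq_len(n.cells),
    x = runif(n.cells, 0, 1000),
    y = runif(n.cells, 0, 1000),
    type = "a",                # all cells have the same type
    out = rbinom(1, 1, 0.5)    # binary sample-level outcome
  )
}
images <- do.call(rbind, lapply(1:100, simulate_image))

# Assumed call: argument names are copied from the reprex and may differ
# between spicyR versions. Any error is captured as a message so the
# script keeps running.
if (requireNamespace("spicyR", quietly = TRUE)) {
  result <- tryCatch(
    spicyR::spicy(images, subject = "PID", condition = "out",
                  from = "a", to = "a", imageID = "image.id"),
    error = function(e) conditionMessage(e)
  )
}
```

Building the per-image data.frames with lapply() and binding them once avoids repeated copying from rbind inside a loop, which matters little at 100 images but scales better.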