markrobinsonuzh / cytofWorkflow

Issue in the initial steps of data transformation and exploration #17

Closed sudu87 closed 4 years ago

sudu87 commented 4 years ago

Hello, I am running into this problem right at the beginning. I am not an experienced coder, but I know how to work in R. I am unable to pinpoint the issue. I am not sure what is different, as I am following exactly what is described in the workflow. Any help is greatly appreciated.

Thank you so much,

CyTOF analysis

Sudip Das 6/3/2020

Experiment information:

Sample information:

Instrument and QC:

Debarcoding -

Technical validation - OK

7 FCS files were sent along with the report by SDA (FCCF) on 5/06/20. QC was performed by JD (bioinformatician, FCCF) on 4/06/20.

Version info:

Rstudio Version 1.2.5033

R.version
  ## platform       x86_64-apple-darwin15.6.0   
  ## arch           x86_64                      
  ## os             darwin15.6.0                
  ## system         x86_64, darwin15.6.0        
  ## version.string R version 3.6.3 (2020-02-29)
  ## nickname       Holding the Windsock        

Setting up directory

setwd("~/Documents/Research/Results/Cell_culture_and_macrophages/CyTOF")

Installing and loading packages.

All packages were reinstalled at their newest versions

library(devtools)
## Loading required package: usethis
library(flowCore)
library(cytofCore)
library(readxl)
library(ggplot2)
library(reshape2)
library(matrixStats)
library(limma)
library(dplyr)
## 
## Attaching package: 'dplyr'

## The following object is masked from 'package:matrixStats':
## 
##     count

## The following object is masked from 'package:flowCore':
## 
##     filter

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggrepel)
library(RColorBrewer)
library(pheatmap)

Saving data and loading

save.image("cytof_exp1.RData")

Data import method Robinson lab

Reading metadata file

md <- read_excel("~/Documents/Research/Results/Cell_culture_and_macrophages/CyTOF/macros_metadata.xlsx")

Make sure condition variables are factors with the right levels

md$condition <- factor(md$condition, levels = c("undiff", "dmso","diff","lps","pgn","lps + pgn","gcm"))
data.frame(md)
##            file_name sample_id condition patient_id
## 1          monos.fcs     SDCY1    undiff       exp1
## 2     monos_dmso.fcs     SDCY2      dmso       exp1
## 3         macros.fcs     SDCY3      diff       exp1
## 4     macros_lps.fcs     SDCY4       lps       exp1
## 5     macros_pgn.fcs     SDCY5       pgn       exp1
## 6 macros_lps_pgn.fcs     SDCY6 lps + pgn       exp1
## 7     macros_gcm.fcs     SDCY7       gcm       exp1
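
One thing worth checking at this step: `factor()` silently turns any label that is not listed in `levels` into `NA`, so a typo in the metadata sheet can go unnoticed. A minimal base R sketch of such a check follows; the mistyped label `"LPS"` is made up for illustration and does not come from the actual metadata.

```r
# Condition labels as they might appear in a metadata sheet.
# "LPS" (upper case) is a deliberate, hypothetical typo.
condition <- c("undiff", "dmso", "diff", "LPS")
lv <- c("undiff", "dmso", "diff", "lps", "pgn", "lps + pgn", "gcm")

f <- factor(condition, levels = lv)

# Labels that match no level become NA without any warning,
# so list them explicitly before going further.
unmatched <- condition[is.na(f)]
print(unmatched)
```

If `unmatched` is empty, every condition label matched one of the declared levels.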

Define colors for conditions

color_conditions <- c("#DC143C", "#228B22","#008B8B","#0000CD","#800080","#D2691E","#778899")
names(color_conditions) <- levels(md$condition)

Reading FCS filename

fcs_raw <- read.flowSet(md$file_name, path="~/Documents/Research/Results/Cell_culture_and_macrophages/CyTOF/20200603_Sudip_Human_Microbiota",transformation = FALSE, truncate_max_range = FALSE)
fcs_raw
## A flowSet with 7 experiments.
## 
##   column names:
##   Au197Di BCKG190Di Ba138Di Bi209Di Ce140Di Center Cs133Di Dy161Di Dy162Di Dy163Di Dy164Di Er166Di Er167Di Er168Di Er170Di Eu151Di Eu153Di Event_length Gd155Di Gd156Di Gd157Di Gd158Di Gd160Di Ho165Di I127Di In113Di In115Di Ir191Di Ir193Di La139Di Lu175Di Nd142Di Nd143Di Nd144Di Nd145Di Nd146Di Nd148Di Nd150Di Offset Pb208Di Pd102Di Pd104Di Pd105Di Pd106Di Pd108Di Pd110Di Pr141Di Pt194Di Pt195Di Pt198Di Residual Rh103Di Sm147Di Sm149Di Sm152Di Sm154Di Sn120Di Tb159Di Tm169Di Xe131Di Y89Di Yb171Di Yb172Di Yb173Di Yb174Di Yb176Di Time

Antigen information

panel <- read_excel("~/Documents/Research/Results/Cell_culture_and_macrophages/CyTOF/antigen_panel.xlsx")
data.frame(panel)
##    Metal Isotope  Antigen Lineage Functional
## 1     Dy     164 Siglec-8       0          1
## 2     Er     168    CD206       0          1
## 3     Er     170    CD169       0          1
## 4     Eu     151    CD11b       1          0
## 5     Gd     156   HLA-DR       0          1
## 6     Gd     160     CD14       1          0
## 7     Ho     165     CD64       0          1
## 8     Nd     142    CD11c       0          1
## 9     Nd     143     CD68       0          1
## 10    Nd     145     CD71       0          1
## 11    Nd     146    F4-80       0          1
## 12    Sm     154    PD-L1       0          1
## 13    Tm     169    CD163       0          1
## 14    Yb     171     CD86       0          1
## 15    Yb     173     CD81       0          1
## 16    Yb     174     CD88       0          1

Parameters

panel_fcs <- pData(parameters(fcs_raw[[1]]))
#panel_fcs<-na.omit(panel_fcs)
head(panel_fcs)
##          name    desc range minRange maxRange
## $P1   Au197Di   197Au  4096        0     4095
## $P2 BCKG190Di 190BCKG  4096        0     4095
## $P3   Ba138Di   138Ba  8192        0     8191
## $P4   Bi209Di   209Bi  4096        0     4095
## $P5   Ce140Di   140Ce 12000        0    11999
## $P6    Center    <NA> 12000        0    11999

Data transformation Robinson lab

Lineage markers

(lineage_markers <- panel$Antigen[panel$Lineage == 1])
## [1] "CD11b" "CD14"

Functional markers

(functional_markers <- panel$Antigen[panel$Functional == 1])
##  [1] "Siglec-8" "CD206"    "CD169"    "HLA-DR"   "CD64"     "CD11c"   
##  [7] "CD68"     "CD71"     "F4-80"    "PD-L1"    "CD163"    "CD86"    
## [13] "CD81"     "CD88"

arcsinh transformation and column subsetting

fcs <- fsApply(fcs_raw, function(x, cofactor = 5){
  colnames(x) <- panel_fcs$desc
  expr <- exprs(x)
  expr <- asinh(expr[, c(lineage_markers, functional_markers)] / cofactor)
  exprs(x) <- expr
  x
})
fcs
## Error in expr[, c(lineage_markers, functional_markers)] : subscript out of bounds
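
A likely cause of this error is a naming mismatch: `colnames(x) <- panel_fcs$desc` renames the channels to isotope-style labels such as `197Au` (see `head(panel_fcs)` above), whereas `lineage_markers` and `functional_markers` hold antigen names such as `CD11b`, which are then not valid column names. The base R sketch below reproduces the mismatch on a toy matrix and shows one possible way to map antigens back to channels. The matrix values are made up, and it is an assumption that `desc` follows the `<Isotope><Metal>` pattern for every panel channel (for example `160Gd` for CD14).

```r
# Toy panel: three rows copied from the antigen panel shown above.
panel <- data.frame(
  Metal   = c("Eu", "Gd", "Ho"),
  Isotope = c(151, 160, 165),
  Antigen = c("CD11b", "CD14", "CD64"),
  stringsAsFactors = FALSE
)

# Toy expression matrix whose columns carry desc-style labels
# (ASSUMPTION: desc is "<Isotope><Metal>", as in "197Au" above).
expr <- matrix(c(0, 5, 50, 1, 10, 100), nrow = 2, byrow = TRUE,
               dimnames = list(NULL, c("151Eu", "160Gd", "165Ho")))

# Selecting by antigen name fails, as in the workflow code.
res <- tryCatch(expr[, panel$Antigen], error = function(e) conditionMessage(e))
print(res)

# Diagnose: which requested names are missing from the column names?
missing_cols <- setdiff(panel$Antigen, colnames(expr))
print(missing_cols)

# One possible fix: look up each antigen's desc label, subset, then rename.
desc_label <- paste0(panel$Isotope, panel$Metal)
idx <- match(desc_label, colnames(expr))
stopifnot(!anyNA(idx))
expr_sel <- expr[, idx, drop = FALSE]
colnames(expr_sel) <- panel$Antigen

# arcsinh transformation with the same cofactor as above.
cofactor <- 5
expr_t <- asinh(expr_sel / cofactor)
print(colnames(expr_t))
```

Running `setdiff(c(lineage_markers, functional_markers), panel_fcs$desc)` on the real objects would show directly which names cannot be found.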

Diagnostic plots Robinson lab

Generate sample IDs corresponding to each cell in the expr matrix

sample_ids <- rep(md$sample_id, fsApply(fcs_raw, nrow))
head(sample_ids)
## [1] "SDCY1" "SDCY1" "SDCY1" "SDCY1" "SDCY1" "SDCY1"
tail(sample_ids)
## [1] "SDCY7" "SDCY7" "SDCY7" "SDCY7" "SDCY7" "SDCY7"
ggdf <- data.frame(sample_id = sample_ids, expr)
ggdf <- melt(ggdf, id.var = "sample_id", 
  value.name = "expression", variable.name = "antigen")
mm <- match(ggdf$sample_id, md$sample_id)
ggdf$condition <- md$condition[mm]

ggplot(ggdf, aes(x = expression, color = condition, 
  group = sample_id)) +
  geom_density() +
  facet_wrap(~ antigen, nrow = 4, scales = "free") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1), 
    strip.text = element_text(size = 7), axis.text = element_text(size = 5)) +
  scale_color_manual(values = color_conditions)
   ## Error in data.frame(sample_id = sample_ids, expr) : arguments imply differing number of rows: 578655, 0

Extract expression

expr <- fsApply(fcs, exprs)
dim(expr)
   ## Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘fsApply’ for signature ‘"flowFrame"
rng <- colQuantiles(expr, probs = c(0.01, 0.99))
expr01 <- t((t(expr) - rng[, 1]) / (rng[, 2] - rng[, 1]))
expr01[expr01 < 0] <- 0
expr01[expr01 > 1] <- 1
     ## Error: Argument 'x' is of class ‘function’, but should be a matrix. The use of a ‘function’ is not supported, the correctness of the result is not guaranteed. Please update your code accordingly.
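
For reference, this last chunk rescales each marker to the range [0, 1], using its 1st and 99th percentiles as the lower and upper bounds and clipping anything outside them. It needs `expr` to be a numeric matrix; the error message reports a function instead, which suggests that the `expr <- fsApply(fcs, exprs)` assignment never succeeded. A minimal base R sketch on made-up data follows. The matrix values and the two marker columns are hypothetical, and `apply()` with `quantile()` stands in for `matrixStats::colQuantiles()`.

```r
set.seed(1)
# Hypothetical expression matrix: 100 cells x 2 markers.
expr <- cbind(CD14 = rexp(100), CD11b = rexp(100, rate = 0.5))
stopifnot(is.matrix(expr), is.numeric(expr))

# Per-marker 1st and 99th percentiles (rows = markers, columns = 1%, 99%).
rng <- t(apply(expr, 2, quantile, probs = c(0.01, 0.99)))

# Shift and scale every column, then clip to [0, 1].
expr01 <- t((t(expr) - rng[, 1]) / (rng[, 2] - rng[, 1]))
expr01[expr01 < 0] <- 0
expr01[expr01 > 1] <- 1

print(range(expr01))
```

Checking `is.matrix(expr)` before the quantile step catches the case where `expr` was never created.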

HelenaLC commented 4 years ago

What version of the workflow are you following? It seems to me this code is at least 3 years old. All of the above should come down to <10 lines with the current version.

sudu87 commented 4 years ago

I am following this link: http://129.217.206.11/packages/3.7/workflows/vignettes/cytofWorkflow/inst/doc/cytofWorkflow.html

HelenaLC commented 4 years ago

Aha, may I suggest either this (short version): https://bioconductor.org/packages/release/bioc/vignettes/CATALYST/inst/doc/differential.html or this (long version): https://f1000research.com/articles/6-748/v4 The former should be much easier if you are not too familiar with R. The latter will give more theoretical background on each step.

sudu87 commented 4 years ago

Great, thanks a lot! I will give this a go and get back to you if there are any further issues.

HelenaLC commented 4 years ago

Also, I just noticed you are running R 3.6. I highly recommend updating to 4.0+ to take advantage of the new features and to make sure you're not writing an analysis that is already outdated.

markrobinsonuzh commented 4 years ago

Agree with @HelenaLC: these code snippets are very old. I would follow the workflow from BioC 3.11 (the current Bioconductor release, April-October 2020):

https://bioconductor.org/packages/3.11/workflows/vignettes/cytofWorkflow/inst/doc/cytofWorkflow.html

(actually, the v4 link above is even a bit outdated)

Note: this will require you to upgrade to R 4.0 as well.
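
A quick way to check whether an installation meets that requirement, as a minimal base R sketch (the helper name `meets_r_requirement` is made up for illustration):

```r
# Hypothetical helper: TRUE if the running R is at least `required`.
meets_r_requirement <- function(required = "4.0.0", current = getRversion()) {
  current >= package_version(required)
}

# Show the running version and whether it satisfies the default requirement.
print(getRversion())
print(meets_r_requirement())
```

With the R 3.6.3 installation reported above, the check would return `FALSE`.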

sudu87 commented 4 years ago

Thank you for your help! I will follow up on all of this.