kevinblighe / scDataviz

scDataviz: single cell dataviz and downstream analyses

I need help with an early error #20

Closed beginner984 closed 3 years ago

beginner984 commented 3 years ago

Hello and thanks in advance

I have three .fcs files and a marker list for CyTOF.

I am getting an error in this part of your tutorial:

> p <- pca(assay(sce, 'scaled'), metadata = metadata(sce))
Error in svd(x, nu = nu, nv = nv) : infinite or missing values in 'x'

I have attached my full R object here

https://www.dropbox.com/s/jq7ubwkiyn17dbn/My_data.RData?dl=0

Actually, I don't know what the inclusions here should be:

inclusions <- c('Yb171Di','Nd144Di','Nd145Di',
    'Er168Di','Tm169Di','Sm154Di','Yb173Di','Yb174Di',
    'Lu175Di','Nd143Di')

  markernames <- c("CD45 ",   "MHC II ",  "CD11b ",   "Ly6C " ,   "Ly6G " ,   "F4/80 ",   "CD11c "  , "CD38 ",    "Arg-1 " ,  "SiglecF " , "CD206 ",   "CD62L " ,  "CD103 ",   "iNOS " , "PD-L1 ",   "TNFa " ,   "CD64 "  ,  "TCRgd " ,  "Foxp3 " ,  "RORgt "   ,"CD8α "   , "Tbet "  ,  "CD25 " ,   "IFN-γ "  , "CD44 " ,   "CD86 "  ,  "CD80 " ,   "PD-1 " ,  "B220 "  ,  "NK1.1 "  , "CD19 " ,  "CD4 "    , "TCR β "  
  )

  names(markernames) <- inclusions
  markernames

  exclusions <- c('Time','Event_length','BCKG190Di',
    'Center','Offset','Width','Residual')

Anyway, for markernames I have provided my own markers, but for inclusions I have used the values from your tutorial (I really don't know what should go here in my case). I think I have created the sce object.

Could you please have a look to see what is going wrong here?

kevinblighe commented 3 years ago

Hi, the error from PCA means that the matrix passed to pca() contains infinite or missing values. This can happen when some cells have zero (constant) variance. Have you used downsampleVar? Can you show your processFCS() command?
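
As a rough illustration of that failure mode, here is a toy base-R sketch. It is not scDataviz's internal code: the matrix `x` and the row-wise standardisation are assumptions made for the example. A feature with zero variance turns into NaN when standardised, and svd() then stops with the same message:

```r
# Toy matrix (hypothetical values): features in rows, cells in columns.
x <- rbind(a = c(1, 2, 3, 4),
           b = c(5, 5, 5, 5),   # constant row, so its standard deviation is 0
           c = c(2, 4, 1, 3))

# Standardise each row; scale() works on columns, hence the transposes.
scaled <- t(scale(t(x)))

# Row 'b' is now 0/0 = NaN in every column.
print(scaled["b", ])

# svd() refuses non-finite input; capture the message instead of stopping.
res <- tryCatch(svd(scaled), error = function(e) conditionMessage(e))
print(res)
```

If something similar is happening in real data, the non-finite values should be visible directly in assay(sce, 'scaled').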

Regarding the other issue, one should usually know the marker names that are being used in the experiment. If you are unaware of these markers, then please try this code on one sample to see what the markers are:

  library(flowCore)
  pData(parameters(
    read.FCS(filelist[[1]], transformation = FALSE, emptyValue = FALSE)))

Then, you will know which to include and exclude.

beginner984 commented 3 years ago

Thank you so much. I got this:

> processFCS(filelist)
--harmonising markers across samples
--filtering background / noise
--transforming data
--removing the lower 10% of cells based on variance
--downsampling to 1e+05 variables.
class: SingleCellExperiment 
dim: 66 100000 
metadata(2): sample1 sample2
assays(1): scaled
rownames(66): Ba138Di BCKG190Di ... Yb174Di Yb176Di
rowData names(0):
colnames(100000): cell1 cell2 ... cell99999 cell100000
colData names(0):
reducedDimNames(0):
altExpNames(0):
Warning message:
In processFCS(filelist) : No metadata detected - creating generic metadata
> pData(parameters(
+     read.FCS(filelist[[1]], transformation = FALSE, emptyValue = FALSE)))
             name            desc   range minRange maxRange
$P1          Time            <NA> 1048576        0  1048575
$P2  Event_length            <NA>    4096        0     4095
$P3         Y89Di             89Y    4096        0     4095
$P4       Pd102Di           102Pd    4096        0     4095
$P5       Rh103Di           103Rh    4096        0     4095
$P6       Pd104Di           104Pd    4096        0     4095
$P7       Pd105Di           105Pd    4096        0     4095
$P8       Pd106Di           106Pd    4096        0     4095
$P9       Pd108Di           108Pd    4096        0     4095
$P10      Pd110Di           110Pd    4096        0     4095
$P11      Sn120Di           120Sn   32768        0    32767
$P12       I127Di            127I    4096        0     4095
$P13      Xe131Di           131Xe    4096        0     4095
$P14      Cs133Di           133Cs    4096        0     4095
$P15      Ba138Di           138Ba   32768        0    32767
$P16      La139Di           139La   65536        0    65535
$P17      Ce140Di           140Ce   32768        0    32767
$P18      Pr141Di      141Pr_CD45   65536        0    65535
$P19      Ce142Di           142Ce   32768        0    32767
$P20      Nd142Di    142Nd_MHC_II   32768        0    32767
$P21      Nd143Di     143Nd_CD11b    8192        0     8191
$P22      Nd144Di      144Nd_Ly6C   16384        0    16383
$P23      Nd145Di      145Nd_Ly6G    8192        0     8191
$P24      Nd146Di      146Nd_F480   16384        0    16383
$P25      Sm147Di     147Sm_CD11c   16384        0    16383
$P26      Nd148Di      148Nd_CD38    8192        0     8191
$P27      Sm149Di           149Sm    4096        0     4095
$P28      Nd150Di   150Nd_SiglecF    8192        0     8191
$P29      Eu151Di     151Eu_CD206   16384        0    16383
$P30      Sm152Di     152Sm_CD62L    4096        0     4095
$P31      Eu153Di     153Eu_CD103   16384        0    16383
$P32      Sm154Di      154Sm_iNOS    4096        0     4095
$P33      Gd155Di     155Gd_PD-L1    8192        0     8191
$P34      Gd156Di     156Gd_Arg-1   16384        0    16383
$P35      Gd158Di      158Gd_CD64   65536        0    65535
$P36      Tb159Di     159Tb_TCRgt    4096        0     4095
$P37      Gd160Di     160Gd_Foxp3   32768        0    32767
$P38      Dy161Di     161Dy_RORgt    4096        0     4095
$P39      Dy162Di      162Dy_CD8a   32768        0    32767
$P40      Dy163Di      163Dy_Tbet   16384        0    16383
$P41      Dy164Di      164Dy_CD25    4096        0     4095
$P42      Ho165Di     165Ho_IFN-g   16384        0    16383
$P43      Er166Di      166Er_CD44   65536        0    65535
$P44      Er167Di      167Er_CD86   16384        0    16383
$P45      Er168Di      168Er_CD80    4096        0     4095
$P46      Tm169Di      169Tm_PD-1   65536        0    65535
$P47      Er170Di      170Er_B220   65536        0    65535
$P48      Yb171Di     171Yb_NK1.1    4096        0     4095
$P49      Yb172Di     172Yb_TNF-a    4096        0     4095
$P50      Yb173Di      173Yb_CD19    4096        0     4095
$P51      Yb174Di       174Yb_CD4    4096        0     4095
$P52      Lu175Di      175Lu_TCRb   16384        0    16383
$P53      Yb176Di           176Yb    4096        0     4095
$P54    BCKG190Di         190BCKG    4096        0     4095
$P55      Ir191Di       191Ir_DNA   32768        0    32767
$P56      Ir193Di       193Ir_dna   65536        0    65535
$P57      Pt194Di           194Pt   32768        0    32767
$P58      Pt195Di 195Pt_live_dead   32768        0    32767
$P59      Pt196Di           196Pt   32768        0    32767
$P60      Pt198Di           198Pt   16384        0    16383
$P61      Pb208Di           208Pb   65536        0    65535
$P62      Bi209Di           209Bi    4096        0     4095
$P63       Center            <NA>   16384        0    16383
$P64       Offset            <NA>    4096        0     4095
$P65        Width            <NA>    4096        0     4095
$P66     Residual            <NA>    4096        0     4095
>

kevinblighe commented 3 years ago

Oh yes, this usually (and annoyingly) happens with CyTOF data. The channel names are just the metals, while the key information is in the pData under the 'desc' column. You will have to look at the list and choose which metals you want to keep. It looks like you will want the metals listed in rows 18 to 52 (leaving out any whose 'desc' holds no marker, such as rows 19 and 27); these would be listed under inclusions and passed to the function as colsRetain.

You can then rename these metals to have the protein names, such as CD4, PD1, etc., by using the newColnames parameter. This is covered in the first example in the vignette, if you can try that: https://github.com/kevinblighe/scDataviz#tutorial-1-cytof-fcs-data
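
One way to avoid pairing metals and proteins by hand is to build both vectors from the same rows of the parameter table. The sketch below is only an illustration under stated assumptions: `params` is a toy stand-in for the pData(parameters(...)) output (a few rows copied from the table above), the "mass + element + underscore + marker" pattern is inferred from that table, and dropping the DNA / live_dead channels is a guess about what you want to analyse.

```r
# Toy stand-in for pData(parameters(read.FCS(...))); rows copied from the table above.
params <- data.frame(
  name = c("Time", "Pr141Di", "Ce142Di", "Nd142Di", "Sm149Di",
           "Gd156Di", "Yb172Di", "Ir191Di"),
  desc = c(NA, "141Pr_CD45", "142Ce", "142Nd_MHC_II", "149Sm",
           "156Gd_Arg-1", "172Yb_TNF-a", "191Ir_DNA"),
  stringsAsFactors = FALSE
)

# A channel carries a marker when desc looks like "<mass><element>_<marker>".
prefix <- "^[0-9]+[A-Za-z]+_"
has_marker <- !is.na(params$desc) & grepl(prefix, params$desc)

# Assumption: DNA intercalator and viability channels are not wanted as markers.
not_wanted <- grepl("_(DNA|dna|live_dead)$", params$desc)

keep <- has_marker & !not_wanted
inclusions <- params$name[keep]    # candidate value for colsRetain
exclusions <- params$name[!keep]   # candidate value for colsDiscard

# Protein names come from the same rows, so each metal keeps its own marker.
markernames <- sub(prefix, "", params$desc[keep])
names(markernames) <- inclusions   # candidate value for newColnames

print(markernames)
```

With the real table in place of `params`, check the printed pairs against the panel sheet before passing them on, because the 'desc' entries are free text and may not follow the pattern in every file.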

This package was difficult to develop because cytometry data is so inconsistent, with no standards. We did recently publish a workflow which utilised some scDataviz functions 'under the hood': https://elifesciences.org/articles/62915

beginner984 commented 3 years ago

Thank you so much. I get the same error.

This is my full workflow:

> filelist <- list.files(
+    
+     pattern = ".fcs")
> filelist
[1] "fcs_raw1.fcs" "fcs_raw2.fcs" "fcs_raw3.fcs"
> data=pData(parameters(
+     read.FCS(filelist[[1]], transformation = FALSE, emptyValue = FALSE)))
> data
             name            desc   range minRange maxRange
$P1          Time            <NA> 1048576        0  1048575
$P2  Event_length            <NA>    4096        0     4095
$P3         Y89Di             89Y    4096        0     4095
$P4       Pd102Di           102Pd    4096        0     4095
$P5       Rh103Di           103Rh    4096        0     4095
$P6       Pd104Di           104Pd    4096        0     4095
$P7       Pd105Di           105Pd    4096        0     4095
$P8       Pd106Di           106Pd    4096        0     4095
$P9       Pd108Di           108Pd    4096        0     4095
$P10      Pd110Di           110Pd    4096        0     4095
$P11      Sn120Di           120Sn   32768        0    32767
$P12       I127Di            127I    4096        0     4095
$P13      Xe131Di           131Xe    4096        0     4095
$P14      Cs133Di           133Cs    4096        0     4095
$P15      Ba138Di           138Ba   32768        0    32767
$P16      La139Di           139La   65536        0    65535
$P17      Ce140Di           140Ce   32768        0    32767
$P18      Pr141Di      141Pr_CD45   65536        0    65535
$P19      Ce142Di           142Ce   32768        0    32767
$P20      Nd142Di    142Nd_MHC_II   32768        0    32767
$P21      Nd143Di     143Nd_CD11b    8192        0     8191
$P22      Nd144Di      144Nd_Ly6C   16384        0    16383
$P23      Nd145Di      145Nd_Ly6G    8192        0     8191
$P24      Nd146Di      146Nd_F480   16384        0    16383
$P25      Sm147Di     147Sm_CD11c   16384        0    16383
$P26      Nd148Di      148Nd_CD38    8192        0     8191
$P27      Sm149Di           149Sm    4096        0     4095
$P28      Nd150Di   150Nd_SiglecF    8192        0     8191
$P29      Eu151Di     151Eu_CD206   16384        0    16383
$P30      Sm152Di     152Sm_CD62L    4096        0     4095
$P31      Eu153Di     153Eu_CD103   16384        0    16383
$P32      Sm154Di      154Sm_iNOS    4096        0     4095
$P33      Gd155Di     155Gd_PD-L1    8192        0     8191
$P34      Gd156Di     156Gd_Arg-1   16384        0    16383
$P35      Gd158Di      158Gd_CD64   65536        0    65535
$P36      Tb159Di     159Tb_TCRgt    4096        0     4095
$P37      Gd160Di     160Gd_Foxp3   32768        0    32767
$P38      Dy161Di     161Dy_RORgt    4096        0     4095
$P39      Dy162Di      162Dy_CD8a   32768        0    32767
$P40      Dy163Di      163Dy_Tbet   16384        0    16383
$P41      Dy164Di      164Dy_CD25    4096        0     4095
$P42      Ho165Di     165Ho_IFN-g   16384        0    16383
$P43      Er166Di      166Er_CD44   65536        0    65535
$P44      Er167Di      167Er_CD86   16384        0    16383
$P45      Er168Di      168Er_CD80    4096        0     4095
$P46      Tm169Di      169Tm_PD-1   65536        0    65535
$P47      Er170Di      170Er_B220   65536        0    65535
$P48      Yb171Di     171Yb_NK1.1    4096        0     4095
$P49      Yb172Di     172Yb_TNF-a    4096        0     4095
$P50      Yb173Di      173Yb_CD19    4096        0     4095
$P51      Yb174Di       174Yb_CD4    4096        0     4095
$P52      Lu175Di      175Lu_TCRb   16384        0    16383
$P53      Yb176Di           176Yb    4096        0     4095
$P54    BCKG190Di         190BCKG    4096        0     4095
$P55      Ir191Di       191Ir_DNA   32768        0    32767
$P56      Ir193Di       193Ir_dna   65536        0    65535
$P57      Pt194Di           194Pt   32768        0    32767
$P58      Pt195Di 195Pt_live_dead   32768        0    32767
$P59      Pt196Di           196Pt   32768        0    32767
$P60      Pt198Di           198Pt   16384        0    16383
$P61      Pb208Di           208Pb   65536        0    65535
$P62      Bi209Di           209Bi    4096        0     4095
$P63       Center            <NA>   16384        0    16383
$P64       Offset            <NA>    4096        0     4095
$P65        Width            <NA>    4096        0     4095
$P66     Residual            <NA>    4096        0     4095
> 
> inclusions <- data$name[c(18,20:26,27:48,50:52)]
> 
> inclusions
    $P18N     $P20N     $P21N     $P22N     $P23N     $P24N     $P25N     $P26N 
"Pr141Di" "Nd142Di" "Nd143Di" "Nd144Di" "Nd145Di" "Nd146Di" "Sm147Di" "Nd148Di" 
    $P27N     $P28N     $P29N     $P30N     $P31N     $P32N     $P33N     $P34N 
"Sm149Di" "Nd150Di" "Eu151Di" "Sm152Di" "Eu153Di" "Sm154Di" "Gd155Di" "Gd156Di" 
    $P35N     $P36N     $P37N     $P38N     $P39N     $P40N     $P41N     $P42N 
"Gd158Di" "Tb159Di" "Gd160Di" "Dy161Di" "Dy162Di" "Dy163Di" "Dy164Di" "Ho165Di" 
    $P43N     $P44N     $P45N     $P46N     $P47N     $P48N     $P50N     $P51N 
"Er166Di" "Er167Di" "Er168Di" "Tm169Di" "Er170Di" "Yb171Di" "Yb173Di" "Yb174Di" 
    $P52N 
"Lu175Di" 
> inclusions=as.vector(inclusions)
> inclusions
 [1] "Pr141Di" "Nd142Di" "Nd143Di" "Nd144Di" "Nd145Di" "Nd146Di" "Sm147Di" "Nd148Di"
 [9] "Sm149Di" "Nd150Di" "Eu151Di" "Sm152Di" "Eu153Di" "Sm154Di" "Gd155Di" "Gd156Di"
[17] "Gd158Di" "Tb159Di" "Gd160Di" "Dy161Di" "Dy162Di" "Dy163Di" "Dy164Di" "Ho165Di"
[25] "Er166Di" "Er167Di" "Er168Di" "Tm169Di" "Er170Di" "Yb171Di" "Yb173Di" "Yb174Di"
[33] "Lu175Di"
> 

> exclusions=data$name[c(1:17,19,27,49,53:66)]
> exclusions=as.vector(exclusions)
> exclusions
 [1] "Time"         "Event_length" "Y89Di"        "Pd102Di"      "Rh103Di"     
 [6] "Pd104Di"      "Pd105Di"      "Pd106Di"      "Pd108Di"      "Pd110Di"     
[11] "Sn120Di"      "I127Di"       "Xe131Di"      "Cs133Di"      "Ba138Di"     
[16] "La139Di"      "Ce140Di"      "Ce142Di"      "Sm149Di"      "Yb172Di"     
[21] "Yb176Di"      "BCKG190Di"    "Ir191Di"      "Ir193Di"      "Pt194Di"     
[26] "Pt195Di"      "Pt196Di"      "Pt198Di"      "Pb208Di"      "Bi209Di"     
[31] "Center"       "Offset"       "Width"        "Residual"    
>
> markernames
 [1] "CD45"    "MHC II"  "CD11b"   "Ly6C"    "Ly6G"    "F4/80"   "CD11c"   "CD38"   
 [9] "Arg-1"   "SiglecF" "CD206"   "CD62L"   "CD103"   "iNOS"    "PD-L1"   "TNFa"   
[17] "CD64"    "TCRgd"   "Foxp3"   "RORgt"   "CD8α"    "Tbet"    "CD25"    "IFN-γ"  
[25] "CD44"    "CD86"    "CD80"    "PD-1"    "B220"    "NK1.1"   "CD19"    "CD4"    
[33] "TCR β"  
> names(markernames) <- inclusions
> markernames
  Pr141Di   Nd142Di   Nd143Di   Nd144Di   Nd145Di   Nd146Di   Sm147Di   Nd148Di 
   "CD45"  "MHC II"   "CD11b"    "Ly6C"    "Ly6G"   "F4/80"   "CD11c"    "CD38" 
  Sm149Di   Nd150Di   Eu151Di   Sm152Di   Eu153Di   Sm154Di   Gd155Di   Gd156Di 
  "Arg-1" "SiglecF"   "CD206"   "CD62L"   "CD103"    "iNOS"   "PD-L1"    "TNFa" 
  Gd158Di   Tb159Di   Gd160Di   Dy161Di   Dy162Di   Dy163Di   Dy164Di   Ho165Di 
   "CD64"   "TCRgd"   "Foxp3"   "RORgt"    "CD8α"    "Tbet"    "CD25"   "IFN-γ" 
  Er166Di   Er167Di   Er168Di   Tm169Di   Er170Di   Yb171Di   Yb173Di   Yb174Di 
   "CD44"    "CD86"    "CD80"    "PD-1"    "B220"   "NK1.1"    "CD19"     "CD4" 
  Lu175Di 
  "TCR β" 
> 
> sce <- processFCS(
+     files = filelist,
+     metadata = meta.dat,
+     transformation = TRUE,
+     transFun = function (x) asinh(x),
+     asinhFactor = 5,
+     downsample = 10000,
+     downsampleVar = 0.7,
+     colsRetain = inclusions,
+     colsDiscard = exclusions,
+     newColnames = markernames)
--harmonising markers across samples
--filtering background / noise
--transforming data
--removing the lower 70% of cells based on variance
--downsampling to 10000 variables.
> 

> library(PCAtools)
> p <- pca(assay(sce, 'scaled'), metadata = metadata(sce))
Error in svd(x, nu = nu, nv = nv) : infinite or missing values in 'x'
> 

I have updated my R object

https://www.dropbox.com/s/huzib0pnqvmjmtv/My_updated_data.RData?dl=0

Please have a look to see where I am going wrong.

kevinblighe commented 3 years ago

Hi, I looked at your data. The data just has all NA values; so, nothing can be done with it (see assay(sce, 'scaled')). I am unsure if it is because your experiment failed or for some other reason. If you just do the following, what do you see:

sce <- processFCS(files)
assay(sce, 'scaled')[1:5,1:5]

One other thing: why is there a space at the end of each name in markernames?
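
To see how much of the assay is affected rather than eyeballing a 5 x 5 corner, something along these lines may help. It is a sketch on a made-up matrix `mat` (markers in rows, cells in columns); with real data you would use `mat <- assay(sce, 'scaled')` instead.

```r
# Toy matrix standing in for assay(sce, 'scaled'); the values are made up.
mat <- matrix(
  c(0.1, 0.5, NA,
    1.2, NaN, 0.3,
    Inf, 0.8, 0.9),
  nrow = 3,
  dimnames = list(c("CD45", "CD4", "CD19"), paste0("cell", 1:3))
)

# TRUE wherever a value is NA, NaN, Inf or -Inf.
bad <- !is.finite(mat)

n_bad        <- sum(bad)       # total number of unusable values
frac_bad     <- mean(bad)      # fraction of the whole matrix
bad_per_row  <- rowSums(bad)   # which markers are affected
bad_per_cell <- colSums(bad)   # which cells are affected
all_bad      <- all(bad)       # TRUE would mean nothing usable is left

print(list(n_bad = n_bad, frac_bad = frac_bad, all_bad = all_bad))
print(bad_per_row)
```

If `all_bad` is TRUE for the real assay, the problem sits upstream of PCA, either in the processing options or in the FCS files themselves.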

beginner984 commented 3 years ago

Thank you

It is all NA, even though I have removed the spaces at the end of markernames; you can see that in my main post.

We have 33 markers in this manuscript which I have placed here

> sce <- processFCS(files)
Error in data.frame(sample1 = files, sample2 = files, row.names = files) : 
  object 'files' not found
In addition: Warning message:
In processFCS(files) : No metadata detected - creating generic metadata
> assay(sce, 'scaled')[1:5,1:5]
        cell1 cell2 cell3 cell4 cell5
CD45       NA    NA    NA    NA    NA
MHC II     NA    NA    NA    NA    NA
CD11b      NA    NA    NA    NA    NA
Ly6C       NA    NA    NA    NA    NA
Ly6G       NA    NA    NA    NA    NA
>

kevinblighe commented 3 years ago

Hi, this may not be an issue with scDataviz. Can you try to read one of these samples via flowCore to see if there is expression data in it?

beginner984 commented 3 years ago

Sorry, how do I read that with flowCore?

kevinblighe commented 3 years ago

Hi, have you progressed further? I am limited in how much help I can offer. If you have general usage issues with R, then maybe try to find somebody local to assist.

beginner984 commented 3 years ago

Thank you so much

No, I am still trying to solve it.

kevinblighe commented 3 years ago

Regarding flowCore, you just need this function: https://rdrr.io/bioc/flowCore/man/read.FCS.html There is also a useful tutorial here, which may help: http://www.cbs.dtu.dk/courses/27485.imm/exercise_NGS/FACS_exercise.pdf
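
A minimal sketch of that check is below. The helper name `summarise_fcs` is hypothetical (it is not part of flowCore or scDataviz), and the read.FCS() arguments are simply the ones used earlier in this thread. It reads one file and reports, per channel, the event count, the number of non-finite values, and the value range.

```r
# Hypothetical helper, not part of flowCore or scDataviz.
summarise_fcs <- function(path) {
  if (!requireNamespace("flowCore", quietly = TRUE)) {
    stop("flowCore is needed; install it with BiocManager::install('flowCore')")
  }

  # Read the raw, untransformed events (same arguments as earlier in the thread).
  ff <- flowCore::read.FCS(path, transformation = FALSE, emptyValue = FALSE)

  # exprs() returns a numeric matrix: events in rows, channels in columns.
  mat <- flowCore::exprs(ff)

  data.frame(
    channel     = colnames(mat),
    n_events    = nrow(mat),
    n_nonfinite = colSums(!is.finite(mat)),
    min         = apply(mat, 2, min, na.rm = TRUE),
    max         = apply(mat, 2, max, na.rm = TRUE),
    row.names   = NULL
  )
}

# Usage, assuming the file sits in the working directory:
# summarise_fcs("fcs_raw1.fcs")
```

If the marker channels show sensible ranges here but the scDataviz assay is still all NA, the issue is more likely in the processing step than in the files.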

I would also talk to the core facility that produced the data for you. They may be able to suggest things for you to do.

kevinblighe commented 3 years ago

Please re-open if needed.