Danko-Lab / BayesPrism

A Fully Bayesian Inference of Tumor Microenvironment composition and gene expression

Error in checkForRemoteErrors(val) : one node produced an error: subscript out of bounds #23

Open venkan opened 1 year ago

venkan commented 1 year ago

I have bulk RNA-seq data and scRNA-seq data, both with Ensembl IDs, and I'm using a single cell type, Fibroblast.

> dim(bk.dat)
[1]   546 19988

> dim(sc.dat)
[1]  1159 19828

> sort(table(cell.type.labels))
Fibroblast 
      1159 
> sort(table(cell.state.labels))
cell.state.labels
fb-4 fb-1 fb-3 fb-2 fb-0 
  83  148  177  237  514 

Using the above data, I constructed the prism object as follows:

> myPrism <- new.prism(
+   reference=sc.dat.filtered.pc,
+   mixture=bk.dat,
+   input.type="count.matrix",
+   cell.type.labels = cell.type.labels,
+   cell.state.labels = cell.state.labels,
+   key="Fibroblast",
+   outlier.cut=0.01,
+     outlier.fraction=0.1,
+ )
number of cells in each cell state
cell.state.labels
fb-4 fb-1 fb-3 fb-2 fb-0
  83  148  177  237  514
Number of outlier genes filtered from mixture = 9
Aligning reference and mixture...
Nornalizing reference...
Warning message:
In validityMethod(object) : Warning: pseudo.min does not match min(phi)

Then I ran BayesPrism as follows:

> bp.res <- run.prism(prism = myPrism, n.cores=50)
Run Gibbs sampling...
Current time:  2022-12-02 21:03:14
Estimated time to complete:  1hrs 2mins
Estimated finishing time:  2022-12-02 22:04:15
Start run...
Explicit sfStop() is missing: stop now.

Stopping cluster

snowfall 1.84-6.2 initialized (using snow 0.4-4): parallel execution on 50 CPUs.

Stopping cluster

Update the reference matrix ...
snowfall 1.84-6.2 initialized (using snow 0.4-4): parallel execution on 50 CPUs.

Error in checkForRemoteErrors(val) :
  one node produced an error: subscript out of bounds

So, first I saw the message "Explicit sfStop() is missing: stop now." Then, at the end of the run, I saw the following error:

Error in checkForRemoteErrors(val) :
  one node produced an error: subscript out of bounds

Could you please help me resolve this error? Thank you.

tinyi commented 1 year ago

Thank you for your interest in our work.

cell.type.labels should be at your target granularity of cell types, and it must contain more than one type. In your case, I believe you should use the five fibroblast states (fb-0, fb-1, fb-2, fb-3, fb-4) as the cell types.

Best,

Tinyi
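
The constraint described in this reply can be checked before the prism object is built. The following is a minimal base-R sketch, assuming the labels are plain character vectors with one entry per cell. `check.labels` is a hypothetical helper written for illustration; it is not part of BayesPrism.

```r
# Hypothetical pre-flight check (not a BayesPrism function):
# stops early if the labels cannot satisfy the "more than one cell type" rule.
check.labels <- function(cell.type.labels, cell.state.labels) {
  # Both label vectors must describe the same set of cells
  stopifnot(length(cell.type.labels) == length(cell.state.labels))
  n.types <- length(unique(cell.type.labels))
  if (n.types < 2) {
    stop("cell.type.labels has only ", n.types,
         " unique type; more than one is required")
  }
  invisible(TRUE)
}

# Toy labels mirroring the situation in this thread
states   <- c("fb-0", "fb-0", "fb-1", "fb-2")
one.type <- rep("Fibroblast", length(states))

# A single cell type is rejected ...
res.bad <- tryCatch(check.labels(one.type, states),
                    error = function(e) "error")
# ... while using the states as types passes
res.ok <- check.labels(states, states)
```

Running a check like this first surfaces the labeling problem immediately, rather than as a "subscript out of bounds" error from a worker node partway through the run.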

venkan commented 1 year ago

Thank you for your reply. I have only one cell type, Fibroblast.

If I set cell.type.labels to fb-4, fb-1, fb-3, fb-2, fb-0, then what should cell.state.labels be?

tinyi commented 1 year ago

Same as cell.type.labels.
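
In code, this suggestion amounts to reusing one vector for both arguments. Below is a small base-R sketch using toy labels as a stand-in for the real per-cell fibroblast states; the sanity check at the end is an illustrative addition, not something BayesPrism requires.

```r
# Toy stand-in for the per-cell fibroblast state labels from the thread
cell.state.labels <- c("fb-0", "fb-1", "fb-0", "fb-2", "fb-3", "fb-4")

# Use the states themselves as the cell types
cell.type.labels <- cell.state.labels

# Sanity check: every cell type now contains exactly one cell state
states.per.type <- tapply(cell.state.labels, cell.type.labels,
                          function(s) length(unique(s)))
print(states.per.type)
```

With labels set up this way, each cell type has a single state, so the type-level and state-level results describe the same five groups.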

venkan commented 1 year ago

Thank you again. Could you please check whether this is right?

As you suggested, I set cell.type.labels and cell.state.labels to the same values:

> sort(table(cell.type.labels))
cell.type.labels
fb-4 fb-1 fb-3 fb-2 fb-0
  83  148  177  237  514
> sort(table(cell.state.labels))
cell.state.labels
fb-4 fb-1 fb-3 fb-2 fb-0
  83  148  177  237  514

Then I constructed the prism object:

myPrism <- new.prism(
  reference = sc.dat.filtered.pc,
  mixture = bk.dat,
  input.type = "count.matrix",
  cell.type.labels = cell.type.labels,
  cell.state.labels = cell.state.labels,
  key = NULL,
  outlier.cut = 0.01,
  outlier.fraction = 0.1
)

Then I ran BayesPrism:

bp.res <- run.prism(prism = myPrism, n.cores=50)
bp.res

The result looks like this:

> bp.res
Input prism info:
Cell states in each cell type:
$`fb-1`
[1] "fb-1"

$`fb-2`
[1] "fb-2"

$`fb-3`
[1] "fb-3"

$`fb-4`
[1] "fb-4"

$`fb-0`
[1] "fb-0"

Identifier of the malignant cell type:  NA
Number of cell states:  5
Number of cell types:  5
Number of mixtures:  546
Number of genes:  18677

Initial cell type fractions:
         fb-1  fb-2  fb-3  fb-4  fb-0
Min.    0.016 0.036 0.000 0.057 0.000
1st Qu. 0.084 0.283 0.021 0.344 0.069
Median  0.108 0.357 0.043 0.383 0.089
Mean    0.106 0.360 0.055 0.385 0.093
3rd Qu. 0.129 0.425 0.066 0.426 0.110
Max.    0.197 0.918 0.453 0.656 0.289
Updated cell type fractions:
         fb-1  fb-2  fb-3  fb-4  fb-0
Min.    0.000 0.001 0.000 0.003 0.000
1st Qu. 0.084 0.278 0.000 0.337 0.036
Median  0.121 0.364 0.017 0.387 0.071
Mean    0.116 0.359 0.055 0.385 0.085
3rd Qu. 0.150 0.445 0.053 0.434 0.111
Max.    0.242 0.938 0.958 0.700 0.351

To get the count matrix Z:

> Z.tumor <- get.exp(bp=bp.res, state.or.type="type")
> str(Z.tumor)
 num [1:546, 1:18677, 1:5] 142.6 219.5 109.5 11.8 87.8 ...
 - attr(*, "dimnames")=List of 3
  ..$ : chr [1:546] "TCGA-CN-6997-01A" "TCGA-CR-6472-01A" "TCGA-BB-A5HU-01A" "TCGA-P3-A6T5-01A" ...
  ..$ : chr [1:18677] "ENSG00000000003" "ENSG00000000005" "ENSG00000000419" "ENSG00000000457" ...
  ..$ : chr [1:5] "fb-1" "fb-2" "fb-3" "fb-4" ...

Then I converted that to a dataframe:

> df <- as.data.frame(Z.tumor)
> head(df)[1:5,1:5]
                 ENSG00000000003.fb-1 ENSG00000000005.fb-1 ENSG00000000419.fb-1
TCGA-CN-6997-01A              142.556                    0              998.272
TCGA-CR-6472-01A              219.524                    0              403.528
TCGA-BB-A5HU-01A              109.548                    0              344.892
TCGA-P3-A6T5-01A               11.792                    0              416.628
TCGA-BA-4074-01A               87.844                    0              575.504
                 ENSG00000000457.fb-1 ENSG00000000460.fb-1
TCGA-CN-6997-01A              142.268               97.808
TCGA-CR-6472-01A               86.876               92.448
TCGA-BB-A5HU-01A               47.872               29.688
TCGA-P3-A6T5-01A               66.704               31.804
TCGA-BA-4074-01A               33.956               29.468
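
The object returned above is a three-dimensional array (samples x genes x cell types), so converting it directly to a data frame flattens the last two dimensions into "gene.celltype" columns. Indexing the third dimension by name gives one samples-by-genes matrix per cell type instead. The sketch below uses a small toy array with the same layout; the sample, gene, and dimension sizes are made up for illustration. The BayesPrism tutorial also passes a `cell.name` argument to `get.exp` to request a single cell type, which may be the more direct route.

```r
# Toy array with the same layout as Z.tumor: samples x genes x cell types
Z <- array(
  seq_len(2 * 3 * 2),
  dim = c(2, 3, 2),
  dimnames = list(
    c("sample1", "sample2"),
    c("geneA", "geneB", "geneC"),
    c("fb-0", "fb-1")
  )
)

# Slice out one cell type: a plain samples x genes matrix
Z.fb1 <- Z[, , "fb-1"]
print(dim(Z.fb1))

# Sum over the cell-type dimension to get totals per sample and gene
Z.total <- apply(Z, c(1, 2), sum)
print(Z.total)
```

Keeping the array and slicing it avoids having to parse cell-type names back out of the flattened column names.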

tinyi commented 1 year ago

Looks good to me. But it appears that you are trying to deconvolve TCGA tumor samples with a fibroblast-only scRNA-seq reference? BayesPrism requires a complete representation of cell types in the reference…

venkan commented 1 year ago

Yes, I'm actually using only the fibroblast scRNA-seq reference. I do have other cell types, as shown below:

     B.cell   Dendritic Endothelial  Fibroblast  Macrophage        Mast     myocyte      T.cell
        967          15         205        1159         133          87          39         645

But I have cell.state.labels only for Fibroblasts. I don't have any cell.state.labels for the other cell.type.labels. Do you think I can go forward with the approach in my previous comment?

tinyi commented 1 year ago

You would need a complete representation for the cell types in your tumor mixture, i.e. including the malignant cells. Please refer to the tutorial and Q&A for the definition of cell state and cell type.
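
When sub-states are known for only one cell type, one way to build a full set of state labels is to fall back to the cell type name wherever no finer label exists. The base-R sketch below shows only that label bookkeeping, on toy vectors. It assumes this fallback matches the convention in the tutorial, and it does not address the missing malignant-cell reference raised above.

```r
# Toy per-cell type labels covering several cell types
cell.type.labels <- c("B.cell", "Fibroblast", "Fibroblast", "T.cell")

# Toy sub-state labels, available for the fibroblast cells only
# (same order as the fibroblast cells in cell.type.labels)
fib.states <- c("fb-0", "fb-2")

# Start from the cell type and overwrite where a finer state is known
cell.state.labels <- cell.type.labels
is.fib <- cell.type.labels == "Fibroblast"
stopifnot(sum(is.fib) == length(fib.states))
cell.state.labels[is.fib] <- fib.states

# Cross-tabulate states against types to confirm each state sits in one type
print(table(cell.state.labels, cell.type.labels))
```

Here the fibroblasts keep their five-way (in this toy case, two-way) split at the state level, while every other cell type is represented by a single state named after the type.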
