xmuhuanglab / eNet

10 stars 5 forks source link

GPTab is NULL #8

Open jphe opened 1 year ago

jphe commented 1 year ago

Hi, I'm runing the eNet for my own data, but the GPTab is NULL and without any error message:

Bellow is my code:

cre.mat <- readRDS("signac.peaks.rds") # peak-cell matrix
rna.mat <- readRDS("signac.RNA.rds") # scRNA-cell matrix
load("UMAP.RData") # UMAP coordinates
load("metadata.RData") # Cell metadata

source('/Users/jiangpinghe/jphe/tools/tmp/eNet/R/MainFunc.R')
source('/Users/jiangpinghe/jphe/tools/tmp/eNet/R/utils.R')

ATAC <- cre.mat@counts
RNA <- rna.mat@counts

# --- 2.1 calculate gene-peak correlation
GPTab <- GPCor(cre.mat=ATAC,  # peak-cell matrix
               exp.mat=RNA,  # scRNA-seq matrix or gene activity matrix
               normalizeRNAMat=T, # if the rna matrix need normalization
               genome = "mm10", # reference genome, must be one of "hg19", "mm10", or "hg38"
               windowPadSize = 100000, # base pairs padded on either side of gene TSS
               proPadSize = 2000, # base pairs padded on either side of gene TSS for enhancer
               nCores=3 # How many registerCores to use

And the message output by eNet:

Centering counts for cells sequentially in groups of size  1000  ..

Computing centered counts for cells:  1  to  721 ..
Computing centered counts per cell using mean reads in features ..

Computing centered counts per cell using mean reads in features ..

Merging results..
Done!
Performing log-normalization
Merging results..
Done!
Performing log-normalization
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************************************************|
*|
Assuming paired scATAC/scRNA-seq data ..
Assuming paired scATAC/scRNA-seq data ..
Peaks with 0 accessibility across cells exist ..
Removing these peaks prior to running correlations ..
Peaks with 0 accessibility across cells exist ..
Removing these peaks prior to running correlations ..
Important: peak indices in returned gene-peak maps are relative to original input SE
Important: peak indices in returned gene-peak maps are relative to original input SE
Genes with 0 expression across cells exist ..
Removing these genes prior to running correlations ..
Number of peaks in ATAC data: 172866 
Number of genes in RNA data: 21331 
Number of peaks in ATAC data: 172866 
Number of genes in RNA data: 21331 

Taking peak summits from peak windows ..

Taking peak summits from peak windows ..
Finding overlapping promoter-gene pairs ..
Finding overlapping promoter-gene pairs ..
Found  29161 total promoter-peak pairs for given TSS window ..
Number of peak summits that overlap any gene promoter window:  25194 
Number of gene promoter windows that overlap any peak summit:  21346 

Found  29161 total promoter-peak pairs for given TSS window ..
Number of peak summits that overlap any gene promoter window:  25194 
Number of gene promoter windows that overlap any peak summit:  21346 

Num genes overlapping TSS annotation and RNA matrix being considered:  18046 

Num genes overlapping TSS annotation and RNA matrix being considered:  18046 
Finding overlapping peak-gene pairs ..
Finding overlapping peak-gene pairs ..
Found  394761 total gene-peak pairs for given TSS window ..
Found  394761 total gene-peak pairs for given TSS window ..
Number of peak summits that overlap any gene TSS window:  120891 
Number of peak summits that overlap any gene TSS window:  120891 
Number of gene TSS windows that overlap any peak summit:  17871 

Number of gene TSS windows that overlap any peak summit:  17871 

But check the GPTab, it is empty:

> GPTab
NULL

and raise an error in the next step:

GPTabFilt <- FindNode(GPTab = GPTab, # data frame of gene-peak correlation
                      genome = "mm10", # reference genome, must be one of "hg19", "mm10", or "hg38"
                      estimate = 0, # the threshold value of peak-gene correlation to determine whether the peak-gene pair is significantly correlated
                      proPadSize = 2000, # base pairs padded on either side of gene TSS for enhancer
                      FDR = 0.05 # the threshold value of FDR to determine whether the peak-gene pair is significantly correlated
)
Error in `separate()`:
! Can't extract columns that don't exist.
✖ Column `ranges` doesn't exist.
Run `rlang::last_error()` to see where the error occurred.
Called from: signal_abort(cnd)
Error in `separate()`:
! Can't extract columns that don't exist.
✖ Column `ranges` doesn't exist.
Run `rlang::last_error()` to see where the error occurred.
Called from: signal_abort(cnd)
Browse[1]> 

How to solve this problem, thanks

dhoneyi commented 1 year ago

Hi,

Due to your problem, you can check the packages "IRanges" and "dplyr" to make sure they work. If it still have errors, maybe you can run the code line by line to find the code that reports the error.Thanks

Best wishes, Danni

L1angyan commented 12 months ago

I had the same problem

> GPTab <- GPCor(cre.mat=cre.mat,
+                exp.mat=rna.mat,
+                normalizeRNAMat=F,
+                genome = "hg19",
+                windowPadSize = 100000, # base pairs padded on either side of gene TSS
+                proPadSize = 2000, # base pairs padded on either side of gene TSS for enhancer
+                nCores=4 # How many registerCores to use
+ )
Centering counts for cells sequentially in groups of size  1000  ..

Computing centered counts for cells:  1  to  1000 ..
Computing centered counts per cell using mean reads in features ..

Merging results..
Done!
Assuming paired scATAC/scRNA-seq data ..
Peaks with 0 accessibility across cells exist ..
Removing these peaks prior to running correlations ..
Important: peak indices in returned gene-peak maps are relative to original input SE
Genes with 0 expression across cells exist ..
Removing these genes prior to running correlations ..
Number of peaks in ATAC data: 199945 
Number of genes in RNA data: 1424 

Taking peak summits from peak windows ..
Finding overlapping promoter-gene pairs ..
Found  54649 total promoter-peak pairs for given TSS window ..
Number of peak summits that overlap any gene promoter window:  45946 
Number of gene promoter windows that overlap any peak summit:  17429 

Num genes overlapping TSS annotation and RNA matrix being considered:  1405 
Finding overlapping peak-gene pairs ..
Found  33885 total gene-peak pairs for given TSS window ..
Number of peak summits that overlap any gene TSS window:  12707 
Number of gene TSS windows that overlap any peak summit:  1393 

> # --- 2.2 obtain significantly correlated peak-gene pairs
> # For determining the threshold value of estimate, namely peak-gene correlation, we advice to choose the value of quantile 95% of overall estimate using quantile(GPTab$estimate, seq(0,1,0.05))[['95%']]
> GPTabFilt <- FindNode(GPTab = GPTab, # data frame of gene-peak correlation
+                       genome = "hg19", # reference genome, must be one of "hg19", "mm10", or "hg38"
+                       estimate = 0, # the threshold value of peak-gene correlation to determine whether the peak-gene pair is significantly correlated
+                       proPadSize = 2000, # base pairs padded on either side of gene TSS for enhancer
+                       FDR = 0.05 # the threshold value of FDR to determine whether the peak-gene pair is significantly correlated
+ )
Error in `separate()`:
! Can't extract columns that don't exist.
x Column `ranges` doesn't exist.
Run `rlang::last_trace()` to see where the error occurred.
> GPTab
NULL
XunuoZhu commented 7 months ago

I commented out the parameter ’.errorhandling = 'remove'‘ in ’GPTab <- foreach(..)‘ and then realized that the error was probably caused by 'data.shuf <- data.shuf[sample(narrow(data.shuf)),]' in the 'PeakGeneCor' function. @dhoneyi