Open hanxuelian opened 2 weeks ago
Hi, there may be a problem with how your input data looks, which would cause the autoencoder fit to fail.
Do you do any filtering on the genes? For example: ods <- filterExpression(ods, minCounts=TRUE, filterGenes=TRUE)
You could also look at the quality control plots to investigate the data further, such as:
ods <- OUTRIDER::estimateSizeFactors(ods)
sizeFactors(ods) # check size factors
plotExpressedGenes(ods)
There was a similar issue that you might want to check out too :)
Indeed, also have a look at the distribution of total counts per sample, e.g. hist(colSums(counts(ods))).
Does one sample stand out?
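A minimal base-R sketch of that check, for anyone who wants a number rather than a histogram. The simulated matrix `cts` stands in for `counts(ods)`, and the sample names and the 10-fold cut-off are assumptions chosen for illustration, not OUTRIDER defaults:

```r
# Toy count matrix standing in for counts(ods): 200 genes x 6 samples.
set.seed(1)
cts <- matrix(rpois(200 * 6, lambda = 500), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("S", 1:6)))
# Make one sample roughly 25x shallower than the others.
cts[, "S1"] <- rpois(200, lambda = 20)

total <- colSums(cts)            # total counts per sample
ratio <- total / median(total)   # depth relative to the cohort median

# Flag samples more than 10-fold below the median depth (arbitrary cut-off).
low_depth <- names(ratio)[ratio < 0.1]
print(round(ratio, 3))
print(low_depth)
```

With a real dataset, replacing `cts` with `counts(ods)` should give the same kind of per-sample ratio.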
Thank you for your prompt reply. I calculated the size factors of the samples in the input data and found that many samples were outside the range 0.7-1.2, especially the sample I am interested in, LYC, whose size factor is only 0.04. After I kept LYC together with the GTEx samples whose size factors were between 0.7 and 1.2, the run was successful, but no outliers were reported for LYC.
Do I need to perform batch correction or normalize the gene counts of the input data?
ods
class: OutriderDataSet
class: RangedSummarizedExperiment
dim: 1064 756
metadata(1): version
assays(1): counts
rownames(1064): ABCB1 ABCB7 ... ZNFX1 ZRSR2
rowData names(2): passedFilter loggeomeans
colnames(756): LYC GTEX-111YS-0006-SM-5NQBE ... GTEX-ZXES-0005-SM-57WCB GTEX-ZXG5-0005-SM-57WCN
colData names(2): sampleID sizeFactor
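The sample selection described in this comment (keep LYC plus the samples whose size factor lies between 0.7 and 1.2) can be sketched in base R as follows. The size-factor values and the shortened GTEx names are made up for illustration; only LYC = 0.04 comes from the thread:

```r
# Hypothetical size factors; with a real object they would come from sizeFactors(ods).
sf <- c(LYC = 0.04, GTEX_A = 0.95, GTEX_B = 1.60, GTEX_C = 0.72, GTEX_D = 0.50)

# Keep samples inside the 0.7-1.2 range, and always keep the sample of interest.
keep <- (sf >= 0.7 & sf <= 1.2) | names(sf) == "LYC"
kept <- names(sf)[keep]
print(kept)
```

On an OutriderDataSet the same logical vector could be used for column subsetting, e.g. `ods[, keep]`. This only reproduces the selection; it does not change the sequencing depth of LYC itself.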
Hi @hanxuelian, unfortunately a size factor of 0.04 means the sequencing depth of your sample of interest is about 25 times lower than the cohort average (1/0.04 = 25), so the OUTRIDER fit will not work. We usually recommend a sequencing depth of at least 40M to perform outlier analyses.
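To illustrate why a size factor of 0.04 corresponds to a roughly 25-fold depth difference, here is a small base-R sketch. It assumes the size factors follow the DESeq2-style median-of-ratios definition; the helper `size_factors()` and the simulated data are hypothetical and are not OUTRIDER's own code:

```r
# Median-of-ratios size factors, written out in base R (illustrative helper).
size_factors <- function(cts) {
  log_cts <- log(cts)
  log_geo_means <- rowMeans(log_cts)   # log geometric mean per gene
  ok <- is.finite(log_geo_means)       # drop genes with a zero count in any sample
  apply(log_cts[ok, , drop = FALSE], 2,
        function(col) exp(median(col - log_geo_means[ok])))
}

set.seed(42)
n_genes <- 500
base <- rgamma(n_genes, shape = 2, scale = 500) + 200  # per-gene expression level
depth <- c(1 / 25, rep(1, 9))                          # first sample is 25x shallower
cts <- sapply(depth, function(d) rpois(n_genes, lambda = base * d))
colnames(cts) <- c("LOW", paste0("S", 1:9))

sf <- size_factors(cts)
fold <- median(sf) / sf["LOW"]   # depth deficit of LOW relative to a typical sample
print(round(sf, 3))
print(round(fold, 1))
```

In a cohort this small the shallow sample also pulls the per-gene geometric means down, so the sketch compares against the median size factor rather than against 1. In a cohort of several hundred samples that effect is small and the raw size factor itself is close to the depth ratio.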
What do I need to do with the input data when this problem occurs? Filter genes?
Tue Nov 5 16:45:03 2024: SizeFactor estimation ...
Tue Nov 5 16:45:03 2024: Controlling for confounders ...
Using estimated q with: 125
Tue Nov 5 16:45:03 2024: Using the autoencoder implementation for controlling.
[1] "Tue Nov 5 16:45:09 2024: Initial PCA loss: 7.07195181628894"
[1] "Tue Nov 5 16:48:04 2024: Iteration: 1 loss: 5.96434324110687"
[1] "Tue Nov 5 16:49:41 2024: Iteration: 2 loss: 5.92730984685159"
[1] "Tue Nov 5 16:51:30 2024: Iteration: 3 loss: 5.90456699582469"
[1] "Tue Nov 5 16:52:53 2024: Iteration: 4 loss: 5.89160117756947"
[1] "Tue Nov 5 16:54:10 2024: Iteration: 5 loss: 5.88439742111325"
[1] "Tue Nov 5 16:55:27 2024: Iteration: 6 loss: 5.87837515349512"
[1] "Tue Nov 5 16:56:47 2024: Iteration: 7 loss: 5.87277495919609"
[1] "Tue Nov 5 16:58:02 2024: Iteration: 8 loss: 5.86880499139098"
Error: BiocParallel errors
  element index: 124, 125, 126, 127, 128, 129, ...
  first error: L-BFGS-B needs finite values of 'fn'
In addition: Warning message:
stop worker failed:
  attempt to select less than one element in OneIndex