Open mikldk opened 2 years ago
In functional normalization we treat the sex chromosomes differently. To do so, we need to know the sex of the sample.
With default settings, the first step is therefore to predict the sex based on the data, and this step fails. We know it will fail if you only have one sex. To handle this, it is also possible to supply the sex of the sample, which overrides the prediction step.
In your case, you should be able to handle this by supplying the sex of the sample as female. I would guess this should work. However, what would be more difficult is to handle the situation where you want to normalize this cell line together with samples with 2 X chromosomes and therefore an inactivated X. Is that something you need to do?
Best, Kasper
On Fri, Nov 11, 2022 at 3:16 PM Mikkel Meyer Andersen < @.***> wrote:
I get an error in preprocessFunnorm(), see MWE below. The data is from a cell line without the Y chromosome, which may be what causes the error? (This is maybe related to #179 https://github.com/hansenlab/minfi/issues/179.)
Any suggestions on how I can get preprocessFunnorm() to work? preprocessRaw() and preprocessIllumina() work fine. I am able to supply data (privately).
library(minfi)
packageVersion("minfi")
[1] ‘1.42.0’
rgset <- minfi::read.metharray("data/20201002/203991460101/203991460101_R01C01")
Warnings:
1: In readChar(con, nchars = n) : truncating string with embedded nuls
2: In readChar(con, nchars = n) : truncating string with embedded nuls
rgset
class: RGChannelSet
dim: 1051815 1
metadata(0):
assays(2): Green Red
rownames(1051815): 1600101 1600111 ... 99810990 99810992
rowData names(0):
colnames(1): 203991460101_R01C01
colData names(0):
Annotation
array: IlluminaHumanMethylationEPIC
annotation: ilm10b4.hg19
mset <- preprocessFunnorm(rgset)
[preprocessFunnorm] Background and dye bias correction with noob
Loading required package: IlluminaHumanMethylationEPICmanifest
Loading required package: IlluminaHumanMethylationEPICanno.ilm10b4.hg19
[preprocessFunnorm] Mapping to genome
[preprocessFunnorm] Quantile extraction
Error in kmeans(dd, centers = c(min(dd), max(dd))) :
initial centers are not distinct
@kasperdanielhansen Thanks for the fast reply. If I instead run preprocessFunnorm(rgset, sex = "F") I get this error:
> mset <- preprocessFunnorm(rgset, sex = "F")
[preprocessFunnorm] Background and dye bias correction with noob
[preprocessFunnorm] Mapping to genome
[preprocessFunnorm] Quantile extraction
[preprocessFunnorm] Normalization
Error in oobG[2, ] : subscript out of bounds
> traceback()
3: .buildControlMatrix450k(extractedData)
2: .normalizeFunnorm450k(object = gmSet, extractedData = extractedData,
sex = sex, nPCs = nPCs, verbose = subverbose)
1: preprocessFunnorm(rgset, sex = "F")
No, I only need this one sample, not mixed with others.
I get similar errors if I instead try quantile normalisation:
> mset <- preprocessQuantile(rgset)
[preprocessQuantile] Mapping to genome.
Error in kmeans(dd, centers = c(min(dd), max(dd))) :
initial centers are not distinct
> mset <- preprocessQuantile(rgset, sex = "F")
[preprocessQuantile] Mapping to genome.
[preprocessQuantile] Fixing outliers.
[preprocessQuantile] Quantile normalizing.
Error in if (ncol(mat) == 1) return(mat) : argument is of length zero
@kasperdanielhansen Do you have any idea what could cause this? Again, I can send you the IDAT files (via a private channel), if that would help.
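The `argument is of length zero` message from the last preprocessQuantile() call can be reproduced in base R. This is only a sketch of the R behaviour behind that message, not minfi's code: `if (ncol(mat) == 1)` fails this way whenever `mat` has lost its dimensions, and the assumption here (not confirmed in this thread) is that a one-column, single-sample matrix was subset without `drop = FALSE` somewhere along the way. The matrix values are made up.

```r
# A one-column matrix standing in for single-sample data (values are made up).
mat <- matrix(c(0.1, 0.5, 0.9), ncol = 1)

# Default subsetting drops the dimension: the result is a plain vector.
dropped <- mat[1:2, ]
is.null(ncol(dropped))   # ncol() of a plain vector is NULL

# `NULL == 1` is logical(0), which if() rejects with an error.
msg <- tryCatch(
  if (ncol(dropped) == 1) "one column",
  error = function(e) conditionMessage(e)
)
print(msg)

# drop = FALSE keeps the matrix shape, so the same test works.
kept <- mat[1:2, , drop = FALSE]
ncol(kept) == 1
```

If this is indeed the cause, it would point to a single-sample limitation rather than a problem with the data.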
In most preprocessing methods we need to do something special for the sex chromosomes (for example due to X inactivation). To do this well, we need to know the sex of the samples. We have a standard way of estimating the sex of the samples using kmeans, and this step fails. It could fail for a number of reasons; the top contenders are (a) you only have 1 sex (the code assumes there are both males and females) and (b) you have cancer samples with big CN changes on the sex chromosomes.
In case (a) or (b) you can override the prediction by directly supplying a vector of sex.
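The kmeans() failure can be reproduced in base R without minfi. A minimal sketch, assuming `dd` holds one numeric value per sample (the name is taken from the error message; the values below are made up): with a single sample, `min(dd)` and `max(dd)` are the same number, so the two starting centers coincide.

```r
# Hypothetical per-sample statistic; with one sample there is only one value.
dd_one <- c(-3.2)

# Same call shape as in the error message reported in this thread.
msg <- tryCatch(
  {
    kmeans(dd_one, centers = c(min(dd_one), max(dd_one)))
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# With several well-separated values the two starting centers differ
# and the clustering runs.
dd_many <- c(-3.1, -2.9, 0.3, 0.5)
fit <- kmeans(dd_many, centers = c(min(dd_many), max(dd_many)))
fit$cluster
```

The first call stops with the same "initial centers are not distinct" message as in the report, which is consistent with the prediction step needing more than one distinct value to work with.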
> In case (a) or (b) you can override the prediction by directly supplying a vector of sex.
@kasperdanielhansen I already tried with the sex = "F" argument (cf. above), and it still fails. Is there another way to supply the sex?
Ok, I am sorry, I see I basically wrote the same thing twice.
I also see you're trying to run 1 sample through functional normalization, right? That won't work. We essentially remove between-sample variation by regressing out certain confounders and that approach won't work for 1 sample processing.
Are you working in a prediction setting? If so, I would look into using Noob (in its single-sample mode, which is the default). Noob is "true" single sample normalization which means normalizing 1 sample is the same as normalizing many samples together.
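The advice above can be sketched as a small check on the number of samples. `suggest_preprocessing()` is a hypothetical helper, not part of minfi. It returns `preprocessFunnorm` for more than one sample only because that is the method originally attempted in this thread, not as a general recommendation. The commented call assumes minfi is installed and `rgset` was loaded as in the original report.

```r
# Hypothetical helper (not part of minfi): pick a method from the sample count.
suggest_preprocessing <- function(n_samples) {
  stopifnot(is.numeric(n_samples), length(n_samples) == 1, n_samples >= 1)
  # Between-sample methods have nothing to compare against with one sample.
  if (n_samples == 1) "preprocessNoob" else "preprocessFunnorm"
}

suggest_preprocessing(1)
suggest_preprocessing(12)

# With minfi installed and `rgset` loaded as in the original report:
# mset <- minfi::preprocessNoob(rgset)  # dyeMethod defaults to "single"
```

For an RGChannelSet, `ncol(rgset)` gives the number of samples, so `suggest_preprocessing(ncol(rgset))` would be the natural way to call it.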
> I also see you're trying to run 1 sample through functional normalization, right? That won't work. We essentially remove between-sample variation by regressing out certain confounders and that approach won't work for 1 sample processing.
Yes, I started with that. But I also tried preprocessQuantile; that should work with one sample, right?
> Are you working in a prediction setting? If so, I would look into using Noob (in its single-sample mode, which is the default). Noob is "true" single sample normalization which means normalizing 1 sample is the same as normalizing many samples together.
Thanks, I will try that, too.