Open pfransquet opened 6 years ago
Sounds weird. Will take a look when I’m back from travel.
Hi Kasper, sorry to keep spamming you but I thought I would update you rather than send you on a wild goose chase.
Firstly, sorry for my rudimentary knowledge of methylation array analysis.
I have just started my second year of PhD and am still getting my head around R.
I think the issue stems from the creation of the raw MethylSet (preprocessRaw), which uses the “IlluminaHumanMethylationEPICmanifest” package. That package is based on the ‘MethylationEPIC_v-1-0_B2.csv’ manifest, rather than the B4 manifest that getAnnotation, read.metharray.exp and mapToGenome are built on.
I cross referenced the list of ‘missing’ probes I sent you to the B4 manifest file and found none of the 379 probes were in there at all.
Thinking something was up, I looked at the “Infinium MethylationEPIC v1.0 Missing Legacy CpG (B3 vs. B2) Annotation File”, and all of them were in there (although not exclusively, as that file also lists other probes that were not in my list)!
I don’t know the solution, but I hope this helps identify the issue (if there is one and it isn’t just me).
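For anyone wanting to repeat the cross-check, here is a minimal base R sketch of the comparison described above. The probe IDs are made-up toy values; in a real analysis the first two vectors would presumably come from `rownames()` of the MethylSet and of the mapped genomic set, and the third from the ID column of the legacy annotation file (all three variable names are placeholders, not anything from minfi).

```r
# Toy stand-ins for probe IDs (hypothetical values, for illustration only).
mset_probes   <- c("cg0001", "cg0002", "cg0003", "cg0004", "cg0005")  # before mapToGenome
gmset_probes  <- c("cg0001", "cg0003", "cg0005")                      # after mapToGenome
legacy_probes <- c("cg0002", "cg0004", "cg0099")                      # "missing legacy CpG" list

# Probes present before mapping but absent afterwards.
dropped <- setdiff(mset_probes, gmset_probes)

# Is every dropped probe listed in the legacy file?
all_in_legacy <- all(dropped %in% legacy_probes)

# Legacy-file probes that were not among the dropped ones.
legacy_only <- setdiff(legacy_probes, dropped)

print(dropped)
print(all_in_legacy)
print(legacy_only)
```

With real data, a non-empty `legacy_only` would correspond to the "other probes not in my list" mentioned above.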
Kind Regards, Pete
Replying to Kasper Daniel Hansen's comment of Thursday, 2 August 2018 on [hansenlab/minfi] mapToGenome drops probes "ilm10b4" annotation (#171).
It'll be helpful if you include the code you run to get an error.
No problem. There are no errors in any of the steps leading up to and including the creation of the GenomicMethylSet (gmset), just a discrepancy in probe numbers.
However, the first error I run into is when I try to check whether the samples are sex-matched in the gmset using plotSex:
sex.pred=getSex(gmset) ##works fine
plotSex(sex.pred, id = ifelse(targets$Gender == 1, 'M', 'F'))
resulting in error:
Error in all(c("predictedSex", "xMed", "yMed") %in% colnames(colData(object))) :
unable to find an inherited method for function ‘colData’ for signature ‘"DataFrame"’
I skipped this as I know all the samples are sex concordant (when using the b2 annotation).
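A possible way around the plotSex error, sketched under assumptions: the error message suggests plotSex looks for predictedSex, xMed and yMed in `colData()` of the object it is given, so passing the DataFrame returned by getSex fails. The minfi help page for getSex stores the result on the object with `addSex()` first. The wrapper name `plot_sex_check` and its `gender` argument are hypothetical, and this has not been run against the data in this thread.

```r
# Sketch only: assumes minfi is installed, `gmset` is a GenomicMethylSet and
# `gender` is a vector coded 1 = male (placeholder names, not from minfi).
plot_sex_check <- function(gmset, gender) {
  # Predict sex from the X/Y chromosome probe intensities.
  sex.pred <- minfi::getSex(gmset)
  # Store the prediction on the object, which is where plotSex() looks for it.
  gmset <- minfi::addSex(gmset, sex = sex.pred)
  # Plot using the object itself rather than the getSex() DataFrame.
  minfi::plotSex(gmset, id = ifelse(gender == 1, "M", "F"))
  invisible(gmset)
}
```

Usage would be something like `gmset <- plot_sex_check(gmset, targets$Gender)`.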
So when I went on to remove the failed probes from the gmset (those that do not have a detection p-value < 0.01 in every sample), I created a 'keep' logical (which has 866,238 probes assessed as TRUE or FALSE; around 20k are FALSE)
keep = rowSums(detp< 0.01) == ncol(gmset)
and then used it to get rid of the failed probes from the gmset (which has 865,859 probes)
gmset.fl = gmset[keep==TRUE,]
which returns the error:
Error: subscript is a logical vector with out-of-bounds TRUE values
detp (from detectionP) is computed from the rgset (from read.metharray.exp), so it could be an issue with detectionP too: although the probes are removed from the annotation, their data is still present from the array.
I understand there may be a workaround that avoids that error when removing the failed probes, but I was thinking it would be better to remove the probes prior to normalisation.
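One such workaround, as a minimal base R sketch rather than a definitive fix: subset the detection p-value matrix by probe name so that it has the same rows, in the same order, as the mapped object before building `keep`. The matrix and the `gmset_probes` vector below are toy stand-ins; with real objects `gmset_probes` would presumably be `rownames(gmset)`.

```r
# Toy detection p-value matrix: 5 probes x 2 samples (stand-in for detp).
detp <- matrix(
  c(0.001, 0.002,
    0.500, 0.001,
    0.003, 0.004,
    0.001, 0.900,
    0.002, 0.001),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("cg000", 1:5), c("S1", "S2"))
)

# Toy stand-in for the probes left after mapToGenome dropped two of them.
gmset_probes <- c("cg0001", "cg0003", "cg0004")

# Align detp to the surviving probes, by name and in the same order.
detp_aligned <- detp[match(gmset_probes, rownames(detp)), , drop = FALSE]

# A probe is kept only if it passes the threshold in every sample.
keep <- rowSums(detp_aligned < 0.01) == ncol(detp_aligned)
kept_probes <- gmset_probes[keep]

print(kept_probes)
```

Because `keep` now has one entry per row of the mapped object, using it as a row subscript no longer runs past the end of the object, which is what the out-of-bounds error was complaining about.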
Thank you for your time, Pete
(@pfransquet I edited your post to use markdown. This formatting is really useful on GitHub, especially for separating code and output from text. Check out https://guides.github.com/features/mastering-markdown/ to learn more)
Thank you @PeteHaitch, I'll make sure I use that format from now on!
Hi,
Just checking whether anyone has figured anything out regarding the probe number discrepancy. We are comparing results from Partek (a commercial software package) and the minfi pipeline. We are finding some discrepancies between the results, and one thing we found was that the starting numbers of probes differ: for Partek, which uses the Illumina manifest file, it is 865818, whereas for IlluminaHumanMethylationEPICanno.ilm10b4.hg19 it is 865859.
Just checking whether there is any particular reason behind this discrepancy.
Thanks, Surajit
Hi all, I am redoing a methylation analysis of EPIC data using the updated ilm10b4 annotation (rather than ilm10b2). After cleaning and normalising the data, mapping to the genome with "mapToGenome" works, but the number of probes drops from 866,238 to 865,859 (-379 probes).
So further on, when I go to remove the failed probes identified with the detectionP function, I get the error: subscript is a logical vector with out-of-bounds TRUE values
This is because I have more probes to keep than there are actual probes in my genome set.
When using the ilm10b2 annotation it worked fine; both the methyl set and the genome set had exactly the same number of probes (866,238).
Any help with this issue would be greatly appreciated.
Kind Regards, Pete