zwdzwd / sesame

🍪 SEnsible Step-wise Analysis of DNA MEthylation BeadChips

sesameDataCache() fails with no error #145

Open bethan-mallabar-rimmer opened 10 months ago

bethan-mallabar-rimmer commented 10 months ago

I reinstalled sesame today with BiocManager::install("zwdzwd/sesame"), BiocManager::install("zwdzwd/sesameData") and BiocManager::install("Bioconductor/ExperimentHub")

Old versions:

#sesame    sesameData ExperimentHub 
#"1.20.0"      "1.20.0"      "2.10.0" 

New versions:

#sesame    sesameData ExperimentHub 
#"1.21.7"      "1.21.9"      "2.11.1" 

Then ran sesameDataCache() which returned:

Metadata (N=95):
|==============================================================================================| 100%
(1/95) EH8551:

....etc

(84/95) EH7338:

  |==============================================================================================| 100%

But it stops at this point and does not cache the remaining files. The function just stops running with no error.

I tried running it again:

sesameDataCache()

Metadata (N=11):

Again the function stops and does not cache the remaining 11 files, but there is no error message.

Is there a fix for this please?
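The version tables above can be reproduced with base R alone. The snippet below is a minimal sketch that reports the installed version of each of the three packages, or `NA` when one is missing. It only reads package metadata and does not diagnose why `sesameDataCache()` stops.

```r
# Report the R version and the installed versions of the three packages
# that the thread says must match. system.file() is used to test for
# presence so that no package namespace has to be loaded.
pkgs <- c("sesame", "sesameData", "ExperimentHub")

versions <- vapply(pkgs, function(p) {
  if (nzchar(system.file(package = p))) {
    as.character(utils::packageVersion(p))
  } else {
    NA_character_  # package not installed in this library
  }
}, character(1))

cat(R.version.string, "\n")
print(versions)
```

Running this before and after an upgrade makes it easy to confirm which combination of versions was actually in use.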

tamefelis commented 10 months ago

Do you have any existing RData files that are affecting the caching process?

Perhaps try deleting all RData/history files and caching again.

bethan-mallabar-rimmer commented 10 months ago

Thanks for the suggestion - I deleted the old sesame 1.20 cache (stored in /Users/[username-here]/Library/Caches/org.R-project.R/R/ExperimentHub on mac) before updating to sesame 1.21 but did not delete any RData/history files, so you may be right.

In the end I reverted to sesame 1.20 again, but this also caused issues and only worked after uninstalling R and RStudio, deleting all packages and history files, and reinstalling everything. So I'm a bit reluctant to try again, but may try on a spare laptop
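The cache path mentioned above matches the default per-user cache directory that R assigns to a package named ExperimentHub. As a rough sketch, and assuming no custom cache location has been configured, base R can show where that directory should be and whether anything is in it:

```r
# Default per-user cache directory for a package called "ExperimentHub".
# tools::R_user_dir() requires R >= 4.0. This is an assumption about where
# the cache lives; a custom location would not be found this way.
cache_dir <- tools::R_user_dir("ExperimentHub", which = "cache")
cat("Expected cache directory:", cache_dir, "\n")

if (dir.exists(cache_dir)) {
  files <- list.files(cache_dir, full.names = TRUE)
  size_mb <- round(sum(file.size(files), na.rm = TRUE) / 1e6, 1)
  cat("Entries in cache:", length(files), "\n")
  cat("Total size (MB):", size_mb, "\n")
} else {
  cat("No cache directory found at that path.\n")
}
```

Checking the entry count before and after a `sesameDataCache()` run is one way to see whether any files were actually written.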

tamefelis commented 10 months ago

Yeah,

if your analyses don't take long to rerun, you might consider not saving the Rhistory files.

You can also try running some functions to fix (pin) the versions of the packages that you are using.

Hope your issue can be resolved in the near future!

seq101 commented 8 months ago

@zwdzwd @tamefelis @bethan-mallabar-rimmer Any progress on this issue? I can't get the inferSex function to work with EpicV2Array. The problem is related to the fact that sesameDataCache is not working. I am using the developer's versions and tried several times to restart by deleting RData and Cache folder - NOTHING WORKS!

openSesame(sesameDataGet("EPICv2.8.SigDF")[[1]])
Error in stopAndCache(title) :
| File EPICv2.8.SigDF either not found or needs to be cached to be
| used in sesame.
| Please make sure you have updated ExperimentHub and try
| > sesameDataCache("EPICv2.8.SigDF")
| or download all data
| > sesameDataCache()
| to retrieve and cache needed sesame data.

sesameDataCache()
Metadata (N=11):

sesame_checkVersion()
SeSAMe requires matched versions of R, sesame, sesameData and ExperimentHub.
Here is the current versions installed:
R: 4.3.2
Bioconductor: 3.18
sesame: 1.21.12
sesameData: 1.21.10
ExperimentHub: 2.11.1

seq101 commented 8 months ago

@zwdzwd How to get inferSex to work with Epic V2? I am stuck at sesameDataCache()

zwdzwd commented 8 months ago

Can you just paste this function? https://github.com/zwdzwd/sesame/blob/a5b87518f89bb3fc7c7ac6f6b58037602648b143/R/sex.R#L35 without the platform <- sesameData_check_platform(platform, names(betas)) which would require the cache.

Meanwhile I will see why sesameDataCache() is not working, it's really puzzling.

seq101 commented 8 months ago

@zwdzwd Can you please describe what data structure 'betas' is supposed to be? Is this supposed to be done sample by sample? The documentation is not clear on this at all, while it's amazing in other areas.

Gender prediction and age prediction are among the most useful and valuable functions of SeSAMe. It would be really nice for new users to have clear instructions. My betas object currently holds betas for all samples:

betas = openSesame(idat_dir, BPPARAM = BiocParallel::MulticoreParam(2))

The documentation here shows sdf as the input to inferSex, but that seems wrong? https://bioconductor.org/packages/release/bioc/vignettes/sesame/inst/doc/inferences.html

Also, please link this in the 'Age & Epigenetic Clock' section of the Vignettes as it took me very long to find the right file for Epic V2: https://github.com/zhou-lab/InfiniumAnnotationV1/tree/main/Anno/EPICv2

PS: Communication with Illumina sales informed us that going forward only Epic V2 microarray will be supported.

seq101 commented 8 months ago

@zwdzwd Please reply to my query here. The documentation is very unclear on what the input of the inferSex function is (betas or SigDF).

This line indicates that the input is a SigDF, but it's being called betas? https://github.com/zwdzwd/sesame/blob/a5b87518f89bb3fc7c7ac6f6b58037602648b143/R/sex.R#L17

seq101 commented 8 months ago

> Can you just paste this function?
>
> https://github.com/zwdzwd/sesame/blob/a5b87518f89bb3fc7c7ac6f6b58037602648b143/R/sex.R#L35
>
> without the platform <- sesameData_check_platform(platform, names(betas)) which would require the cache. Meanwhile I will see why sesameDataCache() is not working, it's really puzzling.

@zwdzwd Pasting the function doesn't work. Please help. Did you try this on EpicV2 data? I am getting errors.

zwdzwd commented 8 months ago

The input is a beta value vector, not a SigDF. Sorry about the confusion, but where did it say SigDF? In the older version we did use SigDF, but I thought everything had been updated.

zwdzwd commented 8 months ago

Oh, I know what's confusing: in the example, the output of openSesame is beta values, but the input to openSesame is a SigDF. But you don't always need to call from openSesame. Yes, EPICv2 should work as shown in the example.

seq101 commented 8 months ago

@zwdzwd Thank you for clearing the confusion. One last thing, can you give an example of calling betas on multiple samples and using inferSex?

betas = openSesame(idat_dir, BPPARAM = BiocParallel::MulticoreParam(2)) # LOAD BETAS for multiple samples

How do I call inferSex from here? Will it be sample by sample, and do we need to loop through? I am sorry for asking this, but it would be very helpful to your users who don't understand the details of how inferSex works.

zwdzwd commented 8 months ago

Something like apply(betas, 2, inferSex) or lapply(betas, inferSex)?

seq101 commented 8 months ago

lapply(betas, inferSex)

Error in density.default(na.omit(vals)) : need at least 2 points to select a bandwidth automatically
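One plausible reading of this error, assuming betas is a probes-by-samples numeric matrix, is that `lapply()` walks over a matrix one element at a time, so each call receives a single value rather than a whole sample. The sketch below uses a toy matrix and a hypothetical stand-in function (`n_values`, not part of sesame) to show the difference from `apply(..., 2, ...)`:

```r
# Toy stand-in for a probes-by-samples beta matrix (values are random).
set.seed(1)
betas <- matrix(runif(12), nrow = 4,
                dimnames = list(paste0("probe", 1:4), c("s1", "s2", "s3")))

# Hypothetical per-sample function; it only reports how many values it got.
n_values <- function(b) length(b)

# lapply() treats the matrix as a plain vector: 12 calls, one value each.
per_element <- lapply(betas, n_values)
length(per_element)          # 12
unique(unlist(per_element))  # 1

# apply(..., 2, ...) passes each column (one sample) as a named vector.
per_sample <- apply(betas, 2, n_values)
per_sample                   # s1 = 4, s2 = 4, s3 = 4
```

A function that estimates a density from its input would fail on a single value, which is consistent with the "need at least 2 points" message, so the column-wise `apply()` form is the one that hands each sample over intact.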

seq101 commented 8 months ago

@zwdzwd I think I know what may be the problem. The probe names in hypoMALE and hyperMALE are not matching.

For example, the first probe of hyperMALE is cg26359388, but it exists as "cg26359388_BC21" in my betas vector. Have you come across this problem? Similarly, none of the probes in the hypoMALE or hyperMALE list that have an underscore exist in my betas vector.
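The mismatch described here can be checked with base R. The sketch below is illustrative only: the single ID `cg26359388_BC21` comes from this thread, while the other probe IDs, the beta values, and the `reference` list are made up. It counts matches before and after stripping the suffix. It does not handle several EPICv2 probes sharing one prefix, so it is a diagnostic rather than a conversion step.

```r
# Toy beta vector with EPICv2-style names; only cg26359388_BC21 is taken
# from the thread, the other IDs and all values are hypothetical.
betas <- c(cg26359388_BC21 = 0.81,
           cg00000001_TC11 = 0.12,
           cg00000002_BC11 = 0.55)

# Hypothetical reference probe list that uses prefix-only names.
reference <- c("cg26359388", "cg00000002")

# Direct matching finds nothing because of the suffixes.
sum(names(betas) %in% reference)  # 0

# Strip everything from the first underscore onward, then match again.
prefixes <- sub("_.*$", "", names(betas))
sum(prefixes %in% reference)      # 2
```

If the second count is much larger than the first on real data, the probe-name suffixes are the likely reason the reference probes are not being found.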

zwdzwd commented 8 months ago

Yes, that's where the mLiftOver can be used. https://github.com/zwdzwd/sesame/blob/devel/R/mLiftOver.R Please see https://www.biorxiv.org/content/10.1101/2024.03.18.585415v1 for detail.

something like inferSex(mLiftOver(betas, "EPIC"))

seq101 commented 8 months ago

> Yes, that's where the mLiftOver can be used. https://github.com/zwdzwd/sesame/blob/devel/R/mLiftOver.R Please see https://www.biorxiv.org/content/10.1101/2024.03.18.585415v1 for detail.
>
> something like inferSex(mLiftOver(betas, "EPIC"))

It would be good to document these things in the Vignette. I know you are likely going to. Going forward Illumina will only be offering EPICv2 and MSA so demand for analyzing those 2 microarrays will increase.

How does inferSex(mLiftOver(betas, "EPIC")) work with lapply?

seq101 commented 8 months ago

@zwdzwd The call betas <- mLiftOver(betas, "EPIC") doesn't work:

Error in dplyr::full_join():
! Join columns in x must be present in the data.
✖ Problem with prefix.
Run rlang::last_trace() to see where the error occurred.
Warning message:
Unknown or uninitialised column: ID_source.

arualemsti commented 6 months ago

> Thanks for the suggestion - I deleted the old sesame 1.20 cache (stored in /Users/[username-here]/Library/Caches/org.R-project.R/R/ExperimentHub on mac) before updating to sesame 1.21 but did not delete any RData/history files, so you may be right.
>
> In the end I reverted to sesame 1.20 again, but this also caused issues and only worked after uninstalling R and RStudio, deleting all packages and history files, and reinstalling everything. So I'm a bit reluctant to try again, but may try on a spare laptop

Hello @bethan-mallabar-rimmer, how were you able to revert back to the old 1.20 sesameData version?

Thank you!

bethan-mallabar-rimmer commented 6 months ago

> > Thanks for the suggestion - I deleted the old sesame 1.20 cache (stored in /Users/[username-here]/Library/Caches/org.R-project.R/R/ExperimentHub on mac) before updating to sesame 1.21 but did not delete any RData/history files, so you may be right. In the end I reverted to sesame 1.20 again, but this also caused issues and only worked after uninstalling R and RStudio, deleting all packages and history files, and reinstalling everything. So I'm a bit reluctant to try again, but may try on a spare laptop
>
> Hello @bethan-mallabar-rimmer, how were you able to revert back to the old 1.20 sesameData version?
>
> Thank you!

Apologies for the delayed response. For reference, after uninstalling and reinstalling the entire R setup as described above (so, not a solution I can recommend), I installed sesame from Bioconductor using BiocManager::install('sesame') instead of from GitHub using BiocManager::install('zwdzwd/sesame'). This worked because the version of sesame on Bioconductor at the time was 1.20. It is now 1.22 and it looks like version 1.20 is no longer on Bioconductor anywhere, so unfortunately this won't work.

@zwdzwd has the issue with sesameDataCache() been solved please?

If the issue is solved it might be best to use the latest version instead of 1.20.