NathanSkene / EWCE

Expression Weighted Celltype Enrichment. See the package website for up-to-date instructions on usage.
https://nathanskene.github.io/EWCE/index.html

Enable offline runs of EWCE with ExperimentHub #35

Closed Al-Murphy closed 1 year ago

Al-Murphy commented 3 years ago

Enable EWCE to run offline by using a local cache of ExperimentHub. The issue was highlighted here by a user.

This will require:

chrisclarkson commented 1 year ago

Hi, thank you for making this resource available! I too work on a cluster that is closed off from the internet (which is obviously beyond your control), which is making it difficult to use your package: [image]

I tried testing the ExperimentHub problem directly (with localHub=T): [image]

and with localHub=F: [image]

So I tried downloading the sqlite database directly:

[image]

And it seems to have worked.

Looking inside the ~/.cache/R/ExperimentHub/ databases: [image]

It seems that certain data might be missing from the resource table. But the downloaded experimenthub.sqlite data seem to be intact(?): [image]

Hence I am wondering if there is some way that I can just add the ExperimentHub data manually to your BiocFileCache database to get the package working for me?

A lofty ask, I know, but I'd appreciate any advice you can give. Kind regards

Al-Murphy commented 1 year ago

Hey,

Thanks for your message. I have been meaning to make the change so you can run EWCE offline with the localHub approach for some time now so this has convinced me to do it. I will update this issue once it's done.

In terms of your ask though, I don't think we could or should add this data to the BiocFileCache, given that adding it to the ExperimentHub database is best practice and gives us a single location for version control, access, etc. Once I make the change to enable localHub offline runs, I think your best option is to get in touch with the Bioconductor maintainer of ExperimentHub to find out how best to cache a local version of the files you need and copy them to a location in your closed cluster where they can be accessed when you run your analysis. Otherwise, it might be worth running this step locally: EWCE runs often don't use a lot of RAM, although this varies greatly depending on the dataset.

Hope this helps! Alan.

chrisclarkson commented 1 year ago

Thanks for the quick reply. So ultimately I should be able to copy some data onto my local cluster and use one of your in-house EWCE commands to add these data in a way that I can then analyse them with EWCE::plot_mean etc.?

Al-Murphy commented 1 year ago

The necessary changes have now been made (>= v1.7.1) to run EWCE offline. The affected functions have been documented and I've also updated the vignettes, so check these for more info.

On your issue, I'm not sure this will work as I haven't done it before (which is why I think you should get in touch with the maintainers of ExperimentHub), but it makes sense to me: the caching stores a version of the ExperimentHub database, so you could just copy this from your local machine to your HPC. You would need to be sure your HPC ExperimentHub call is pointing to the correct location.
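As a rough, untested sketch of the "pointing to the correct location" step: the snippet below assumes the cache directory from a machine with internet access has already been copied to the same relative path on the HPC (the ~/.cache/R/ExperimentHub location mentioned earlier in this thread; adjust `hub_cache` if it was placed elsewhere). The `hub_cache` variable is purely illustrative, and the `EXPERIMENT_HUB_CACHE` environment variable is assumed to be read when ExperimentHub is loaded, so the option is also set explicitly. Whether a copied cache is picked up cleanly is something the ExperimentHub maintainers would need to confirm.

```r
# Hypothetical directory on the HPC holding the cache copied from a
# machine with internet access (adjust to wherever the files were placed).
hub_cache <- file.path(path.expand("~"), ".cache", "R", "ExperimentHub")

# Set the cache location via the environment variable. This is assumed to be
# read when ExperimentHub is loaded, so it is set before loading the package.
Sys.setenv(EXPERIMENT_HUB_CACHE = hub_cache)

if (requireNamespace("ExperimentHub", quietly = TRUE) && dir.exists(hub_cache)) {
  # Also set the option explicitly in case the package was already loaded.
  ExperimentHub::setExperimentHubOption("CACHE", hub_cache)

  # localHub = TRUE restricts the hub to resources already in the cache,
  # so no internet connection should be needed.
  eh <- tryCatch(
    ExperimentHub::ExperimentHub(localHub = TRUE),
    error = function(e) {
      message("Could not build a local hub: ", conditionMessage(e))
      NULL
    }
  )
  if (!is.null(eh)) print(eh)
} else {
  message("ExperimentHub or the cache directory is not available: ", hub_cache)
}
```

If the hub object is created and lists the expected resources, the EWCE functions that accept a localHub argument should then be able to use them with localHub = TRUE.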

I'm going to close this issue since this is all we can do from our end with EWCE, but do feel free to reopen if you have any issues running EWCE with localHub = TRUE.