NathanSkene / EWCE

Expression Weighted Celltype Enrichment. See the package website for up-to-date instructions on usage.
https://nathanskene.github.io/EWCE/index.html

EWCE 2.0 #47

Closed bschilder closed 2 years ago

bschilder commented 2 years ago

Note on branches

I previously made all my changes to the forked repo bschilder/EWCE on the DelayedArray branch. However, something got screwed up with the git history in that fork, and the only way to push to the main repo was to clone the main repo (NathanSkene/EWCE), create a new branch called bschilder_dev, copy and paste all the files with my DelayedArray branch's edits into this repo, and then make a Pull Request from there. This means you won't have the incremental commits I made as I upgraded EWCE, but at least everything will be harmonized moving forward. I'll delete my old forked repo and make any future edits on this NathanSkene/EWCE@bschilder_dev branch.

Upgrades & new features

To do

bschilder commented 2 years ago

@NathanSkene some of these checks might fail, actually, bc we need to add some variables to GitHub Secrets for this repo. It's really quick and easy, so I can walk you through it via Slack on Monday

Al-Murphy commented 2 years ago

Update DESCRIPTION Version to 1.3.1 to reflect latest release

bschilder commented 2 years ago

.github/workflows/check-bioc-docker.yml:

run_docker: 'false' - Is this parameter in use? It is pushing to Docker with each commit now, correct? Also, can you test to ensure a push to Docker won't occur if a check fails (either in EWCE or orthogene)?

This feature is not being used at the moment. Ideally, it would be most efficient to build the Docker container once, test it, and push to DockerHub all in one workflow, but I couldn't get this working smoothly and opted to just use the separate dockerhub.yml workflow based on scFlow.

bschilder commented 2 years ago

Checklist

Alan and I discussed this. While naming this update 2.0 might be useful for remembering which version had major changes, this isn't allowed by Bioconductor (they dictate the version changes according to their devel/release schedule). Instead, I'll document these updates in the NEWS, README, and vignettes.

R/assign_cores.r:

R/bin_columns_into_quantiles.r

R/bootstrap_enrichment_test.R:

R/calculate_meanexp_for_level.R

R/calculate_specificity_for_level.R

R/controlled_geneset_enrichment.r

R/create_list_network.R

I just tried to break up the code by putting it into smaller functions. You'll have to ask @NathanSkene regarding what this does exactly.

R/create_quadrants.R

Same as above. @NathanSkene

R/ctd_to_sce.R

R/delayedarray_normalize.R

R/drop_nonexpressed_cells.R

R/drop_nonexpressed_genes.R

R/drop_uninformative_genes.r

Not sure why this parameter was originally chosen. @NathanSkene ?

R/dt_to_df.R

R/ewce_expression_data.r


R/ewce_plot.r

R/filter_variance_quantiles.r

R/get_ctd_levels.R

R/get_exp_data_for_bootstrapped_genes.R

From original EWCE. Ask @NathanSkene

R/get_sig_results.R

R/is_celltypedataset.R

R/is_delayed_array.R

R/is_matrix.R

R/is_sparse_matrix.R

R/max_ctd_depth.R

R/merge_sce_list.R

R/message_parallel.R / R/messager.R / R/myScalesComma.R

R/package.R

The package.R file is special: it gives users a description of the package when they run ?EWCE.

R/plot_bootstrap_plots.R

From the original EWCE. Ask @NathanSkene

R/plot_log_bootstrap_distributions.R / R/plot_with_bootstrap_distributions.R

R/rNorm.R

From the original EWCE. Ask @NathanSkene

R/read_ctd.R / R/sce_merge_comparable_levels.R / R/zeisel2018_functions.r

They're works in progress.

R/run_deseq2.R

Would be a great add-on but I'd ask that you implement any additional DGE methods you'd like to use.

R/sce_lists_apply.R / R/to_dataframe.R / R/to_delayed_array.R / R/to_sparse_matrix.R

README.Rmd

Correct, this will be updated when this branch is merged

Not sure what you mean. Feel free to edit the README further after merging. These are the current instructions.

if (!require("BiocManager")){install.packages("BiocManager")}

BiocManager::install("EWCE") 

Getting started vignette:

Extended vignette:

Create CTD vignette:

Docker vignette:

All vignettes are accessible via the "Articles" tab of the docs website. Will add a link in the README as well.

Vignettes in general:

The new GHA workflow automatically rebuilds the whole docs website with pkgdown and pushes to the gh-pages branch, so manually rebuilding is no longer necessary.


Code coverage is now >84%. All CRAN/Bioc checks are passing locally, but it does take a while. We may have to do some further optimization with tests/examples to keep it below the 15m limit.

Al-Murphy commented 2 years ago

In the DESCRIPTION, you have now changed the version to 1.3.2 since you made additional changes. However, since this first batch of changes wasn't pushed to Bioconductor, it should still be 1.3.1. Can you revert?

bschilder commented 2 years ago

> In the DESCRIPTION, you have now changed the version to 1.3.2 since you made additional changes. However, since this first batch of changes wasn't pushed to Bioconductor, it should still be 1.3.1. Can you revert?

Oh I see, I didn't realize that Bioc doesn't let you skip versions. Np, changing it back.

Al-Murphy commented 2 years ago

Notes:

> checking installed package size ... NOTE
    installed size is  6.9Mb
    sub-directories of 1Mb or more:
      doc   6.3Mb

> checking top-level files ... NOTE
    Non-standard file/directory found at top level:
      ‘doc’

bschilder commented 2 years ago

Add documentation (description and parameters) to:

I'll try to do some of these, but as I mentioned before many of them have parameters that were named by @NathanSkene and I'm unsure what they are. I think this is why we wanted to have the meeting first.

In the meantime, the best I can do for some parameters is simply repeating what the argument name is. I am marking all of these with (#fix) so we can know where they are.

@param hit.exp hit.exp (#fix)
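To keep track of which placeholders still need real descriptions, the (#fix) markers can be listed programmatically. The following is only a minimal sketch: `find_fix_markers` is a hypothetical helper (not part of EWCE), and it assumes the package sources live in an `R/` directory with `.R` or `.r` extensions.

```r
# Hypothetical helper (not part of EWCE): list every line in the package's
# R source files that still carries the "(#fix)" placeholder marker.
find_fix_markers <- function(dir = "R") {
  # Match both .R and .r extensions, since the repo uses both
  files <- list.files(dir, pattern = "\\.[rR]$", full.names = TRUE)
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    # fixed = TRUE treats "(#fix)" as a literal string, not a regex
    idx <- grep("(#fix)", lines, fixed = TRUE)
    if (length(idx) == 0) {
      return(NULL)
    }
    data.frame(
      file = f,
      line = idx,
      text = trimws(lines[idx]),
      stringsAsFactors = FALSE
    )
  })
  # Returns NULL when no markers (or no files) are found
  do.call(rbind, hits)
}
```

Running `find_fix_markers("R")` from the package root would return one row per remaining placeholder, or `NULL` once all of them have been replaced.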

Which original normalization function are you referring to? sct_normalize? I wouldn't expect delayedarray_normalize to be exactly the same since it's a different procedure from what SCT does. However, I can at least confirm that DelayedArrays do indeed work as input to sct_normalize.

I would not expect the results to be exactly the same, since EWCE now uses gene backgrounds generated by orthogene. The goal of this was to improve the accuracy of results EWCE produces (due to more comprehensive ortholog data), in addition to expanding EWCE's applicability to other species.

However, the tests I do have ensure that at least some of the top cell-types are still enriched (see test-bootstrap_enrichment_test.R)

Two CRAN check notes that still need to be sorted:

I first tried preventing all code chunks from running in the extended vignette. But this barely reduced the total package size (6.2 --> 6 Mb). What made a huge difference was merging the vignettes. I guess you were right @Al-Murphy, there really is a ton of overhead per vignette!

By merging all 5 vignettes into just 2, I've now managed to get the whole package under 5Mb.

> checking installed package size ... NOTE
    installed size is  6.9Mb
    sub-directories of 1Mb or more:
      doc   6.3Mb

Avoided this by adding ^doc$ to the .Rbuildignore file.
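As a rough illustration of why the anchored pattern matters: entries in .Rbuildignore are treated as Perl-style, case-insensitive regular expressions matched against paths relative to the package root. The sketch below only mimics that matching with `grepl`; the example paths are hypothetical and this is not how R CMD build is implemented internally.

```r
# Patterns as they would appear, one per line, in .Rbuildignore
ignore_patterns <- c("^doc$")

# Hypothetical paths relative to the package root
paths <- c("doc", "docs", "inst/doc", "doc_old")

# Approximate the build-time matching: a path is ignored if any
# pattern matches it (Perl-style regex, case-insensitive)
is_ignored <- vapply(
  paths,
  function(p) {
    any(vapply(
      ignore_patterns, grepl, logical(1),
      x = p, perl = TRUE, ignore.case = TRUE
    ))
  },
  logical(1)
)

print(is_ignored)
```

Because of the `^` and `$` anchors, only the top-level `doc` directory is excluded; `docs` and `inst/doc` are left alone.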

bschilder commented 2 years ago

Yay!!! 🍾

NathanSkene commented 2 years ago

Great work, very happy to see this, getting EWCE working with big datasets was basically the first thing I wanted done back when the lab started, thanks for getting it here!

Merry Christmas!


From: Brian M. Schilder Sent: 23 December 2021 11:27 To: NathanSkene/EWCE Cc: Skene, Nathan G; Mention Subject: Re: [NathanSkene/EWCE] EWCE 2.0 (PR #47)


Yay!!! 🍾
