Bioconductor / Contributions

Contribute Packages to Bioconductor
134 stars 33 forks source link

memes #1997

Closed snystrom closed 3 years ago

snystrom commented 3 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 3 years ago

Hi @snystrom

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: memes
Type: Package
Title: motif matching, comparison, and de novo discovery using the MEME Suite
Version: 0.99.0
Authors@R: person("Spencer", "Nystrom", 
    email = "nystromdev@gmail.com", 
    role = c("aut", "cre", "cph"),
    comment = c(ORCID = "0000-0003-1000-1579"))
Description: A seamless interface to the MEME Suite family of tools for motif
    analysis. 'memes' provides data aware utilities for using GRanges objects as
    entrypoints to motif analysis, data structures for examining & editing motif
    lists, and novel data visualizations. 'memes' functions and data structures are
    amenable to both base R and tidyverse workflows.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Imports:
    Biostrings,
    dplyr,
    cmdfun (>= 1.0.2),
    GenomicRanges,
    ggplot2,
    ggseqlogo,
    magrittr,
    matrixStats,
    methods,
    processx,
    purrr,
    rlang,
    readr,
    stats,
    tools,
    tibble,
    tidyr,
    utils,
    usethis,
    universalmotif (>= 1.9.3),
    xml2
Suggests:
    cowplot,
    BSgenome.Dmelanogaster.UCSC.dm3,
    BSgenome.Dmelanogaster.UCSC.dm6,
    forcats,
    testthat (>= 2.1.0),
    knitr,
    MotifDb,
    pheatmap,
    PMCMRplus,
    plyranges (>= 1.9.1),
    rmarkdown,
    covr
biocViews: DataImport, FunctionalGenomics, GeneRegulation, MotifAnnotation, MotifDiscovery, SequenceMatching, Software
RoxygenNote: 7.1.1
SystemRequirements: Meme Suite (v5.1.1 or above) <http://meme-suite.org/doc/download.html>
VignetteBuilder: knitr
URL: https://snystrom.github.io/memes/, https://github.com/snystrom/memes
BugReports: https://github.com/snystrom/memes/issues
Depends: 
    R (>= 4.1)
Config/testthat/edition: 3
bioc-issue-bot commented 3 years ago

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account

bioc-issue-bot commented 3 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/memes to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: e9f6cda5821a3c50c5ec53beb69393e1b7dc4c51

bioc-issue-bot commented 3 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/memes to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: be0466376370440550e3f1dc83fce7eb7bb6bf71

bioc-issue-bot commented 3 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/memes to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 705c8f233e349864e3436acd64da9d0b807cbf0e

bioc-issue-bot commented 3 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/memes to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

snystrom commented 3 years ago

This isn't in the install instructions vignette (yet), but I also built a docker container on top of the bioconductor_docker container that has the MEME Suite dependency installed, in case this helps review the package. The container is rebuilt on schedule with the bioc container builds so it's up to date.

It can be accessed as follows:

docker pull snystrom/bioconductor_docker_meme:devel
docker run -e PASSWORD=<password> -p 8787:8787 snystrom/bioconductor_docker_meme
vjcitn commented 3 years ago

Please run BiocCheck in your package source folder as checked out from bioconductor's git repo. I find

$warning
[1] "The following files are over 5MB in size: '.git/objects/pack/pack-a368439aa4aa7ee3afa916aecc0cc4b7d87febad.pack' "

$note
[1] "Avoid '<<-' if possible (found in 3 files)"                                                                                                                                        
[2] "Recommended function length <= 50 lines."                                                                                                                                          
[3] "Usage of dontrun{} / donttest{} found in man page examples."   

and it would be great to clear these, specifically avoiding a large git checkout.

I also noticed in a vignette "Tidying motif metadata":

As of this writing in May 2020, the FBgn entries are out of date

It looks like there are various "TODO"s around. Let me know when you think I should go over this again.

snystrom commented 3 years ago

Thanks, @vjcitn. I think I've resolved the minor comments (I'll post an update when complete), but wanted to ping you on this issue first:

Please run BiocCheck in your package source folder as checked out from bioconductor's git repo. I find

$warning
[1] "The following files are over 5MB in size: '.git/objects/pack/pack-a368439aa4aa7ee3afa916aecc0cc4b7d87febad.pack' "

I went back and ran bfg to clear up the pack files and can't quite get this resolved, although I did drop the size by ~6MB. It's currently floating around 14MB. This is coming in a large part from the gh-pages branch because it has large figures. I'm not sure how to get this down without completely dumping the website, which I think is a valuable resource. Other Bioc packages have a similar feature (ComplexHeatmap comes to mind) which actually causes this same warning. If you have any suggestions for cutting this down I am happy to implement them.

nturaga commented 3 years ago

Hi, you can purge the branches if you think the other branches are causing this issue. Before that, you can back up your repository locally or duplicate it into another repo on Github for the branches. Just a suggestion.

snystrom commented 3 years ago

Thanks @nturaga don't know why I didn't think of that. I did some fiddling and purged the gh-pages branch on a local mirror and that does clear it up (pack goes from 14MB to ~1MB), so that could be the nuclear option.

I guess it is technically possible to get the pkgdown site hosted from another repo that mirrors my dev repo in order to keep the clone size small for you guys. Certainly a lot of moving parts, but doable. Have you run into this issue with other packages and are they able to keep their pkgdown branches together with the source code?

I may try an approach to get knitr to compress the figures to see if that keeps the size low since that would be the simplest solution.

Regardless of the solution, it will probably require a force push from my end since I'm rewriting git history. Should I ping here, or just go ahead and email the devel listserv directly when the time comes for that?

nturaga commented 3 years ago

Email the devel list (bioc-devel).

Glad it worked.

From: Spencer Nystrom @.> Reply-To: Bioconductor/Contributions @.> Date: Wednesday, March 31, 2021 at 4:53 PM To: Bioconductor/Contributions @.> Cc: Nitesh Turaga @.>, Mention @.***> Subject: Re: [Bioconductor/Contributions] memes (#1997)

Thanks @nturagahttps://github.com/nturaga don't know why I didn't think of that. I did some fiddling and purged the gh-pages branch on a local mirror and that does clear it up (pack goes from 14MB to ~1MB), so that could be the nuclear option.

I guess it is technically possible to get the pkgdown site hosted from another repo that mirrors my dev repo in order to keep the clone size small for you guys. Certainly a lot of moving parts, but doable. Have you run into this issue with other packages and are they able to keep their pkgdown branches together with the source code?

I may try an approach to get knitr to compress the figures to see if that keeps the size low since that would be the simplest solution.

Regardless of the solution, it will probably require a force push from my end since I'm rewriting git history. Should I ping here, or just go ahead and email the devel listserv directly when the time comes for that?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/Bioconductor/Contributions/issues/1997#issuecomment-811457887, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AAU6QS5QMETWGRVBFPPCKGTTGODULANCNFSM42AN3FAQ.

LiNk-NY commented 3 years ago

FWIW, you may have re-packed the packfiles by pushing to a remote location. You can also do this manually by doing git gc. https://git-scm.com/book/en/v2/Git-Internals-Packfiles

snystrom commented 3 years ago

Thanks, Marcel, I did double check that. The issue was kind of two-fold. I originally hosted the site in master under docs/ then migrated to the gh-pages branch. What this means is even deleting the gh-pages branch kept the older docs in the master commit history, so the double-whammy of deleting the gh-pages plus bfg to drop docs from the commit history was what ultimately cleared it up.

I've moved the site to a separate repo that gets autodeployed on push and never touches the main repo so the size issue is now resolved! That coupled with a redirect from the original url and the site works perfectly, in case this issue shows up for others in the future.

bioc-issue-bot commented 3 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: a5c44ee865386f8eb0d91b5d10438fcbd0828d02

bioc-issue-bot commented 3 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/memes to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 1e9cd95c234b4cdff27075a2320bfad8632cfd07

bioc-issue-bot commented 3 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/memes to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 1b4a2806c1f1140d67bdfc88bf6e51117bd7de45

bioc-issue-bot commented 3 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/memes to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

snystrom commented 3 years ago

Alright, @vjcitn I got the git repo size down. The package is currently throwing an ERROR on windows but it's upstream from rtracklayer so I think it's good to go for a second look.

In response to the biocCheck notes:

[1] "Avoid '<<-' if possible (found in 3 files)"

In this case, I'd prefer to keep my <<- if at all possible. Their use is well scoped inside a purrr::map loop (i.e. the next environment up is scoped within my function environment and won't affect the user's env). I could redo this with a couple of for loops, but purrr gives me a little extra type safety for the final return value and a little extra speed which I'd like to hold onto if that's alright.

[2] "Recommended function length <= 50 lines."

I went through and cleaned up where I could, but it's going to be tricky to get this fully resolved. I generally dislike long functions (only 8 / 188 in the package are >50 lines), but the few instances I have big ones are big for a reason. For example, the plot_ame_heatmap function has lots of plotting code, so it's a little easier to have everything in 1 big function than break things up to avoid bugs from getting the nonstandard eval correct. Since it's difficult to unit test plots, I decided on this approach to hopefully limit bugs. ame_order_by_cluster is large because I did a little ASCII cartoon in the comments to explain the algorithm which is relatively short. In general, I use linebreaks rather frequently since it makes it easier for me to read, so that may be contributing to this as well.

[3] "Usage of dontrun{} / donttest{} found in man page examples."

This is limited to examples of functions that write data to disk to prevent writing junk to the build machine or userspace. In these cases I wrap it in \donttest. On CRAN for similar functions I've used \dontrun for this, but BiocCheck says to prefer \donttest, is that advice still appropriate in this situation?

I also noticed in a vignette "Tidying motif metadata":

As of this writing in May 2020, the FBgn entries are out of date

This issue is actually upstream of memes and comes from the FlyFactor Survey database itself. I added a new paragraph explaining this issue in more detail and fleshing out the reasoning behind using this example. In short, the purpose of the vignette is to give users practical examples on how to use the universalmotif_df format to explore and clean messy data. I lean on this messiness to give good real-world examples for how to evaluate motif annotations and tidy their metadata.

It looks like there are various "TODO"s around. Let me know when you think I should go over this again. I went back through and double checked all my TODO's and ensured I wasn't leaving anything undone or broken. In particular I fixed the ones in the vignettes that were being rendered out now that the latest universalmotif version is up on the devel branch.

I went through and cleaned those user-facing TODOs up by fixing whatever it was I needed to do. There are no more user-facing TODO's. The remaining TODO flags mark things that may need attention in the future if I make changes to that area of the code, but should be stable under current conditions. In other words, these are mostly notes about where to add future features (like to support additional tools from the MEME Suite) than unfinished bits.

bioc-issue-bot commented 3 years ago

Your package has been accepted. It will be added to the Bioconductor nightly builds.

Thank you for contributing to Bioconductor!

vjcitn commented 3 years ago

The master branch of your GitHub repository has been added to Bioconductor's git repository.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/snystrom.keys is not empty), then no further steps are required. Otherwise, do the following:

  1. Add an SSH key to your github account
  2. Submit your SSH key to Bioconductor

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("memes"). The package 'landing page' will be created at

https://bioconductor.org/packages/memes

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.