Bioconductor / Contributions

Contribute Packages to Bioconductor
134 stars 33 forks source link

CellaRepertorium: data structures and statistical analysis for SC immune repertoires #1622

Closed amcdavid closed 3 years ago

amcdavid commented 4 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 4 years ago

Hi @amcdavid

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Type: Package
Package: CellaRepertorium
Title: Data structures, clustering and testing for single 
    cell immune receptor repertoires (scRNAseq RepSeq/AIRR-seq)
Version: 0.99.0
Authors@R: 
    c(person(given = "Andrew",
   family = "McDavid",
   role = c('aut', 'cre'),
   email = "Andrew_McDavid@urmc.rochester.edu"),
      person(given = "Yu",
   family = "Gu",
   role = "aut",
   email = "Yu_Gu@urmc.rochester.edu"),
      person(given = 'Erik',
      family = 'VonKaenel',
      role = 'aut'
      ),
      person(given = 'Thomas Lin',
   family = 'Pedersen', 
   role = 'ctb'))
Description: Methods to cluster and analyze high-throughput
    single cell immune cell repertoires, especially from the 10X Genomics
    VDJ solution. Contains an R interface to CD-HIT (Li and Godzik 2006).   
    Methods to visualize and analyze paired heavy-light chain data.
    Tests for specific expansion, as well as omnibus oligoclonality under
    hypergeometric models.
License: GPL-3
Depends: 
    R (>= 3.5.0)
Imports:
    dplyr,
    tibble,
    stringr,
    Biostrings,
    Rcpp,
    reshape2,
    methods,
    rlang,
    purrr,
    Matrix,
    S4Vectors,
    tidyr,
    forcats,
    progress,
    stats,
    utils
Suggests: 
    testthat,
    readr,
    knitr,
    rmarkdown,
    ggplot2,
    BiocStyle,
    ggdendro,
    broom,
    lme4,
    RColorBrewer,
    SingleCellExperiment,
    scater,
    broom.mixed
LinkingTo: 
    Rcpp
VignetteBuilder: 
    knitr
Encoding: UTF-8
LazyData: true
NeedsCompilation: yes
RoxygenNote: 7.1.1
URL: https://github.com/amcdavid/CellaRepertorium
BugReports: https://github.com/amcdavid/CellaRepertorium/issues
Roxygen: list(markdown = TRUE)
biocViews: RNASeq, Transcriptomics, SingleCell, TargetedResequencing, 
    Technology, ImmunoOncology, Clustering
bioc-issue-bot commented 3 years ago

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account

bioc-issue-bot commented 3 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CellaRepertorium to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 0dd155de8f484bb221e4dd4d6b83d49c29065932

bioc-issue-bot commented 3 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CellaRepertorium to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

amcdavid commented 3 years ago

Well that seemed to clean things up, aside from a mysterious error in the Mac binary build

FIXING LINKS FOR libdir/CellaRepertorium/libs//CellaRepertorium.so

install_name_tool -change "/usr/local/lib/libgcc_s.1.dylib" "/Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libgcc_s.1.dylib" "libdir/CellaRepertorium/libs//CellaRepertorium.so" ERROR: R installation problem: File /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libgfortran.3.dylib not found!

I don't have any fortran code. I can build and install fine on my Mojave Mac, so I'm at a loss about how to reproduce or debug this issue.

bioc-issue-bot commented 3 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CellaRepertorium to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

lshep commented 3 years ago

Thank you for your submission to Bioconductor. Please see initial review below:

build report

branch

README

DESCRIPTION

inst

docs

data-raw

data / man

vignette

cdr3_clustering

# This maybe should be a turned into a function 
# Other plots should be considered:

For what its worth, anything that can be turned into a function to make the end users experience easier/quicker is recommended.

mouse_tcell_qc

reperoire_and_expression

Please address the above issues/concerns. When ready be sure to bump the version to trigger a new build and comment back here on the updates. Cheers

bioc-issue-bot commented 3 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 33a826fdd29aecdfb006f209f5b93f3926f60556

amcdavid commented 3 years ago

Hi Lori,

Thanks for the detailed review. I have responded line-by-line (or just made the change if self-explanatory) below. Note that I have not yet tackled the vignette improvements or the comments about the Check NOTES/examples but agree with your comments -- I just didn't want to lose my progress so far if my browser crashes! I will edit this inline as I make the remaining changes and @ you when done.

Thank you for your submission to Bioconductor. Please see initial review below:

build report

  • [x] Please consider fixing the NOTE about objects/function binding
  • [x] Please justify not having runnable examples in the man pages identified in the build report
  • [x] We highly recommend having a NEWS file to keep track of updates/changes. It will be included in future release announcements of Bioconductor. An example release announcement can be found here. See ?news or news

branch

  • [x] We like only necessary files in the Bioconductor versions of packages. Could you seperate out the pkgdown files, docs directory, extra README to a different branch rather than the master branch. same for inst/examples this doesn't seem necessary for package function

RE: inst/examples factors out repeated code for an example used across multiple manual pages. The code in that directory is pulled in via the roxygen2 @example tag. I added a comment to it to make it clear to others its function, but I guess it could live under man/ and/or be added to .Rbuildignore if that's a problem to have it built in the package tarball.

README

  • [x] Should only be one
  • [x] Add Bioconductor installation instructions

DESCRIPTION

  • [x] Please remove LazyData: true. We have rarely found this useful and can actually slow installation of packages

inst

  • [x] What and how is the WORDLIST in inst utilized within the package?

This is used for spell checking with devtools::spell_check(), would argue it's part of the build-chain of the package, have added it to .Rbuildignore though.

  • [x] Have an inst/script that describes how the data in inst/extdata was created. It can be code, sudo-code, or text but should include reference/source information.

This info was in data-raw, have moved to inst/script and removed data-raw

docs

  • [x] Seems necessary for pkgdown. pkgdown necessary files should be a separate branch than the master branch and not included in the git branch that Bioconductor clones.

data-raw

  • [x] Package data should be in data or inst/extdata. What is this and how is it utilized? I think this belongs in an inst/script directory instead?

Ok, moved. FWIW, this is Hadley Wickham's convention and there's some nice tooling that supports it in devtools.

data / man

  • [x] Can you expand on the description and source information for ccdb_ex

This is just a pre-constructed form of contigs_qc as described in the format. Have added a crossref to data(contigs_qc)

vignette

  • [x] I recommend having at least one vignette that gives an overview of the package or the other vignettes. It is more intuitive for a naive user to do vignette("CellaRepertorium") to look for help or guidance.
  • [x] Include sessionInfo() at the end of each vignette.

_cdr3clustering

  • [x] It might be helpful to explain why you are filtering, mutating, etc the data in each step.
  • [x] Remove internal comments
# This maybe should be a turned into a function 
# Other plots should be considered:

For what its worth, anything that can be turned into a function to make the end users experience easier/quicker is recommended.

  • [x] Proceed with making the functions for the complex plots.

I have made some attempt to factor out some of the plotting into the package.

_mouse_tcellqc

  • [x] For setting up all_anno, do all the col_types have to be specified? This seems like a lot of set up arguments.

These can be copy-pasted from the messages that read_csv spits out to console. I have added an explanation at to why it might be necessary to specify the column types (logical vs character columns).

  • [x] Fix this
#> Warning: `cols` is now required when using unnest().
#> Please use `cols = c(anno)`
  • [x] If this is meant as an exploratory or come back to maybe remove from vignette
  Is there a way to evaluate a sensitivity/specificity in distinguishing cells from debris, or T cells from other cell types?
  • [x] Section 6 should not be visible to users. This seems like internal come back to questions/considerations.

_reperoire_andexpression

  • [ ] Will the barcodes be in the same location each time? Shouldn't there be an accessor method to retrieve useful data like this rather than the cumbersome manual subsetting.

If this is referring to the colData(sce)$alpha business, I'm open to suggestions. I will say that I did think about how to design this for some time, and this seemed like the least bad way to represent that fact that multiple cell views of the repertoire are useful. The other alternative I considered seriously was to subclass SingleCellExperiment into a SingleCellPlusRepertoire say, which contains SingleCellExperiment and has a ContigCellDB slot. Then I can provide methods/generics to my heart's content. However, to provide different views of the cell_tbl seemed to mean there'd be a mutable state to the ContigCellDB representing which view was active. Though there are some packages that use this semantic (eg, data.table), I have always found it un-R-like since most methods do not have sideeffects (aside from <- assignment). There might be some way to save a few keystrokes, but I really did give this some thought (and it has been tested in other settings besides this vignette).

  • [x] This warning generated seems fixable?
#> Warning: `cols` is now required when using unnest().
#> Please use `cols = c(extrabc)`
  • [x] Why do you need the "extra" barcodes generated?

This is to mimic the typical and realistic scenario where the cells in SingleCellExperiment object are a strict superset of the barcodes in the repertoire experiment. Explanation added.

  • [ ] Seems like all the plots are empty? Please correct. This occurs when running interactive as well.

All the plots in this vignette? I have not been able to replicate this in interactive or looking at the built vignettes on my machine. I have no idea why this would be occurring, given that I'm using scater functionality here. Let me know if this is still an issue.

Please address the above issues/concerns. When ready be sure to bump the version to trigger a new build and comment back here on the updates. Cheers

bioc-issue-bot commented 3 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CellaRepertorium to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 698d1fbe471847c76d5cb977b0383b3b5611c04e

bioc-issue-bot commented 3 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CellaRepertorium to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

lshep commented 3 years ago

Thank you for your progress so far and the progress update for changes. When the vignette is ready please message and I'll begin a re-review. Cheers!

bioc-issue-bot commented 3 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 68099a79fb92bbf395dab7ff4f4b357787f6ee94

bioc-issue-bot commented 3 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CellaRepertorium to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

lshep commented 3 years ago

This is a reminder the next Bioconductor release is October 28th. All packages must be accepted by October 21st to be included in that release else it will remain in Bioconductor devel until the next release (normally April).

lshep commented 3 years ago

Do you think the vignette changes will be ready for review before the deadline for this release?

amcdavid commented 3 years ago

Yes, they are mostly complete...hopefully have everything committed later this evening

bioc-issue-bot commented 3 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 13ca0fd5c74d864f3e9e366c418c836945c72df0

bioc-issue-bot commented 3 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CellaRepertorium to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

amcdavid commented 3 years ago

@lshep just a note that I have addressed (or at least mitigated ;) ) the comments you made.

lshep commented 3 years ago

In repertoire_and_expression vignette the plots are still showing up empty for me. I think it has to do with transparency given the warnings

#> transparency is not supported on this device: reported only once per page

transparency is only available on particular OS. I think if you add point_alpha = 1 it will remedy the issue

bioc-issue-bot commented 3 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 7614de86e2aa988261a3f26f71251ae8f70068d7

amcdavid commented 3 years ago

Ok, I disabled alpha. I guess the ideal fix would be to change the graphics device or output to something that supports transparency on all platforms.

bioc-issue-bot commented 3 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CellaRepertorium to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 years ago

Your package has been accepted. It will be added to the Bioconductor nightly builds.

Thank you for contributing to Bioconductor!

mtmorgan commented 3 years ago

The master branch of your GitHub repository has been added to Bioconductor's git repository.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/amcdavid.keys is not empty), then no further steps are required. Otherwise, do the following:

  1. Add an SSH key to your github account
  2. Submit your SSH key to Bioconductor

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("CellaRepertorium"). The package 'landing page' will be created at

https://bioconductor.org/packages/CellaRepertorium

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.