Bioconductor / Contributions

Contribute Packages to Bioconductor
131 stars 33 forks source link

epiregulon #3271

Closed TomVuod closed 2 months ago

TomVuod commented 5 months ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

bioc-issue-bot commented 5 months ago

Hi @TomVuod

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: epiregulon
Title: Gene regulatory network inference from single cell epigenomic data
Version: 0.99.0
Date: 2023-11-21
Authors@R: 
    c(person(given = "Xiaosai", family = "Yao", role = c("aut", "cre"), email = "yao.xiaosai@gene.com", comment = c(ORCID = "0000-0001-9729-0726")),
    person(given = "Tomasz", family = "Włodarczyk", role = c("aut"), email = "tomwlo@gmail.com", comment = c(ORCID = "0000-0003-1554-9699")),
    person(given = "Aaron", family = "Lun", role=c("aut"), email="infinite.monkeys.with.keyboards@gmail.com"),
    person(given = "Shang-Yang", family = "Chen", role = "aut", email = "sychen9584@gmail.com"))
Description: Gene regulatory networks model the underlying gene regulation hierarchies that drive gene expression and observed phenotypes. Epiregulon infers TF activity in single cells by constructing a gene regulatory network (regulons). This is achieved through integration of scATAC-seq and scRNA-seq data and incorporation of public bulk TF ChIP-seq data. Links between regulatory elements and their target genes are established by computing correlations between chromatin accessibility and gene expressions.
License: file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
Imports:
    BiocFileCache,
    BiocParallel,
    Matrix,
    Rcpp,
    S4Vectors,
    SummarizedExperiment,
    bluster,
    checkmate,
    entropy,
    lifecycle,
    methods,
    scran,
    scuttle,
    stats,
    utils,
    scMultiome,
    GenomeInfoDb,
    GenomicRanges,
    AUCell,
    BSgenome.Hsapiens.UCSC.hg19,
    BSgenome.Hsapiens.UCSC.hg38,
    BSgenome.Mmusculus.UCSC.mm10,
    motifmatchr
Depends: 
    R (>= 4.2.0),
    SingleCellExperiment
Suggests: 
    knitr,
    rmarkdown,
    parallel,
    BiocStyle,
    testthat (>= 3.0.0),
    coin,
    scater
LinkingTo: Rcpp
VignetteBuilder: knitr
URL: https://github.com/xiaosaiyao/epiregulon/
biocViews: GeneExpression, Transcription, ChipOnChip, DifferentialExpression, GeneTarget, Normalization, GraphAndNetwork
Config/testthat/edition: 3
BugReports: https://github.com/xiaosaiyao/epiregulon/issues
vjcitn commented 5 months ago

I had a quick look. I think it would be helpful to readers/reviewers if the vignette referenced the online book https://xiaosaiyao.github.io/epiregulon.book/ . The vignette introduction is too terse and does not explain terms used.

xiaosaiyao commented 5 months ago

Thank you @vjcitn. We've now added the reference to the book in the vignette

lshep commented 5 months ago

We normally don't allow data to be hosted on an untrusted source like github, dropbox, etc. Can the motif data be submitted to a more stable location like zenodo?

xiaosaiyao commented 5 months ago

Hi @lshep chromVARmotifs is a highly cited and widely used package. https://github.com/GreenleafLab/chromVARmotifs

We prefer to download directly from the source. I hope that's acceptable.

lshep commented 5 months ago

The chromVARmotifs has not been updated in 7 years, has several open issues with no response, and there is no set of action to perform any sort of systematic checking (like R CMD build, R CMD check, etc.) which is why Bioconductor does not allow github only packages to be listed as dependencies and only allows packages on CRAN or Bioconductor to be a dependency. The same is true for data pulled from those sources; there is no clear indication of maintenance or keeping data up-to-date. chromVARmotifs should submit their package to CRAN or Bioconductor; or at least add their data to a public curation site like zenodo, microsoft data lakes, or public S3 bucket.

TomVuod commented 4 months ago

Hello @Ishep, epiregulon in no longer dependent on chromVARmotifs repository. The data sets have been included in scMultiome Bioc package and are retrieved from AnnotationHub (commit 355c2f9). We are now waiting for the release of rhdf5 package v2.47.4 since the bug in the current version causes error when vignette is being built (https://github.com/grimbough/rhdf5/issues/139).

lshep commented 4 months ago

It looks like the builders missed the Sat build. I would expect the new version of rhdf5 to be available later today, after today's builds and propagation, once that happens I'll re-evaluate the package for the next stage of review.

lshep commented 4 months ago

You still have a library call to chromVARmotifs in the documentation. Please remove. Since you now call AnnotationHub/ExperimentHub you will likely need to list them in your DESCRIPTION file appropriately in Suggests

vjcitn commented 4 months ago

Can you use an OSI-approved license for this package? https://opensource.org/licenses/

Fiscal sponsorship through NumFocus obligates us to support software packages with open source licensing.

xiaosaiyao commented 4 months ago

@vjcitn I just consulted with our legal team. We would prefer to stick to our existing license which gives open access to all academic centers. The only restriction we impose is for commercial entities. I hope our license is compliant and will not block our submission to BioC.

TomVuod commented 4 months ago

Hello @Ishep , the documentation and DESCRIPTION has been corrected.

vjcitn commented 4 months ago

https://contributions.bioconductor.org/license.html is very clear that restrictive licensing (e.g., academic only) is not possible. The project fiscal sponsorship has required us to avoid restrictive licensing. I think that if you cannot use open licensing we probably shouldn't pursue this submission, really sorry about that.

xiaosaiyao commented 3 months ago

@vjcitn regretfully we can't change the LICENCE (at least for now) and have to withdraw this submission. FYI @TomVuod

vjcitn commented 3 months ago

I am sorry about this. I will have further discussions with NumFocus to see if there is a way forward.

xiaosaiyao commented 3 months ago

@vjcitn we have received the green light to change the license. It is now MIT.

FYI @TomVuod

vjcitn commented 3 months ago

That's great, we'll start the review process soon.

vjcitn commented 3 months ago

I've updated my source archive. A few questions to be resolved before review begins.

"A long format dataframe, representing the inferred regulons, is then generated. The dataframe consists of three columns:

tf (transcription factor) target gene peak to gene correlation between tf and target gene"

but the code output does not produce such a data frame?

regulon <- getRegulon(p2g = p2g, overlap = overlap, aggregate = FALSE)
regulon
#> DataFrame with 746875 rows and 10 columns
#>          idxATAC         chr     start       end    idxRNA      target
#>        <integer> <character> <integer> <integer> <integer> <character>
#> 1             37        chr1   1020509   1021009        22  AL645608.4
#> 2             37        chr1   1020509   1021009        22  AL645608.4
#> 3             37        chr1   1020509   1021009        22  AL645608.4

image

xiaosaiyao commented 3 months ago

@vjcitn Thank you for reviewing the vignette. I apologize for not updating some of the text earlier. The vignette should be cleaned up. I provided further details on the package in the introduction. Let me know should you require more changes.

bioc-issue-bot commented 3 months ago

Your package has been added to git.bioconductor.org to continue the pre-review process. A build report will be posted shortly. Please fix any ERROR and WARNING in the build report before a reviewer is assigned or provide a justification on why you feel the ERROR or WARNING should be granted an exception.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. All changes should be pushed to git.bioconductor.org moving forward. It is required to push a version bump to git.bioconductor.org to trigger a new build report.

Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): epiregulon_0.99.6.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/epiregulon to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 months ago

A reviewer has been assigned to your package for an indepth review. Please respond accordingly to any further comments from the reviewer.

xiaosaiyao commented 3 months ago

@lshep do I have the permission to push to upstream? or only @TomVuod has?

lshep commented 3 months ago

The BiocCredentials account is now wrong and corrupt for your two account. We assume the submitter is the maintainer and match the submitter github userid with the maintainer email in the DESCRIPTION. So now yao.xiaosai@gene.com is associated with TomVuod and not xiaosaiyao . I will delete and reset the account and send activation instructions. Do you both need push access?

xiaosaiyao commented 3 months ago

@lshep It will be great to have both. Thank you!

lshep commented 3 months ago

Please both look for an email entitled BiocCredentials account

xiaosaiyao commented 3 months ago

It works now. Thank you! @lshep

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 60eea5bc572011faf6006a985e20376f315fc5db

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): epiregulon_0.99.7.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/epiregulon to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

TomVuod commented 3 months ago

AdditionalPackage: https://github.com/xiaosaiyao/epiregulon.extra

bioc-issue-bot commented 3 months ago

Hi @TomVuod, Thanks for submitting your additional package: https://github.com/xiaosaiyao/epiregulon.extra. We are taking a quick look at it and you will hear back from us soon.

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 6021873a694218aa6259bbab65844e92511459fe

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): epiregulon_0.99.8.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/epiregulon to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 813e6cb01a620e2bc599c1456d8c32a43a1b583b

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): epiregulon_0.99.9.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/epiregulon to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

TomVuod commented 3 months ago

Hello @lshep can I also have the push access to the epiregulon.extra which has just been submitted as a related package? FYI @xiaosaiyao

lshep commented 3 months ago

Once we moderate it yes. We will have to take a quick look and add it to git. I will let you know when it is active

jianhong commented 3 months ago

Package 'epiregulon' Review

Thank you for submitting your package to Bioconductor. The package passed check and build. However there are several things need to be fixed. Please try to answer the comments line by line when you are ready for a second review. Code: Note: please consider; Important: must be addressed.

The DESCRIPTION file

The NAMESPACE file

R code

bioc-issue-bot commented 2 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: de1bd130ac267271f80b5625167aaf79637843f1

TomVuod commented 2 months ago

Hello @jianhong . Thanks for your review. Here is our response.

Important: R version is not clear in DESCRIPTION.

Added.

Important: Selective imports using importFrom instead of import all with import.

Corrected.

Important: Confusion version numbers in NEWS.md file. You can use commit version for the version before submit to Bioconductor to avoid that.

Confusing versions were replaced with git commit hashes.

NOTE: :: is not suggested in source code unless you can make sure all the packages are imported.

Thanks for this suggestion, we will consider using it in the future R package development. Our intention was to increase code readability and avoid conflicts between name spaces. We will try to remember about vulnerabilities you pointed out during package changes.

Vectorize: for loops present, try to replace them by *apply funcitons.

Some of the for loops were replaced with lapply function. In cases when it is still used we create a matrix with appropriate size to gather the results. Thus, vectorization is not supposed to improve performance significantly. We also kept for in places where interaction is performed over the cluster since only several clusters will be defined in a typical use scenario.

Important: Remove unused code.

Done.

Important: Please consider to add drop=FALSE to avoid the reduction of dimension for matrices and arrays. Ignore this if using datatable.

We have made relevant changes expect for the lines in which slice is assigned to another array since that operation guarantees that the dimensionality of the object on the left-hand side of assignment is preserved like in the following toy example: matr_1[i,] <- matr_2[i,]

NOTE: Functional programming: code repetition.

We wrote new functions to encapsulate the repeated code blocks.

Note: Installation section in Vignette should use BiocManager. Use eval=FALSE to avoid executation. Remove the comment character for that part. Done

NOTE: Makevars and Makefile not within a package when CPP version may be lower than 0.11.0

I’m not sure if that is something that needs to be fixed.

bioc-issue-bot commented 2 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 641463e0843910401d830f6282daf5d4f05c7ee8

bioc-issue-bot commented 2 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: ERROR before build products produced.

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/epiregulon to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 2 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 763b865ff83c71b417904168270b5d930602999a

bioc-issue-bot commented 2 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: ERROR before build products produced.

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/epiregulon to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

jianhong commented 2 months ago

@TomVuod Could you please fix the error? Thank you.

xiaosaiyao commented 2 months ago

@jianhong the error is due to a recent update to scMultiome, the accompanying data package for epiregulon. We added new entries such that the query result is no longer unique. We have already pushed the changes to scMultiome but since the data package only gets built on Tuesdays and Thursdays, we have to wait till the changes to take effects.

https://bioconductor.org/checkResults/3.19/data-experiment-LATEST/scMultiome/

The checks on our local copy are otherwise okay.

TomVuod commented 2 months ago

Hello @lshep , I see that scMultiome has successfully been rebuild yersterday (https://bioconductor.org/checkResults/3.19/data-experiment-LATEST/scMultiome/), but the new version is still not available through BiocManager. Can you update our package?

TomVuod commented 2 months ago

OK - I see that the new version is available now.

bioc-issue-bot commented 2 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 6ba3d3f1903f23623f51de5d2978f3bbdfdc6d91

bioc-issue-bot commented 2 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): epiregulon_0.99.16.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/epiregulon to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.