Bioconductor / Contributions

Contribute Packages to Bioconductor
134 stars 33 forks source link

HiCExperiment #2879

Closed js2264 closed 1 year ago

js2264 commented 1 year ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

bioc-issue-bot commented 1 year ago

Hi @js2264

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: HiCExperiment
Title: Interface to Hi-C files in R
Version: 0.99.0
Date: 2022-11-10
Authors@R:
    c(person(given = "Jacques",
 family = "Serizay",
 role = c("aut", "cre"),
 email = "jacquesserizay@gmail.com",
 comment = c(ORCID = "0000-0002-4295-0624")))
Description: R interface to (m)cool files and other Hi-C processed file formats.
    `.(m)cool`, `.hic`, HiC-Pro generated files and `.pairs` files can be 
    directly parsed and subsetted to create HiCExperiment objects, 
    allowing a memory-efficient representation of Hi-C data in R. 
License: MIT + file LICENSE
URL: https://github.com/js2264/HiCExperiment
BugReports: https://github.com/js2264/HiCExperiment/issues
Depends:
    R (>= 4.2)
Imports:
    InteractionSet,
    GenomicInteractions,
    strawr, 
    GenomicRanges,
    IRanges,
    GenomeInfoDb,
    S4Vectors,
    BiocGenerics,
    BiocIO,
    methods,
    rhdf5,
    Matrix,
    vroom,
    tibble,
    tidyr,
    dplyr,
    glue,
    stringr
Suggests:
    HiContactsData,
    testthat (>= 3.0.0),
    BiocStyle,
    knitr,
    rmarkdown
biocViews:
    HiC,
    DNA3DStructure, 
    DataImport
Encoding: UTF-8
VignetteBuilder: knitr
LazyData: false
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
Config/testthat/edition: 3
Collate: 
    'AllGenerics.R'
    'PairsFile-class.R'
    'HiCExperiment-class.R'
    'ContactsFile-class.R'
    'ContactsFile-methods.R'
    'CoolFile-class.R'
    'CoolFile-methods.R'
    'HiCExperiment-methods.R'
    'HicFile-class.R'
    'HicFile-methods.R'
    'HicproFile-class.R'
    'HicproFile-methods.R'
    'PairsFile-methods.R'
    'checks.R'
    'data.R'
    'globals.R'
    'import-methods.R'
    'parse-cool.R'
    'parse-hic.R'
    'parse-hicpro.R'
    'parse-other.R'
    'parse-pairs.R'
    'utils.R'
js2264 commented 1 year ago

Hi @vjcitn ,

Just to let you know, this package is mostly a reformating of HiContacts package, published in Bioconductor since 3.16. HiContacts was originally reviewed by Marcel @LiNk-NY (issue here). Upon their suggestion, I splitted the package in two and created HiCExperiment package to define parsing methods and classes. I am happy for any Bioconductor member to review this package, of course.

FYI, there is also an upcoming HiCool package, which will be submitted if & when HiCExperiment is accepted. All of this is summarized in 2 figures in the HiCExperiment README (https://github.com/js2264/HiCExperiment).

Let me know if I can provide any further information regarding this package. Best, J

vjcitn commented 1 year ago

You don't have to delay submitting the HiCool package if it requires HiCExperiment ... there is a protocol for managing/reviewing related packages together.

js2264 commented 1 year ago

Fair enough, I'll provide HiCool as an AdditionalPackage when a reviewer is assigned then. Thanks for your suggestion.

bioc-issue-bot commented 1 year ago

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/HiCExperiment to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: c2a9f6d31ae8b18543603705bdf6aaf365177909

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/HiCExperiment to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 6323b69064dfd24f4ee6ed6c7360f6d71ee468bc

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/HiCExperiment to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

js2264 commented 1 year ago

Hi @DarioS, Thank you for agreeing to review my package. I have fixed all the errors/warnings. There is a single warning remaining, due to tests exceeding Bioc 10min limit. I'll work on reducing this length, but I think package is ready for review now. Let me know if you need any information, Best, J

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: d13958b66a593bfcd0ff33411e20a028270a1f4e

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/HiCExperiment to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 09647a33fb05514431e90328ad55c0125802b536

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/HiCExperiment to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: b8cd6c6762a15183571c99976578dea1dec58e65

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/HiCExperiment to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

js2264 commented 1 year ago

@DarioS ,

The only remaining warning is for merida1 (macOS machine), which consistently takes between 10 and 11 minutes to perform R CMD CHECK. Of note, nebbiolo1 (Linux) SPB runner takes under 10min, and the three OSes run <10min using GHA (including MacOS runner: https://github.com/js2264/HiCExperiment/actions/runs/3839418414/jobs/6537156295#step:2:5932 ). I guess the SPB AWS instance is somewhat weaker/struggling more than Github runners.

I would be happy to either (1) further reduce tests or (2) switch to long tests, in your opinion, is any of these options necessary? If so, let me know which one you prefer and I'll make required changes. Best, J

js2264 commented 1 year ago

AdditionalPackage: https://github.com/js2264/HiCool

bioc-issue-bot commented 1 year ago

Hi @js2264, Thanks for submitting your additional package: https://github.com/js2264/HiCool. We are taking a quick look at it and you will hear back from us soon.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: ac9c95235cf976421bdcb0e0db5a6c24893442d8

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/HiCExperiment to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

js2264 commented 1 year ago

AdditionalPackage: https://github.com/js2264/fourDNData

bioc-issue-bot commented 1 year ago

Hi @js2264, Thanks for submitting your additional package: https://github.com/js2264/fourDNData. We are taking a quick look at it and you will hear back from us soon.

js2264 commented 1 year ago

Hi @DarioS, It's been two weeks since the review has started; just wanted to know where were you in the reviewing process, and if I could provide any help! Thanks, J

DarioS commented 1 year ago

In Australia, we just returned to work this week from summer holidays, so the formal review of HiCExperiment will begin in the next couple of days. I enquired about the time-out on merida1 with the Bioconductor core team on Monday and one of them replied today to just igore that issue. It's interesting how it takes much longer than the GitHub Actions computer does.

DarioS commented 1 year ago

HiCExperiment and its companion packages represent an impressive framework for storing and analysing chromatin interaction data. It has good interoperability with existing S4 classes such as InteractionSet and ContactMatrix. Each S4 class has a show method defined to print it nicely to the R console. Only a few minor issues are identified.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: d5f55fd1df854cadac62e09eb7b0afaa09d4e21f

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/HiCExperiment to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

js2264 commented 1 year ago

Hi @DarioS, Thank you for your review. Listed below are the commits I have made to answer your comments.

  • For consistency in style with other packages in Bioconductor ecosystem, please use camelCase variable naming. Please refer to Variable Names section of R Code chapter of the contributors guide.

I have amended the code to limit the use of snake/.case. All functions use camelCase, except check and is_, which follow a more standard R naming convention.

Related to this, please use variable names which are meaningful to people who might join the project in the future (e.g. You begin supervising a research student who also helps to maintain the package code). Variable names such as an1, re, xx and gi are not immediately undestandable for others. Please see self-documenting code for advice.

I have limited the use of non-self-explanatory functions as much as possible. gi (GInteractions), an (anchors), re (regions) are "shortcuts" that mimic the way objects from other classes (e.g. se for SummarizedExperiments, gr for GRanges, ...) are referred to in other packages' documentation. Let me know if you still would like this to be changed.

  • Some of the function parameters lack meaningful descriptions in the documentation. For example,

    file: file resolution: resolution bed: bed

    Please provide meaningful, full-sentence descriptions of parameters. For example, file could be described as "A character vector of length 1 specifying the file path to the HiC-Pro matrix file." It is ambiguous from the one-word description above if you want the user to provide an open file connection using base::file() or simply to specify the path to a file.

Done. Again, let me know if you think more details are necessary.

  • Some examples are very long and lack interpretability. Please break these down into sections of a few lines each separated by a newline and add a comment above or below each section to explain to the end-user what each section of commands does. For example, HiCExperiment.Rd's Examples section has:

    mcool_path <- HiContactsData::HiContactsData('yeast_wt', 'mcool')
    contacts_yeast <- HiCExperiment(mcool_path, resolution = 16000, focus = 'II')
    contacts_yeast
    resolutions(contacts_yeast)
    resolution(contacts_yeast)
    focus(contacts_yeast)
    scores(contacts_yeast)
    tail(scores(contacts_yeast, 1))
    tail(scores(contacts_yeast, 'balanced'))
    scores(contacts_yeast, 'test') <- runif(length(contacts_yeast))
    tail(scores(contacts_yeast, 'test'))
    data(centros_yeast)
    topologicalFeatures(contacts_yeast, 'centromeres') <- centros_yeast
    topologicalFeatures(contacts_yeast, 1)
    topologicalFeatures(contacts_yeast, 'centromeres')
    pairsFile(contacts_yeast)
    fileName(contacts_yeast)
    interactions(contacts_yeast)
    length(contacts_yeast)
    contacts_yeast[seq_len(10)]
    seqinfo(contacts_yeast)
    bins(contacts_yeast)
    anchors(contacts_yeast)
    regions(contacts_yeast)
    contacts_yeast
    as(contacts_yeast, 'GInteractions')
    as(contacts_yeast, 'ContactMatrix')
    as(contacts_yeast, 'matrix')[seq_len(10), seq_len(10)]
    as(contacts_yeast, 'data.frame')[seq_len(10), seq_len(10)]

    It might be best to pick a few important commands, rather than showing every possible one in this example which is an overwhelming amount of output to see all at once. Based on my exeperience in designing workshops, beginners tend to copy and paste everything. Break it down into chunks for them.

Thank you for the great suggestion. I have now separated this set of commands in organized subexamples, and it makes a lot more sense.

  • Vignette needs an introduction and more details for novices. Installation instructions are also missing. Please see Vignette Introduction and Installation sections of the contributor's guide.

    ... introduce the objective, models, unique functions, key points, etc that distinguish the package from other packages in the same area.

    For example, the vignette mentions that the package supports three different file types; Cool files, HiC files, and HiC-Pro files. What are the advantages and disadvantages of each file type? Do they all store the same data or do any of them store more details than the others? This would be valuable information for beginners. Also, the introduction mentions "... analytical and visualization tools to investigate contact maps ..." but analytical and visualization words are never mentioned in the vignette again. Does HiCExperiment provide that functionality or does one of the accessory packages, such as HiContacts, provide that? Perhaps this sentence should be changed to provide cross-links to your other packages to guide the end-user to the functionality which is perhaps no longer found in HiCExperiment.

I have added a Introduction and Installation section in the vignette, and removed the paragraph you mentioned. You were correct thinking that this bit (analytical and visualisation) has been migrated to HiContacts.

  • Some cross-links could be added to Rd files. For example, "The ContactsFile class describes a BiocFile object" could use [BiocIO::BiocFile] Roxygen macro to link to the HTML documentation of BiocFile for browsing in RStudio I.D.E.

Thanks for this suggestion. I have added some cross-references to HiCExperiment.R and *File.R doc sections, although I am not really familiar with them. Hopefully they are ok. Let me know if you wish me to add more of them.

DarioS commented 1 year ago

All issues have been satisfactorily dealt with.

bioc-issue-bot commented 1 year ago

Your package has been accepted. It will be added to the Bioconductor nightly builds.

Thank you for contributing to Bioconductor!

Reviewers for Bioconductor packages are volunteers from the Bioconductor community. If you are interested in becoming a Bioconductor package reviewer, please see Reviewers Expectations.

js2264 commented 1 year ago

Hi @DarioS, Thank you for your feedback and your review. Glad to hear that HiCExperiment is accepted. Does this also include the additional packages I submitted in this issue (HiCool and fourDNData)? Thanks in advance, J

lshep commented 1 year ago

I'll let @DarioS comment on if the other two packages were reviewed but from the admin stand point, they have not yet been added to git or had a bioconductor builder report so I added the necessary tags back for processing.

js2264 commented 1 year ago

Hi @lshep, thank your for your reply. Regarding the absence of builder report, I think this is because the 2 additional packages have not been assigned a reviewer yet, so builds were not triggered. Would there be anything to do (assigning a reviewer? Manually triggering builds?) regarding this? Thanks for your input, J

lshep commented 1 year ago

I will be processing packages later today and tomorrow; we will have a quick glance and then mark them for moderation.

vjcitn commented 1 year ago

In checking HiCool i hit

Loading required package: HiCExperiment
Error: package or namespace load failed for 'HiCool':
 object 'GenomicRanges' is not exported by 'namespace:GenomicRanges'
Execution halted

and indeed I see this in the NAMESPACE. Why?

js2264 commented 1 year ago

@vjcitn This is because I've messed up, it should be #' @importFrom GenomicRanges GRanges in R/loops.R. I fixed it in ceb5891.

DarioS commented 1 year ago

Ah, I did not review two associated packages. I thought the lack of build reports indicated they might each get their own issues.

lshep commented 1 year ago

We hold additionalpackages to have a glance to make sure like with incoming packages they look like packages, are in decent enough shape to have a review, and aren't doing anything malicious.

bioc-issue-bot commented 1 year ago

Additional Package has been approved for building.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

lshep commented 1 year ago

Running the two additional packages now for build reports.

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/HiCool to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Additional Package has been approved for building.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/fourDNData to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 571d2dbce94985799b49519b8c78017722f9e1d7

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/fourDNData to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 55ab37934201fe86518b2561a55ebe9f3732f3a1

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/fourDNData to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 0f9f3eb53ad1705af7cca934d10bfa951308fd96