nipalsMCIA - Githubissues

Muunraker commented 1 year ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Repository: https://github.com/Muunraker/nipalsMCIA

Confirm the following by editing each check box to '[x]'

[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
[x] I am familiar with the Bioconductor code of conduct and agree to abide by it.

I am familiar with the essential aspects of Bioconductor software management, including:

[x] The 'devel' branch for new packages and features.
[x] The stable 'release' branch, made available every six months, for bug fixes.
[x] Bioconductor version control using Git (optionally via GitHub).

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

bioc-issue-bot commented 1 year ago

Hi @Muunraker

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: nipalsMCIA
Version: 0.99.0
Date: 2022-06-04
Title: Multiple Co-Inertia Analysis via the NIPALS Method
Authors@R: c(
    person("Maximilian", "Mattessich", email = "maximilian.mattessich@northwestern.edu", role = "cre",comment = c(ORCID = "0000-0002-1233-1240")),
    person("Anna", "Konstorum", email = "konstorum.anna@gmail.com", role = "aut", comment = c(ORCID = "0000-0003-4024-2058")),
    person("Joaquin", "Reyna", email = "joreyna@live.com", role = "aut", comment = c(ORCID = "0000-0002-8468-2840")),
    person("Edel", "Aron", email = "edel.aron@yale.edu", role = "aut", comment = c(ORCID = "0000-0002-8683-4772")))
Description: Computes Multiple Co-Inertia Analysis (MCIA), a dimensionality reduction (jDR) algorithm, for a multi-block dataset using a modification to the Nonlinear Iterative Partial Least Squares method (NIPALS) proposed in (Hanafi et. al, 2010).  Allows multiple options for row- and table-level preprocessing, and speeds up computation of variance explained.  Vignettes detail application to bulk- and single cell- multi-omics studies.
License: GPL-3
RoxygenNote: 7.2.3
Depends:
    R (>= 3.5.0)
Imports:
    ComplexHeatmap,
    dplyr,
    fgsea,
    ggplot2,
    graphics,
    grid,
    methods,
    pracma,
    rlang,
    RSpectra,
    scales,
    stats
Url: https://github.com/Muunraker/nipalsMCIA
BugReports: https://github.com/Muunraker/nipalsMCIA 
Encoding: UTF-8
Suggests:
    circlize,
    ggpubr,
    KernSmooth, 
    knitr,
    piggyback,
    reshape2,
    rmarkdown,
    rpart,
    stringr,
    survival,
    Seurat,
    tidyverse,
    spatstat.explore
VignetteBuilder: knitr
biocViews: Software, Clustering, Classification, MultipleComparison, Normalization, Preprocessing, SingleCell

vjcitn commented 1 year ago

Can MultiAssayExperiment class be used to organize your data, instead of a list?

Muunraker commented 1 year ago

Hi Vince, I think we could convert our decomposition function to take a MultiAssayExperiment as an input. However it might involve substantially modifying our vignette code and some downstream functions. Is this something we need to add right away or could we potentially include it in a subsequent version?

lgatto commented 1 year ago

Just quickly chiming in here... I would be interested in using/testing this package on data we use in the lab using, as pointed out by @vjcitn, MultiAssayExperiment objects (or derivatives thereof). Happy to help out if needed.

vjcitn commented 1 year ago

@Muunraker I'd suggest you try it. Making a MultiAssayExperiment should not be hard. Extracting the relevant lists likewise. It is there to help you keep things organized and systematic. Get back to me in a few days with some reactions and we will take it from there.

Muunraker commented 1 year ago

@vjcitn we are close to finishing the MultiAssayExperiment implementation and expect to push an update to the master branch in the next day or two. Do we need to make a new submission or do anything else after we push the update?

vjcitn commented 1 year ago

be sure to bump version number

Muunraker commented 1 year ago

@lgatto any help testing would be greatly appreciated, thank you! We currently have a version with MultiAssayExperiment compatibility on our code development branch, although we are still working through some bugs. We hope to fix these in the next day or two and then it should be ready for testing with MAE objects.

lshep commented 1 year ago

I haven't processed this to be on our git.bioconductor.org system yet. I can wait until you have it on your github branch and then proceed; no need for a new submission. It will then be assigned a reviewer for a more thorough review; from that point on you would need to also push the changes to our git.bioconductor.org system with a valid version bump. Please tag me when the changes are pushed

bioc-issue-bot commented 1 year ago

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/nipalsMCIA to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 9e5785f03ee8112de4743b408805c326cb280c3a

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/nipalsMCIA to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

Muunraker commented 1 year ago

Hi @vjcitn @lgatto, we have been working through the errors with R CMD build in our most recent version. We are having some difficulty understanding the "undefined exports" error in MacOS on the most recent build - specifically that these all seem to be functions from MultiAssayExperiment that are not listed as exports in our NAMESPACE file.

Do you have any insight into what may have caused this? Happy to ask on the dev mailing list if this isn't the place.

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/nipalsMCIA to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/nipalsMCIA to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/nipalsMCIA to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

lshep commented 1 year ago

We apologize for the build reports. We are investigating the issue.

eba28 commented 1 year ago

Great, thanks for letting us know!

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/nipalsMCIA to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

lshep commented 1 year ago

The namespace/build error has been resolved. We apologize for the confusion and inconvenience. Please work on correcting the other ERROR's in the check phase for a more thorough review to occur. Cheers

akonstodata commented 1 year ago

Thank you very much for looking into this. We're working now to address the other errors and will push the changes shortly.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: b1d9f3bfb3806b77e2e5c42bead630026f95347f

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/nipalsMCIA to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

akonstodata commented 1 year ago

Dear @lgatto, @lshep, we have pushed a new update which has addressed the errors on the earlier builds. Thanks in advance for taking a closer look!

lgatto commented 1 year ago

I'll get back to you today.

lgatto commented 1 year ago

Please find my review below - apologies for the delay.

DESCRIPTION file

[ ] Could you update R version dependency from 3.5.0 to 4.2.0.

The NAMESPACE file

[ ] You use snake_case for your function names except for CCpreproc(). Any specific reason? Also, generally, functions start with lowercase while uppercase is reserved for classes and their constructor. Case also applied to NIPALS_iter().

Package data

[ ] Why are you using a list for the data in data/ rather than a MAE? This is especially confusing as the data.frame elements have the samples along the rows, rather than the transposed form, that is common in Bioconductor objects?

Package data

I see in vignette 2 that you have external data and use the piggyback package. I am not familiar with the package - it certainly looks interesting, but why don't you make use of the Bioconductor data infrastructure for your package? You do refer to the TENxPBMCData package later.
Also, in vignette 2, your have a code chunk that uses pb_list() without mentioning what this function does, where it comes from, and the chunk that loads the packages (including piggyback) isn't included, and hence it is difficult for the user to follow and impossible to run the code.
Have you considered using BiocFileCache to cache the downloaded data, and of create a dedicated data package?

Documentation

[ ] We recomment that vignette use the BiocStyle package to keep a consisten formatting among Bioconductor packages.
[ ] References to the methods used as well as to other similar or related projects and packages is also expected. I can think for example of the mixOmics package and MOFA2 packages, but there are certainly others.
[ ] The vignette needs an Installation section.
[ ] Could you pleas add a package man page, so that as a new user, I can get started with ?nipalsMCIA. That page could briefly point me to the main function(s) and vignettes of your package.

Vignette 1

Rephrasing "downstream analyses that [help can] interpret the MCIA decomposition."
In the third paragraph, you mention the nipals_multiblock() fonction saying it also supports MAE, but you haven't introduced it. Is this the main function of the package? Could you explain that is does what you explain in the second paragraph, and them move on with the multi-block data types you support.
The names of the mcia_results and their explanations don't match - for example, you mention preproc_method or metadata in the text, but these aren't among the names.
When running the GSEA analyses, could you set message=FALSE to avoid having the progress bars clutter your vignette?

Vignette 2

Please make sure your vignettes don't generate any data other than in a temp directory, which looks like it's the case with path_data <- file.path("..", "data").
See the point above regarding data source for this vignette.

Unit tests

[ ] Your unit tests cover just over 25% or your code base, with more files not covered at all. Would it be possible to extend that coverage?

R code

[ ] I have touched on this above, but what is your rational not to focus on MAE? Using a list, there's no guarantee that the dimensions of the blocks are correct, or that the metadata matches the data. My suggestion would be to aim for MAE and document or provide a helper function to create an MAE from a list of blocks.
[ ] The output of nipals_multiblock() is a list of 9 elements, with some elements being lists themselves. Have you considered defining a class (for example NipalsMultibloc) that holds these together and displays a useful summary?
[ ] Having a class would also address another issue, that is that the same function is used to perform the computation, and to do the visualisation. That means that I would have to repeat the analysis to produce the figures. We typically want to separate these two operations. You could thus have nipals_multiblock() that generates an object of class NipalsMultibloc with all the results, and you could then use plot() (and others) on that object, along the lines of what you already do for projection_plot().

Misc

tests/testthat/helper.R is empty
What is the vignette in extra-tests?
In you roxygen headers, you use \itemize and \item, but you could also use the standard markdown - if you prefer.

akonstodata commented 1 year ago

Thank you @lgatto for the comprehensive review! We're working through it now, and will push updates/respond shortly.

lgatto commented 1 year ago

Don't hesitate if anything isn't clear and/or if you would like some help with anything.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: ea78dc4afadf29cd65dd1dee8cbbcf61e09db51c

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/nipalsMCIA to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: d4e67bc7154b5b1f35d9189f8b95f351291f0341

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/nipalsMCIA to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

akonstodata commented 1 year ago

@lgatto, @lshep, we have pushed a new version that addresses the review comments by @lgatto, but are experiencing a build error that we do not see on our local machines. Would you mind taking a look? We'll communicate our point-by-point response in this thread as well after we address this error. Thank you in advance for your help.

lshep commented 1 year ago

This is on our side and announced on the bioc-devel mailing list https://stat.ethz.ch/pipermail/bioc-devel/2023-June/019724.html

akonstodata commented 1 year ago

@lshep thank you for your quick reply and for the information. I'm now subscribed to the mailing list, so should be receiving updates on this and other issues going forward.

akonstodata commented 1 year ago

@lgatto, we wanted to thank you for the comprehensive review! Please see our point-by-point response below.

DESCRIPTION file

[x] Could you update R version dependency from 3.5.0 to 4.2.0.

We have updated our R dependency to 4.2.0.

The NAMESPACE file

[x] You use snake_case for your function names except for CCpreproc(). Any specific reason? Also, generally, functions start with lowercase while uppercase is reserved for classes and their constructor. Case also applied to NIPALS_iter().

We have renamed all the functions and files that were using snake_case (get_TV to get_tv, NIPALS_iter to nipals_iter and CCpreproc to cc_preproc) and updated all of the corresponding documentation and unit tests. The only remaining non-snake_case file is metadata_NCI60.R since NCI60 is the proper name of the dataset.

Package data

[ ] Why are you using a list for the data in data/ rather than a MAE? This is especially confusing as the data.frame elements have the samples along the rows, rather than the transposed form, that is common in Bioconductor objects?

We would like to make MCIA readily available for users that may or may not be familiar with MAE objects. Ideally, users who do not regularly store their data in MAE objects should be able to use the package without additional overhead. One goal of our algorithm (which we discuss in a forthcoming manuscript) is to help bridge use of MCIA by bioinformaticians and its development by applied mathematicians. Since the latter group may comprise users with less familiarity with bioinformatics tools, we want to ensure a relatively low barrier of entry for use of the software for all groups.

The data.frame elements list samples along the rows since that is the appropriate input orientation for the underlying mathematical framework that the nipals_multiblock() algorithm is built on, but since this may generate confusion, we have added a parameter to the nipals_multiblock() function that specifies whether the samples are in the rows or the columns (row_var = 'sample' is default, but can be changed to row_var = 'feature'), thus allowing users to choose their preferred orientation.

Package data

I see in vignette 2 that you have external data and use the piggyback package. I am not familiar with the package - it certainly looks interesting, but why don't you make use of the Bioconductor data infrastructure for your package? You do refer to the TENxPBMCData package later.

Our aim in Vignette 2 is to have it serve as start-to-finish example for users working with single cell data (in addition to showing how nipalsMCIA can be run on that kind of data). As such, we decided it would be valuable to demonstrate how to download the standard output from 10x Genomics and read it in for analysis purposes. That way, users who have their own raw data can follow along with the example. If they have their data in an SCE object already, we show how to work with it by using the TENxPBMCData package, which also comes with the bonus of demonstrating how to convert from a SCE object to a Seurat object. Using piggyback allows users to make a variety of different data objects available that can be examined and loaded in as needed depending on which stage of the process they are (as shown in the workflow at the top of the vignette).

Also, in vignette 2, your have a code chunk that uses pb_list() without mentioning what this function does, where it comes from, and the chunk that loads the packages (including piggyback) isn't included, and hence it is difficult for the user to follow and impossible to run the code.

We describe the pb_list() function in the comment above that code chunk. We’ve added piggyback:: to all of the calls that use functions from that package and made the chunk that shows the loaded packages visible when rendered across all of the vignettes.

Have you considered using BiocFileCache to cache the downloaded data, and of create a dedicated data package?

We have considered using BiocFileCache, but would prefer to leave the decision to the users whether to have a cache or not. The data we include is just below 1GB in size and we don’t anticipate that it will change for the purposes of the vignette or that users will be using it directly (we expect its main usage to be limited to the vignette). Using BiocFileCache would add an extra layer of complexity in addition to piggyback, which downloads the files to a temporary directory (which can be made permanent if the user wishes) directly without having to interact with a Cache object, and syncs with each release.

We do not feel that it is appropriate in our case to create a dedicated data package since the data we use in our vignettes is not our data, and there are already several existing Bioconductor packages that make public single cell data (including ours) available. As we’ve discussed above, we chose our mode of data access primarily to demonstrate a single-cell analysis pipeline for users of nipalsMCIA.

Documentation

[x] We recomment that vignette use the BiocStyle package to keep a consisten formatting among Bioconductor packages.

We have updated the vignettes using the BiocStyle package instead of rmdformats, with some modifications made to how they were being rendered.

[x] References to the methods used as well as to other similar or related projects and packages is also expected. I can think for example of the mixOmics package and MOFA2 packages, but there are certainly others.

We have added a Motivation subsection to our Introduction in Vignette 1. It includes a brief description of the methodology, provides motivation for the new package, and references two Bioconductor packages that also perform MCIA. We chose not to reference additional packages available in Bioconductor since they are too numerous to list, but we added a reference (Cantini, 2021) that performs a benchmarking analysis of several of these methods.

[x] The vignette needs an Installation section.

[x] Could you pleas add a package man page, so that as a new user, I can get started with ?nipalsMCIA. That page could briefly point me to the main function(s) and vignettes of your package.

We have added Installation subsections to all of the vignettes (within the Introduction sections). These sections include how to currently install the package off of GitHub as well as the installation command for if our package is accepted to Bioconductor. There is also now a man page which can be accessed with ?nipalsMCIA.

Vignette 1

Rephrasing "downstream analyses that [help can] interpret the MCIA decomposition."

Done!

In the third paragraph, you mention the nipals_multiblock() fonction saying it also supports MAE, but you haven't introduced it. Is this the main function of the package? Could you explain that is does what you explain in the second paragraph, and them move on with the multi-block data types you support.

Indeed, the nipals_multiblock() function is the main function of the package. We have provided a short description of it in the Overview subsection of the Introduction, prior to discussing its support for MAE.

The names of the mcia_results and their explanations don't match - for example, you mention preproc_method or metadata in the text, but these aren't among the names.

We apologize for the confusion. We had renamed the input argument preproc_method to col_preproc_method without renaming the associated output in nipals_multiblock() (this has now been fixed). The metadata field is also now present in the output independent of whether metadata is provided in the input. The names and explanations now fully match.

When running the GSEA analyses, could you set message=FALSE to avoid having the progress bars clutter your vignette?

Done! We applied the parameter results = ’hide’ which still shows the code but not the progress bars.

Vignette 2

Please make sure your vignettes don't generate any data other than in a temp directory, which looks like it's the case with path_data <- file.path("..", "data").

See the point above regarding data source for this vignette.

The vignette does not actually use path_data to generate or save any other data. That is provided as an example of how a user could save files locally, and all code using it is commented out.

Unit tests

[x] Your unit tests cover just over 25% or your code base, with more files not covered at all. Would it be possible to extend that coverage?

We have added significantly more tests, with our coverage now reaching ~81%.

R code

[ ] I have touched on this above, but what is your rational not to focus on MAE? Using a list, there's no guarantee that the dimensions of the blocks are correct, or that the metadata matches the data. My suggestion would be to aim for MAE and document or provide a helper function to create an MAE from a list of blocks.

Please see our response in the Package data section above. We have an internal check in nipals_multiblock() that the sample annotation for each block is identical, and matches that of the metadata, if it is input. An error is returned if this is found to not be the case.

[x] The output of nipals_multiblock() is a list of 9 elements, with some elements being lists themselves. Have you considered defining a class (for example NipalsMultibloc) that holds these together and displays a useful summary?

[x] Having a class would also address another issue, that is that the same function is used to perform the computation, and to do the visualisation. That means that I would have to repeat the analysis to produce the figures. We typically want to separate these two operations. You could thus have nipals_multiblock() that generates an object of class NipalsMultibloc with all the results, and you could then use plot() (and others) on that object, along the lines of what you already do for projection_plot().

We have considered defining a class for the object and decided at present that it would not significantly improve the utility of the downstream functions and application of the nipals_multiblock() output. Currently, defining a class would require us to rewrite much of our base code, which we want to consider seriously before making this update. The nipals_multiblock() function provides a summary figure for a quick visualization of the results, but all plotting functions (for example, projection_plot(), global_scores_heatmap(), vis_load_plot(), ...) take in a nipals_multiblock object; thus one need only run nipals_multiblock() once. nipals_multiblock() calls two of the aforementioned functions for the summary plots (which can be turned off using the plots = none option). The use of all plotting functions is detailed in Vignette 1.

Misc

tests/testthat/helper.R is empty

We originally created a helper.R file based on documentation that suggested that it should be used for storing common code used across testing files, but when we tried it we found that the tests only properly executed by using a setup.R file instead. The helper.R file has now been removed.

What is the vignette in extra-tests?

It has now been removed.

In you roxygen headers, you use \itemize and \item, but you could also use the standard markdown - if you prefer.

Good to know, thank you!

Muunraker commented 1 year ago

@lshep I noticed we still haven’t gotten a re-build - is the bug with biocCheck fixed? If so, should I push again with a bumped version number? Apologies if I missed any updates on bioc-devel!

lshep commented 1 year ago

Yes the BiocCheck issue is fixed. If you push again with a version bump it should trigger the build

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 08e11e47d9ecc2b5de2d57c03ab72d245615ffe5

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/nipalsMCIA to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

akonstodata commented 1 year ago

@lgatto, we are curious if you have availability to look at the revised version of our software in the next few weeks? We appreciate your help!

lgatto commented 1 year ago

Now that we are in the R-devel cycle, could you update the R version dependency from 4.2.0 to 4.3.0.

MultiAssayExperiment

Regarding the use of MAE, it is a central principle for the Bioconductor project to make use of its core classes; in this particular case, the natural data structure would be to use a MAE of SummarizedExperiment instances, or a least of matrices of dimensions features by samples. Your approach of a list of data.frames using feature along the rows violates what a typical Bioconductor user would expect. I understand that other communities might be used to other conventions, but a Bioconductor package should follow the project's guidelines.

Here's how you would get an annotated MAE of matrices:

mae1 <- MultiAssayExperiment(
   lapply(data_blocks, function(x) t(as.matrix(x))),
   colData = metadata_NCI60)

Or better an annotated MAE of SEs:

mae2 <- MultiAssayExperiment(
   lapply(data_blocks, function(x) SummarizedExperiment(t(as.matrix(x)))),
   colData = metadata_NCI60)

The latter allows you to extract any assay with its annotation using getWithColData(mae2, 1),

If I want to use a subset of samples (for example drop CNS cancer type) and/or use a subset of the assays:

> mae2[, mae2$cancerType != "CNS", 1:2]
harmonizing input:
  removing 15 sampleMap rows not in names(experiments)
A MultiAssayExperiment object of 2 listed
 experiments with user-defined names and respective classes.
 Containing an ExperimentList class object of length 2:
 [1] mrna: SummarizedExperiment with 12895 rows and 15 columns
 [2] miRNA: SummarizedExperiment with 537 rows and 15 columns
Functionality:
 experiments() - obtain the ExperimentList instance
 colData() - the primary/phenotype DataFrame
 sampleMap() - the sample coordination DataFrame
 `$`, `[`, `[[` - extract colData columns, subset, or experiment
 *Format() - convert into a long or wide DataFrame
 assays() - convert ExperimentList to a SimpleList of matrices
 exportClass() - save data to flat files
Warning message:
'experiments' dropped; see 'metadata'

You can always get a list of experiments (matrices or SEs) with

experiments(mae1)
experiments(mae2)

Internally, you would then extract the individual assays with experiments() as shown above and work as you do now, transposing if that's the shape you need. The default for a Bioconductor package should really be nipals_multiblock(..., row_var = 'feature').

One important feature of using MAE, that is also core to Bioconductor, is to store the data and metadata together, in a single object. The validity of both then comes for free.

Finally, you could even use more specialised classes within your MAE, such as SingleCellExperiment for single-cell data. This would allow your users to benefit from all the existing tools for downstream analyses.

Package data

Your points about having a start-to-finish vignette are great, and would still be valid using dedicated Bioconductor infrastructure.

The recommended Bioconductor way to transparently handle data would be to use BiocFileCache, but this points isn't as central as using apropriate classes - piggyback (a nice package, by the way) will do.

R code

Regarding using a class for to store and manage the nipals_multiblock() output, you write:

at present that it would not significantly improve the utility of the downstream functions

What you fail to see here is that you gain much by doing it now: if not now, you will impose a major API change to your users. I won't insist on a class, even though as a user, I would definitely prefer it.

akonstodata commented 1 year ago

Thank you very much for your review and comments @lgatto. We will update the software based on your recommendations. We may be a bit delayed as a few team members are on break - my aim is to push a new version by mid-late August. I will reach out if this timeline changes.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 193af0308eb378e9ba66d5d77017fd2330fffb01

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System: macOS 12.6.5 Monterey: nipalsMCIA_0.99.6.tar.gz Linux (Ubuntu 22.04.2 LTS): nipalsMCIA_0.99.6.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/nipalsMCIA to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

akonstodata commented 1 year ago

Dear @lgatto, we have just pushed a new version with the following updates:

The input to nipals_multiblock() is now required to be an object of the MAE class.
- For users with 'simple' data lists where sample names are matched (as we had in our original list), we provide a function simple_MAE() that can transform the list into an object of the MAE class similar to what you showed in the example, and can take in input of data lists of either orientation.
- If users have more complex data types, we refer them to the MAE documentation to generate MAE objects for use with nipalsMCIA.
- In Vignette 1, we provide more background details on MAE objects, as well as show examples on how to create them using the simple_MAE() function with and without associated sample metadata.
- In Vignette 2, we provide an example of where an MAE object holding experiments of the SingleCellExperiment class is used as input to nipals_multiblock().
nipals_multiblock() now returns an object of the NipalsResult class. We have refactored the functions that use as input objects from nipals_multiblock() to accept objects of this class. We provide details of the class outputs in Vignette 1.
We have updated our R version dependency to 4.3.0.

Thank you as always for your helpful comments and review of nipalsMCIA.

lgatto commented 1 year ago

Thank you very much for this, @akonstodata and for contributing your package to Bioconductor. I'm happy to accept it.

bioc-issue-bot commented 1 year ago

Your package has been accepted. It will be added to the Bioconductor nightly builds.

Thank you for contributing to Bioconductor!

Reviewers for Bioconductor packages are volunteers from the Bioconductor community. If you are interested in becoming a Bioconductor package reviewer, please see Reviewers Expectations.

akonstodata commented 1 year ago

Thank you @lgatto! We are very happy to hear that our package will be a part of the Bioconductor suite. We want to thank you again for your helpful feedback and review.

Bioconductor / Contributions

nipalsMCIA #2914

DESCRIPTION file

The NAMESPACE file

Package data

Package data

Documentation

Vignette 1

Vignette 2

Unit tests

R code

Misc

DESCRIPTION file

The NAMESPACE file

Package data

Package data

Documentation

Vignette 1

Vignette 2

Unit tests

R code

Misc

MultiAssayExperiment

Package data

R code