Bioconductor / Contributions

Contribute Packages to Bioconductor
135 stars 33 forks source link

mCSEAdata #552

Closed jordimartorell closed 6 years ago

jordimartorell commented 7 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 7 years ago

Hi @jordimartorell

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: mCSEAdata
Type: Package
Title: Data package for mCSEA package
Version: 0.99.0
Author: Jordi Martorell Marugán
Maintainer: Jordi Martorell Marugán <jmartorellm@gmail.com>
Description: Data objects necessary to some mCSEA package functions.
    There are also example data objects to illustrate mCSEA package functionality.
Depends: R (>= 3.4.1)
Suggests: BiocStyle, knitr, rmarkdown
VignetteBuilder: knitr
biocViews: Homo_sapiens_Data, MethylationArrayData, MicroarrayData, ExperimentData
License: GPL-2
Encoding: UTF-8
LazyData: true
bioc-issue-bot commented 7 years ago

Your package has been approved for building. Your package is now submitted to our queue.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/mCSEAdata_buildreport_20171113062920.html

lshep commented 7 years ago

I'm rerunning the build check so there is a report using R 3.5 and Bioc 3.7 (the builders were updated today); I should have a review within the next few days. Cheers

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/mCSEAdata_buildreport_20171114104501.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

0bf8ba6 Update DESCRIPTION

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/mCSEAdata_buildreport_20171116064652.html

jordimartorell commented 7 years ago

AdditionalPackage: https://github.com/jordimartorell/mCSEA

bioc-issue-bot commented 7 years ago

Hi @jordimartorell,

Starting build on additional package https://github.com/jordimartorell/mCSEA.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your additional package repository will NOT trigger a new build.

The DESCRIPTION file of this additional package is:

Package: mCSEA
Type: Package
Title: Methylated CpGs Set Enrichment Analysis
Version: 0.99.0
Author: Jordi Martorell Marugán
Maintainer: Jordi Martorell Marugán <jmartorellm@gmail.com>
Description: Identification of diferentially methylated regions (DMRs) in predefined regions
(promoters, CpG islands...) from the human genome using Illumina's 450K or EPIC microarray data.
Provides methods to rank CpG probes based on linear models and includes plotting functions.
Depends: R (>= 3.5), mCSEAdata
Suggests: BiocStyle, knitr, rmarkdown
Imports: limma, fgsea, Gviz, ggplot2, minfi, grDevices, stats, GenomicRanges
VignetteBuilder: knitr
biocViews: DifferentialMethylation, GeneExpression, Microarray,
    MethylationArray, Genetics,
    GenomeAnnotation, DNAMethylation, TwoChannel,
    MultipleComparison
License: GPL-2
Encoding: UTF-8
LazyData: true
bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "TIMEOUT". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/mCSEA_buildreport_20171116083504.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

305cd2f Update DESCRIPTION

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "TIMEOUT". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/mCSEA_buildreport_20171116095026.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

c892f5e Update DESCRIPTION

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/mCSEA_buildreport_20171116111105.html

lshep commented 6 years ago

Thank you for your submission to Bioconductor. Please see some comments below

Data Package Could you please be a little more specific on how the data was constructed from the annotation data?

Software Package

Vignette:

Man: mCSEA-package and mCSEATest use myResults[["promoters"]] over myResults$promoters

R Code: rankProbes:

mCSEATest:

mCSEAPlot:

Please clarify or address the above issues and comment back when you have updates and are ready for a second review.

Thank you for your submission to Bioconductor.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

ab7070f Update DESCRIPTION

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/mCSEAdata_buildreport_20171123104013.html

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

3c45f1a New push

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/mCSEA_buildreport_20171123110315.html

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

865c05d Update DESCRIPTION

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/mCSEA_buildreport_20171124071243.html

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

157798a Update DESCRIPTION

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/mCSEA_buildreport_20171124074201.html

jordimartorell commented 6 years ago

Hi Ishep, thanks for your really helpful comments. I modified mCSEA following your suggestions. Here I'll talk about each of your points:

Data Package Could you please be a little more specific on how the data was constructed from the annotation data?

I extended the explanation in the vignettes (Sources section). I hope it is more clear now.

Software Package

Can this be integrated with COHCAP?

Yes, you can get a beta-values matrix with COHCAP.annotate() function and use this object as input of rankProbes() function (data parameter). However, users should modify the COHCAP output to fit the rankProbes() format requirements. Users could also use COHCAP.site() output instead to get some metric (e.g. delta beta, pvalue...) and use it to directly run mCSEATest function skipping rankProbes() step. Users should modify COHCAP output too to satisfy the mCSEATest() input format. Regarding to the analysis itself, COHCAP and mCSEA use very different approaches, so I think that both analysis are complementary, but integrating COHCAP analysis in mCSEA would be quite confusing for the users.

Please add unit tests

I added unit tests for rankProbes() and mCSEATest() functions. I don't test mCSEAPlot() and mCSEAPlotGSEA functions because they return a graphical output.

Input and Output questions: betaTest is a matrix of probes x samples - can this also be a summarizedExperiment? custom association object between CpG probes and regions - could this be a GeneSet object?

Now, rankProbes() function accepts a SummarizedExperiment object. Regarding custom association object, I think that the use of a GeneSet (or GeneSetCollection) object would be very confusing due to the gene sets should be the genes itselves, and the genes should be the CpG probes. In addition, to my knowledge there are no methods to easily prepare such particular GeneSet object, so users should make an extra effort to prepare this object. I think that it's easier for users to prepare a list with the required format for the custom association. However, if you think that I should implement GeneSet compatibility despite what I said, I have no problem with trying it.

Vignette:

We don't encourage direct use of $ for example myResults$promoters[,-7] could be writted as myResults[["promoters"]][,-7]
Describe what is in each output object of list? what is stored in myResults[["promoters"]] and myResults[["promoters_associations"]]
mCSEAPlot takes a long time! Reevaluate and try to speed up or at least have some output or progress bar that it is calculating.

I fixed the use of $ and I better explained the contents of the mCSEATest output objects. mCSEAPlot is slow the first time is executed due to it connects to online servers in order to get genomic information. I improved one of the steps to use cached data, saving some time. Due to it is still slow, I added a progress bar too.

Man: mCSEA-package and mCSEATest use myResults[["promoters"]] over myResults$promoters

Done.

R Code: rankProbes:

Are the outpus of minfi and ChAMP matricies? It seems like minfi has output in a particular class - your function should work with this output.

Althought minfi uses some special classes (e.g. MethylSet), it includes functions to get a matrix of beta-values (getBeta()) or M-values (getM()). ChAMP output class depends on the type of analysis performed. For instance, champ.norm() function returns a matrix, while champ.load() returns a list of results, and one of them is a beta-values matrix. So rankProbes() function is totally compatible with minfi and ChAMP outputs as long as a matrix with the methylation values is obtained.

mCSEATest:

why can't users input their own assocPromoters/assocGenes/assocCGI? what is the purpose of using your data?

Users can use their own association objects, that's the purpose of the "custom" parameter. I included assocPromoters/assocGenes/assocCGI because these are typical regions to search for DMRs, so users can use these prebuilt associations, saving them the creation of the association object. Of course, users can choose not using these objects (regionsTypes = NULL).

Don't repeat code - you essentially do the same chunk of code based on what is defined in region or customAnnotation - this code should be a separate function and pass in appropriate argument for calculation

I implemented that code in the internal function .performGSEA()

why is there a set.seed?

fgsea() function performs permutations, so the randomness cause that the calculated P-values are not exactly the same each time the function is run. I added a set.seed() to avoid that behaviour.

you said users can use their own with model of assocGenes450k should also be able to work with GeneSets

I talked about GeneSets objects in a previous comment.

mCSEAPlot:

need some status message or something that it is working

I added a progress bar.

Don't repeat code - if you use [[ instead of $ don't need if/else/if

Fixed.

lshep commented 6 years ago

Sorry for the delayed response; I will try to be better about responding more timely. Thank you for your updates. A few more minor points below:

vignette:

Code:

rankProbes.R / documentation

mCSEATest.R / documentation

mCSEAPlot.R / documentation

mCSEAPlotGSEA.R / documentation

Please address these issues and I will take another look. Cheers,

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

b28e8bc Version bump

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

a26bdae Version bump

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/mCSEAdata_buildreport_20171219105718.html

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/mCSEAdata_buildreport_20171219105750.html

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

1883ef4 Update DESCRIPTION

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/mCSEAdata_buildreport_20171219111842.html

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

10f9fcd Version bump

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/mCSEA_buildreport_20171219112841.html

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

8bef882 Version bump

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/mCSEA_buildreport_20171220115047.html

jordimartorell commented 6 years ago

Hi Ishep. Thank you again for your helpful comments. Here I adress each of the points:

vignette: show what to convert from minfi or ChAMP to use as examples since you mention them at the top (essentially what you explained to me here would be fine)

Added.

still takes a long time (assumingly slow the first time is executed due to it connects to online servers in order to get genomic information) where progress bar is not active- maybe make note of this in the vignette that it could take some time to generate so users are aware

Done.

Code: What is sysdata.rda in R directory? Seems like this should be deleted or at the very least moved to a data directory and documented. (then ::: call to it in code? - if in data directory change to load)

Sorry for this, I read somewhere that internal data which it's not nececessary to be accesible for the users should be stored in sysdata.rda and undocumented. This file only contains the bandTable object, which is a downloaded file with chromosomic annotation to avoid the downloading of such file each time mCSEAPlot() is executed. I moved this object to data folder, I documented it properly and I changed the way mCSEAPlot() loads it. I tried to load it with load() function, but several errors occured during the building and checking, so I finally use this object with mCSEA::bandTable instance. I hope this is a correct way to do this.

rankProbes.R / documentation extra 1 typo right at start of file.

Fixed.

should be library(mCSEA) not mCSEAdata in documentation

I tried to change this, but I obtained the following warning in BiocCheck: WARNING: The following files call library or require on mCSEA. This is not necessary. man/rankProbes.Rd. I decided to remove this line.

this seems like a quick calculation - why have the dontrun in this documentation ?

It is indeed a quick calculation, but I thought to use the precomputedmCSEA object to save some seconds during the building and checking. I removed the dontrun.

mCSEATest.R / documentation I would add to regionTypes documentation: NULL to skip this step and use customAnnotation.

Added.

really a set a seed should never be used inside function code - if you would like reproducible in example or vignettes call it outside of function and explain why.

I removed the set.seed() of the code and included outside the function in functions documentations and vignettes. I explained the purpose of including a set.seed() in vignettes.

mCSEAPlot.R / documentation as mentioned plotting is still slow - perhaps add the progress,2 in between the minfi::RatioSet and minfi::getAnnotation, as these steps appear to take awhile

Added.

what is the bandTable mCSEA:::bandTable? Is that something that was downloaded and saved (seems like this is the sysdata.R) - moved to data directory and load

I explained this in a previous point. bandTable object has been moved to data directory, documented and loaded properly.

mCSEAPlotGSEA.R / documentation use [[""]] rather than '$'

Fixed.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

1064a7f New update

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

jordimartorell commented 6 years ago

Hi again, Ishep, sorry for double-posting. I included a new function and some changes:

Of course, manuals, vignettes and unit tests have been updated according to these changes.

Thanks.

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "TIMEOUT". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

lshep commented 6 years ago

Hi There,

Please see the Notes in the build report especially concerning best coding practice and adjust the code accordingly.

While this is the first time you have received a TIMEOUT, it is somewhat problematic. Generally the linux (malbec2) times are used for the requirement - It seems with recent changes even on the linux machine the build time exceed 7 min and the check time exceeds 6 min. In the package guidelines the max for checking is 5 min. Could you please reevaluate the code to try and make it more efficient so that the check time of 5 mins is meet.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

c38c452 Update

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

79457f4 Update

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "TIMEOUT". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

71571da exprTest, and annotation objects added

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

401bf59 Improve efficiency

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

jordimartorell commented 6 years ago

Hi. I realized that the problem was that downloading the arrays annotation data is much more slow in the Bioconductor servers that in my computer, so I stored these annotations in mCSEAdata package (annot450K and annotEPIC). Build and check times are significantly lower now.

Thanks.