Hi @na396
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Package: SGCP
Type: Package
Title: SGCP: A semi-supervised pipeline for gene clustering using self-training approach in gene co-expression networks
Version: 0.99.0
Authors@R: c(person("Niloofar", "AghaieAbiane", email = "niloofar.abiane@gmail.com" ,role = c("aut", "cre")),
person("Ioannis", "Koutis", email = " ikoutis@njit.edu",role = c("aut")))
Description: SGC is a semi-supervised pipeline for gene clustering in gene co-expression networks.
SGC consists of multiple novel steps that enable the computation of highly enriched modules
in an unsupervised manner. But unlike all existing frameworks, it further incorporates a
novel step that leverages Gene Ontology information in a semi-supervised clustering method
that further improves the quality of the computed modules.
License: GPL-3
Encoding: UTF-8
LazyData: true
Imports: ggplot2, expm, caret, plyr, dplyr, GO.db, annotate, SummarizedExperiment,
genefilter, GOstats, RColorBrewer, xtable, Rgraphviz, reshape2, openxlsx,
ggridges, DescTools, org.Hs.eg.db, methods, grDevices, stats
Suggests: knitr
Depends: R (>= 4.2.0)
biocViews: GeneExpression, GeneSetEnrichment, NetworkEnrichment, SystemsBiology,
Classification, Clustering, DimensionReduction, GraphAndNetwork,
NeuralNetwork, Network, mRNAMicroarray, RNASeq, Visualization
VignetteBuilder: knitr
NeedsCompilation: no
URL: https://github.com/na396/SGC
Date/Publication: 2022-10-06
RoxygenNote: 7.2.1
A reviewer has been assigned to your package. Learn what to expect during the review process.
IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.
Bioconductor utilized your GitHub SSH keys for git.bioconductor.org access. To manage keys and future access you may want to activate your Bioconductor Git Credentials Account.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "TIMEOUT, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/SGCP
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Greetings @jianhong @lshep Thank you for the comment. The timeout happens in the "creating vignettes" step, because my package in general takes hours or even days to complete. This is the nature of my package. The example I provided in the vignettes is the smallest dataset I could use as an example for my package.
Here is the way I wrote the vignettes. I provided a small dataset in the vignettes and then tried to explain how to use the functions in my package with that dataset. So during this process, the "creating vignettes" step may take up to 3 hours to complete. Is there any solution for this scenario? Thank you so much.
Tagging: @vjcitn / @hpages for additional thoughts and comments. In general, packages cannot take that long to build on our builders. Packages need to be able to be built daily by our daily builder with a smaller example dataset. Perhaps storing intermediate data objects to load at various steps, while making the more in-depth tests long tests, might be an option. The other option would be to convert it into a workflow package, but I believe the timeout limit for a workflow package is 2 hours. @hpages would appreciate input as well.
@lshep I checked my code one more time; it takes about 1 hour to run. Can you tell me what your recommendation is? Thank you so much, and I appreciate your help in advance.
You should have code and "pre-cooked" data that allow the package to build and check in under (20?) minutes. That's good for you and for us -- you can get a meaningful result in 20 minutes -- you will know if something has gone wrong with your use of the ecosystem almost interactively. Then accompany this with a workflow package that can consume an hour of build time but is run infrequently. It would have more realistic computations.
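A minimal sketch of the "pre-cooked" data idea, assuming a slow step whose result can be stored once and reloaded when the vignette is built. The names `slow_step`, `cached`, and the cache file are hypothetical and are not part of SGCP.

```r
# Hypothetical stand-in for an expensive pipeline step (not an SGCP function).
slow_step <- function(x) {
    Sys.sleep(0.1)  # pretend this takes minutes
    x %*% t(x)
}

# Load a pre-computed result when it exists; otherwise compute and store it.
cached <- function(path, compute) {
    if (file.exists(path)) {
        return(readRDS(path))
    }
    result <- compute()
    saveRDS(result, path, compress = "xz")
    result
}

set.seed(1)
expr <- matrix(rnorm(20), nrow = 4)
cache_file <- file.path(tempdir(), "slow_step_result.rds")
unlink(cache_file)  # start from a clean state for this demonstration

# The first call computes and stores; the second is served from disk,
# so its compute function is never evaluated.
first <- cached(cache_file, function() slow_step(expr))
second <- cached(cache_file, function() stop("not run"))
```

In a package, the stored object would more likely ship with the package (for example under `inst/extdata`, located with `system.file()`), so the vignette only loads it and the full computation stays in a separately run workflow.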
@vjcitn Thank you so much for your comment. I appreciate it a lot. The excess time is due to the nature of the algorithm inside the package, not the data. Please see https://arxiv.org/abs/2209.10545. In this package I need to call another library 11 times in my algorithm, and each call takes up to 7-8 minutes regardless of the input size. So from my side, there is no way I could change the algorithm. Is there any solution you recommend?
I can't provide detailed information at this time. Perhaps this will have to wait for inclusion in a future release of Bioconductor. Do the best you can.
@vjcitn Thank you so much. I do appreciate your help. I was wondering if you know the estimated time of the next Bioconductor release? Or can I change the package into a workflow?
Greetings @vjcitn @lshep I have changed the package, and now it takes roughly 13 minutes to run. However, it takes more space, in total less than 5 MB, as I need to store some results. All rda files are compressed, and on my local computer I did not get any errors or warnings. I pushed it to "git@git.bioconductor.org/SGCP.git". Please let me know if it's fine or whether I need to do anything. Many thanks in advance for your consideration.
Hi @lshep I was wondering if you have seen my previous message?
You would probably want to store the results on the experiment hub to get the package down to a reasonable size. Also then users would only need to store/download the data when they were interested in running your examples rather than all the time.
@lshep Thank you for the message. I have a quick question. When I was looking at the Bioconductor guidance, I noticed that my package size, which is 3.12 MB, is acceptable for Bioconductor. So my question is: do I still need to use ExperimentHub? I also have one more question: is there anything I need to do for further steps? Will my package be evaluated for Bioconductor? Thank you so much for your time and consideration.
You need to get the package to not TIMEOUT. Please push any changes to see how the package runs on the system. I suggested ExperimentHub; looking back, I misread your comment. I thought you said that in order to get the package to run you were over the 5 MB limit. So no, ExperimentHub is not necessary.
@lshep The timeout problem is resolved, and I have pushed the changes. I think everything is ready.
Please push changes to git.bioconductor.org with a version bump. You need to trigger a new build. See https://github.com/Bioconductor/Contributions/issues/2840#issuecomment-1280774435
Ok, will do soon, thanks
@lshep Sorry to keep asking questions. I just checked my package and noticed that the package directory size is 3.2 MB, while its installed size is 7.1 MB. Do I need to use ExperimentHub? Thank you in advance.
Received a valid push on git.bioconductor.org; starting a build for commit id: 147671fb991e5446858eb113742a0ea1cd693dc5
@lshep Many, many thanks. The space and time issues are resolved. I have bumped the version and pushed the changes. Everything is ready now; please let me know if I need to take any further steps. Thank you so much.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Received a valid push on git.bioconductor.org; starting a build for commit id: fb737d0753cd7414625e488eda30d7a5e03e07b7
@lshep Pushed another. Thanks
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Received a valid push on git.bioconductor.org; starting a build for commit id: e0f0bd7edeb102c860d3485843c48945817df63d
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
@lshep Hi, Do I need to do anything at this stage?
Please wait for the reviewer to do an in-depth review of the package. This normally occurs within 2-3 weeks of a clean build report.
Thank you for submitting your package to Bioconductor. The package passed check and build. It is in pretty good shape. However, there are several things that need to be fixed. Please try to answer the comments line by line when you are ready for a second review.
Code: Note: please consider; Important: must be addressed.
Use `importFrom` instead of importing everything with `import`.
NOTE: No direct slot access with `@` or `slot()` - accessors implemented and used. Please consult the help for `HyperGResult-accessors`.
Important: No `paste` in `message()`, `message`, `stop`.
NOTE: `::` is not suggested in source code unless you can make sure all the packages are imported. Some people think it is better to keep `::`. However, please note that you need to manually double check the import items when you make any change in the DESCRIPTION file during development. My recommendation is to remove one or two repeats to force the dependency check.
NOTE: Vectorize: `for` loops present, try to replace them by `*apply` functions.
Important: Please consider adding `drop=FALSE` to avoid the reduction of dimension for matrices and arrays.
NOTE: Functional programming: code repetition in
`clustering` and `cvConductance`
`clustering` and `ezSGCP`
`clustering` and `sigClusGO` and `cvConductance`
`DOM` and `TOM`
`ezSGCP` and `geneOntology`
`ezSGCP` and `semiLabeling`
`ezSGCP` and `semiSupervised`
`GeneOfGOTerm` and `GOenrichment`
`geneOntology` and `GOenrichment`
In `df2mat`, you already removed the colnames and rownames, but you remove them again at lines 158-159.
Note: Vignette includes motivation for submitting to Bioconductor as part of the abstract/intro of the main vignette.
@jianhong Thank you so much for the comments.
in line 18 import("org.Hs.eg.db") => I need to pass this object to a GOstats function
in line 19 import("ggplot2") => I have used many ggplot functions for visualization.
in line 20 import("expm") => I need to import the ^ operation for matrix powering.
in line 21 import("dplyr") => I have used plenty of dplyr functions for data frame related tasks.
in line 22 import("GO.db")
in line 23 import(annotate, except=c(toFile))
in line 24 import("genefilter")
in line 25 import("GOstats")
in line 26 import("RColorBrewer")
in line 27 import("xtable")
in line 28 import("Rgraphviz")
in line 29 import("reshape2") => fixed
in line 30 import("openxlsx") => fixed
in line 32 import("caret") => fixed
In general, SGCP depends heavily on the ggplot, dplyr, caret, and GOstats packages. "GO.db", "annotate", "RColorBrewer", and "genefilter" are dependencies of GOstats. When I installed GOstats myself, the dependencies were not installed. After multiple attempts, I installed the dependencies manually and then the GOstats package. This is the reason I imported these libraries. The remaining ones are fixed.
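For illustration, a hedged sketch of what selective imports could look like in the NAMESPACE file (or as the equivalent roxygen2 `@importFrom` tags). Which functions SGCP actually calls is not stated in this thread, so the imported names below are assumptions; the only one tied to the discussion is expm's matrix power operator `%^%`.

```r
# NAMESPACE fragment (assumed selection of functions, for illustration only)
importFrom(expm, "%^%")                      # matrix power operator
importFrom(ggplot2, ggplot, aes, geom_point) # only the plotting functions used
importFrom(RColorBrewer, brewer.pal)
```

With selective imports, a name clash such as the one handled by `import(annotate, except=c(toFile))` would no longer need an `except` clause, because only the listed names are brought in.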
NOTE: Consider adding the maintainer's ORCID iD in 'Authors@R' with 'comment=c(ORCID="...")' => Fixed
NOTE: Consider adding unit tests. We strongly encourage them. See https://contributions.bioconductor.org/tests.html => This package works with big data, and it is a pipeline of steps run on large datasets. Each step has many parameters that may lead to different solutions. Additionally, each step may take hours to run, which violates the Bioconductor time limit. Moreover, no step has a deterministic solution; the pipeline involves randomness in each step.
NOTE: no direct slot access with @ or slot() - accessors implemented and used. Please ask help form HyperGResult-accessors => I'm not sure I understand this correctly. In "GO_Genes <- hg@goDag@nodeData@data", hg is an object returned by the hyperGTest function in the GOstats package, and at this stage SGCP tries to retrieve some information from this object. Please guide me if I need to change it.
Important: No paste in message(), message, stop => The first two are fixed. For caption_sym <- paste0(" output of ", stp, " , is not symmetric"), I use it in the next statement, which is stop(caption_sym). I used paste0 because this function is for error detection and is used in multiple stages; with paste0 I can build the message dynamically, so that stop tells me where the error happened.
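A minimal sketch of how the same dynamic message can be produced without `paste0`, relying on the fact that `stop()` pastes its arguments together with no separator. The helper name `check_symmetric` and the step label are hypothetical; only `stp` and the message wording come from the comment above.

```r
# Hypothetical helper; the name and the symmetry check are illustrative only.
check_symmetric <- function(m, stp) {
    if (!isSymmetric(m)) {
        # stop() concatenates its arguments, so no paste0() is needed.
        stop("output of ", stp, " is not symmetric", call. = FALSE)
    }
    invisible(TRUE)
}

asym <- matrix(1:4, nrow = 2)  # not symmetric: off-diagonal entries are 3 and 2

# Capture the error message instead of aborting, to show what the user sees.
msg <- tryCatch(
    check_symmetric(asym, "adjacency step"),
    error = function(e) conditionMessage(e)
)
```

The message still names the step where the check failed, so the diagnostic value of the original `paste0` version is kept.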
NOTE: :: is not suggested in source code unless you can make sure all the packages are imported. => Fixed
NOTE: Vectorize: for loops present, try to replace them by *apply functions. => The for loops do not have a regular pattern or structure; depending on the cluster size and shape, each iteration may be different. Inside each iteration, many steps are taken and none of them has a regular structure. Throughout this package, everything is vectorized except these three loops, for which I was not able to come up with a vectorized implementation.
Important: Remove unused code. => Fixed
NOTE: Avoid using '=' for assignment and use '<-' instead => Fixed.
Important: Please consider to add drop=FALSE to avoid the reduction of dimension for matrices and arrays. => At these stages, the pipeline actually needs to reduce the dimension. This is the goal of these steps.
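For readers weighing the two behaviours, a small base R illustration of what `drop=FALSE` changes when a single row is selected. The matrix and its dimnames are made up for the example.

```r
# Toy 2 x 3 matrix with made-up gene and sample names.
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("g1", "g2"), c("s1", "s2", "s3")))

one_row <- m["g1", ]             # default: result drops to a named vector
kept <- m["g1", , drop = FALSE]  # result stays a 1 x 3 matrix

is.null(dim(one_row))  # the vector has no dim attribute
dim(kept)              # the matrix keeps both dimensions
```

Dropping is fine when a vector is really what the next step expects; `drop=FALSE` matters when later code calls `nrow()`, `colnames()`, or matrix algebra on the result and a subset may happen to contain a single row or column.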
NOTE: Functional programming: code repetition. => Although these statements look like repetition, they are not the same. Each is performed for a different purpose and needs to be performed. Some of them also let me track down the code more easily if bugs are reported in future. Some of them are repeated in different functions because those functions can be used dependently or independently; therefore, some statements need to be checked in both, in case the functions are used independently. For instance, at the beginning of the two functions ezSGCP and geneOntology, it is checked that dir is in c("under", "over"), because ezSGCP is a wrapper of multiple functions including geneOntology, and geneOntology can also be applied independently. Therefore, at the beginning of each function I check that this condition holds. This actually helps me to better maintain the package.
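One possible way to keep the check in both entry points without repeating its logic is a small shared validator, sketched here with base R's `match.arg()`. The function names `check_dir`, `inner_step`, and `wrapper` are hypothetical stand-ins, not SGCP's functions; only the allowed values "under" and "over" come from the comment above.

```r
# Hypothetical shared validator: the allowed values live in one place.
check_dir <- function(dir) {
    match.arg(dir, choices = c("under", "over"))
}

# Stand-in for a function that can be called on its own.
inner_step <- function(dir) {
    dir <- check_dir(dir)
    paste("testing for", dir, "representation")
}

# Stand-in for a wrapper that also validates before doing other work.
wrapper <- function(dir) {
    dir <- check_dir(dir)
    inner_step(dir)
}

ok <- wrapper("over")
bad <- tryCatch(wrapper("sideways"), error = function(e) "rejected")
```

Both functions remain safe when used independently, and a change to the allowed values only has to be made in `check_dir`.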
Important: Please include Bioconductor installation instructions using BiocManager. => Fixed
Note: Vignette includes motivation for submitting to Bioconductor as part of the abstract/intro of the main vignette. => I'm not sure I understand this correctly. I have added the information about installing the package through BiocManager.
Important: Please include Bioconductor installation instructions using BiocManager. => fixed
I'm pushing the modification into the repository.
Received a valid push on git.bioconductor.org; starting a build for commit id: 4f49cf88e9e4165c6d5fdbe19b2f11ef4b7d9dc4
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Received a valid push on git.bioconductor.org; starting a build for commit id: 4b56cad8f0dd8556be1b6f30844f6c6b76969c60
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Is it possible to rewrite `GO_Genes <- hg@goDag@nodeData@data` as `GO_genes <- graph::nodeData(GOstats::goDag(hg))`?
Please move the `BiocManager::install` section back into your vignettes.
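To illustrate the general point behind the accessor rewrite suggested above, here is a toy S4 example contrasting direct slot access with an accessor function. The class `ToyResult` and the accessor `pvalues` are defined here purely for the sketch and are not the GOstats classes or API.

```r
library(methods)

# Toy S4 class with one slot (illustrative only).
setClass("ToyResult", slots = c(pvalues = "numeric"))

# Accessor: callers use pvalues(x) and never touch the slot directly.
setGeneric("pvalues", function(object) standardGeneric("pvalues"))
setMethod("pvalues", "ToyResult", function(object) object@pvalues)

res <- new("ToyResult", pvalues = c(0.01, 0.2))

direct <- res@pvalues        # depends on the internal slot layout
via_accessor <- pvalues(res) # depends only on the exported accessor
```

If the class author later renames or restructures the slot, code written against the accessor keeps working, which is the usual motivation for avoiding `@` on objects from other packages.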
Received a valid push on git.bioconductor.org; starting a build for commit id: 3a43d22b29a0c3f21b3e179f7913aacbee8b7af6
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
@jianhong Is it possible to rewrite GO_Genes <- hg@goDag@nodeData@data by GO_genes <- graph::nodeData(GOstats::goDag(hg)) => Done
Please move back the BiocManager::install section into your vignettes. => Done
I think there is a miscommunication about the `BiocManager::install` section. I mean please show the code `BiocManager::install('SGCP')` in your vignettes.
Received a valid push on git.bioconductor.org; starting a build for commit id: c84b43e7b909dfdc811fd14f539268a7eb88252a
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
@jianhong I added the installation to the vignettes, but this causes the following error.
OK, try
```{r, eval=FALSE}
library(BiocManager)
BiocManager::install(c('SGCP', 'SummarizedExperiment', 'org.Hs.eg.db'))
```
Received a valid push on git.bioconductor.org; starting a build for commit id: 365fce8e1f9dfae7b7d8365199553d538df4c61b
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Received a valid push on git.bioconductor.org; starting a build for commit id: 73b69c43e4440bfa5055b2c6c363077970cfdeac
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Adding the installation causes the following warning on macOS: WARNING: R CMD check exceeded 10 min requirement
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
Confirm the following by editing each check box to '[x]'
[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
[x] I am familiar with the Bioconductor code of conduct and agree to abide by it.
I am familiar with the essential aspects of Bioconductor software management, including:
For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.