gabrielodom / pathwayPCA

integrative pathway analysis with modern PCA methodology and gene selection
https://gabrielodom.github.io/pathwayPCA/

BiocCheck To Do List #64

Closed gabrielodom closed 5 years ago

gabrielodom commented 5 years ago

As of 20190201, we have 4 errors, 6 warnings, and 8 notes. Full output: pathwayPCA_BiocCheck_out20190201.txt

gabrielodom commented 5 years ago

ERRORs

  1. ERROR: Invalid package Version, see http://www.bioconductor.org/developers/how-to/version-numbering/. DONE: 20190201
  2. ERROR: No biocViews terms found. See http://bioconductor.org/developers/how-to/biocViews/. DONE: 20190201
  3. ERROR: At least 80% of man pages documenting exported objects must have runnable examples. The following pages do not: aespca.Rd, AESPCA_pVals.Rd, CheckAssay.Rd, CheckPwyColl.Rd, CheckSampleIDs.Rd, ControlFDR.Rd, coxTrain_fun.Rd, CreateOmics.Rd, CreateOmicsPathway.Rd, ExtractAESPCs.Rd, get_set_OmicsPathway.Rd, get_set_OmicsRegCateg.Rd, get_set_OmicsSurv.Rd, getPathPCLs.Rd, glmTrain_fun.Rd, GumbelMixpValues.Rd, IntersectOmicsPwyCollct.Rd, JoinPhenoAssay.Rd, lars.lsa.Rd, LoadOntoPCs.Rd, mysvd.Rd, olsTrain_fun.Rd, OptimGumbelMixParams.Rd, pathway_tControl.Rd, pathway_tScores.Rd, PermTestCateg.Rd, PermTestReg.Rd, PermTestSurv.Rd, permuteSamps.Rd, read_gmt.Rd, SubsetPathwayData.Rd, superpc.st.Rd, superpc.train.Rd, SuperPCA_pVals.Rd, TabulatepValues.Rd, write_gmt.Rd
  4. ERROR: Maintainer must register at the support site; visit https://support.bioconductor.org/accounts/signup/. DONE: 20190201
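
For ERROR 3, here is a minimal sketch of what a runnable example can look like in the source, assuming the man pages are generated with roxygen2 (this thread does not say so). `scale_columns()` is a hypothetical helper for illustration, not a pathwayPCA function. The point is only that the `@examples` block runs on its own with small, self-made data, and that `@return` is non-empty.

```r
#' Center and scale the columns of a numeric matrix
#'
#' Hypothetical helper, used only to illustrate a runnable example.
#'
#' @param x A numeric matrix.
#' @return A matrix of the same dimensions whose columns have mean 0
#'   and standard deviation 1.
#' @examples
#' mat <- matrix(rnorm(20), nrow = 5)
#' scale_columns(mat)
#' @export
scale_columns <- function(x) {
    stopifnot(is.matrix(x), is.numeric(x))
    # Subtract each column mean, then divide by each column sd.
    centered <- sweep(x, 2, colMeans(x), "-")
    sweep(centered, 2, apply(centered, 2, stats::sd), "/")
}
```

Examples like this stay fast because they build their own tiny inputs instead of loading a full data set.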
gabrielodom commented 5 years ago

WARNINGs

  1. WARNING: Update R version dependency from 2.10 to 3.5. DONE: 20190201
  2. WARNING: The following files are over 5MB in size: .git/objects/3a/e09e0fad7ca03f82015f7e69b4ea3f6395d0b4; .git/objects/pack/pack-0d60131b0c63dbfa6cb8cfcedbd69dc3cbdcd849.pack; inst/extdata/KIRP_Surv_RNAseq_inner.RDS
  3. WARNING: Use FALSE instead of F (found in 1 files): R/superPC_model_CoxPH.R (line 109, column 34). DONE: 20190201
  4. WARNING: The following files call library or require on pathwayPCA. This is not necessary. R/aesPC_extract_OmicsPath_PCs.R, R/aesPC_permtest_CoxPH.R, R/aesPC_permtest_GLM.R, R/aesPC_permtest_LM.R, R/superPC_wrapper.R. NOTE: these five library() calls are required to set up the parallel workers, unless I'm completely mistaken.
  5. WARNING: Import stats, utils in DESCRIPTION as well as NAMESPACE. DONE: 20190201
  6. WARNING: Add non-empty \value sections to the following man pages: man/write_gmt.Rd/. DONE: 20190201
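
Background on WARNING 3: `F` and `T` are ordinary variables that can be reassigned, while `FALSE` and `TRUE` are reserved words. A small sketch of the failure mode (the variable names are illustrative only):

```r
# F is just a binding in base; user code can shadow it.
F <- 1
shadowed <- isTRUE(F == TRUE)   # 1 == TRUE, so F now behaves like TRUE
rm(F)                           # remove the shadowing binding again

# FALSE is a reserved word, so trying to reassign it is an error.
attempt <- try(eval(parse(text = "FALSE <- 1")), silent = TRUE)
reassign_failed <- inherits(attempt, "try-error")
```

After `rm(F)`, `F` resolves to the base value again, but package code should not rely on nobody having shadowed it.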
gabrielodom commented 5 years ago

NOTEs

  1. NOTE: Consider adding unit tests. We strongly encourage them. See http://bioconductor.org/developers/how-to/unitTesting-guidelines/.
  2. NOTE: Avoid sapply(); use vapply() found in files: aesPC_permtest_CoxPH.R (line 179, column 30) aesPC_permtest_GLM.R (line 192, column 30) aesPC_permtest_LM.R (line 173, column 30) createOmics_CheckAssay.R (line 79, column 10) createOmics_CheckAssay.R (line 83, column 16) createOmics_TrimPathwayCollection.R (line 85, column 25) createOmics_TrimPathwayCollection.R (line 93, column 30) createOmics_TrimPathwayCollection.R (line 111, column 32) superPC_wrapper.R (line 297, column 30) superPC_wrapper.R (line 298, column 32) utils_adjust_and_sort_pValues.R (line 168, column 20) utils_load_test_data_onto_PCs.R (line 82, column 16) utils_read_gmt.R (line 63, column 20) utils_read_gmt.R (line 80, column 22) utils_write_gmt.R (line 80, column 15)
  3. NOTE: Avoid 1:...; use seq_len() or seq_along() found in files: aesPC_calculate_AESPCA.R (line 92, column 20) aesPC_calculate_AESPCA.R (line 98, column 12) aesPC_calculate_AESPCA.R (line 106, column 56) aesPC_calculate_AESPCA.R (line 117, column 14) aesPC_calculate_AESPCA.R (line 150, column 24) aesPC_calculate_AESPCA.R (line 151, column 16) aesPC_calculate_AESPCA.R (line 155, column 60) aesPC_calculate_AESPCA.R (line 178, column 12) aesPC_unknown_matrixNorm.R (line 64, column 13) createOmics_CheckPathwayCollection.R (line 60, column 33) createOmics_CheckSampleIDs.R (line 29, column 35) createOmics_JoinPhenoAssay.R (line 64, column 29) superPC_model_CoxPH.R (line 105, column 13) superPC_model_CoxPH.R (line 108, column 12) superPC_model_CoxPH.R (line 146, column 10) superPC_model_CoxPH.R (line 153, column 12) superPC_model_CoxPH.R (line 174, column 12) superPC_model_CoxPH.R (line 185, column 12) superPC_model_CoxPH.R (line 190, column 11) superPC_model_CoxPH.R (line 204, column 12) superPC_model_tStats.R (line 122, column 24) superPC_model_tStats.R (line 156, column 36) superPC_model_tStats.R (line 158, column 14) superPC_model_tStats.R (line 163, column 49) superPC_model_tStats.R (line 172, column 54) superPC_model_tStats.R (line 179, column 55) superPC_model_tStats.R (line 204, column 46) superPC_modifiedSVD.R (line 65, column 18) superPC_modifiedSVD.R (line 66, column 16) superPC_modifiedSVD.R (line 67, column 18) superPC_permuteSamples.R (line 106, column 14) utils_adjust_and_sort_pValues.R (line 156, column 20) utils_multtest_pvalues.R (line 241, column 26) utils_multtest_pvalues.R (line 267, column 14) utils_multtest_pvalues.R (line 316, column 14) utils_multtest_pvalues.R (line 334, column 28) utils_write_gmt.R (line 75, column 20)
  4. NOTE: Consider adding a NEWS file, so your package news will be included in Bioconductor release announcements. NOTE: Bioconductor does not parse .md files; so NEWS must be plain text or .Rd. I don't know how to convert .md to .Rd yet.
  5. NOTE: Consider shorter lines; 389 lines (3%) are > 80 characters long.
  6. NOTE: Consider 4 spaces instead of tabs; 4 lines (0%) contain tabs. DONE: 20190201
  7. NOTE: Consider multiples of 4 spaces for line indents, 3179 lines(26%) are not.
  8. NOTE: Cannot determine whether maintainer is subscribed to the bioc-devel mailing list (requires admin credentials). Subscribe here: https://stat.ethz.ch/mailman/listinfo/bioc-devel
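
Why NOTEs 2 and 3 matter, as a short sketch with made-up inputs: both `sapply()` and `1:n` misbehave on empty input, which is the edge case `vapply()` and `seq_len()` handle.

```r
# sapply() picks its return type from the data; on empty input it
# returns an empty list instead of an empty integer vector.
genes <- list()
sapply_out <- sapply(genes, length)
vapply_out <- vapply(genes, length, integer(1))

# 1:n counts down when n is 0, so a loop over it runs twice (1, then 0).
n <- 0
bad_iters  <- length(1:n)
good_iters <- length(seq_len(n))
```

Here `sapply_out` is `list()`, `vapply_out` is `integer(0)`, `bad_iters` is 2, and `good_iters` is 0.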
gabrielodom commented 5 years ago

I need to work on the examples and the file sizes next.

gabrielodom commented 5 years ago

I've added examples to some of the functions, but they still want more:

aespca.Rd, CheckAssay.Rd, CheckPwyColl.Rd, CheckSampleIDs.Rd, ControlFDR.Rd, coxTrain_fun.Rd,
CreateOmicsPathway.Rd, ExtractAESPCs.Rd, glmTrain_fun.Rd, GumbelMixpValues.Rd,
IntersectOmicsPwyCollct.Rd, JoinPhenoAssay.Rd, lars.lsa.Rd, mysvd.Rd, olsTrain_fun.Rd,
OptimGumbelMixParams.Rd, pathway_tControl.Rd, pathway_tScores.Rd, PermTestCateg.Rd,
PermTestReg.Rd, PermTestSurv.Rd, permuteSamps.Rd, superpc.st.Rd, superpc.train.Rd,
TabulatepValues.Rd

I think I should start on any function in UpperCamel, as these are (sort-of) user facing.

gabrielodom commented 5 years ago

I've added examples to the AESPCA-related functions: ExtractAESPCs, PermTestCateg, PermTestReg, PermTestSurv, ControlFDR, and TabulatepValues.

The SuperPCA-related functions are OptimGumbelMixParams and GumbelMixpValues. The Omics-object creation functions are CheckAssay, CheckPwyColl, CheckSampleIDs, JoinPhenoAssay, and IntersectOmicsPwyCollct.

gabrielodom commented 5 years ago

The functions that still need examples are now mostly internal: aespca.Rd, coxTrain_fun.Rd, CreateOmicsPathway.Rd, glmTrain_fun.Rd, lars.lsa.Rd, mysvd.Rd, olsTrain_fun.Rd, pathway_tControl.Rd, pathway_tScores.Rd, permuteSamps.Rd, superpc.st.Rd, and superpc.train.Rd.

gabrielodom commented 5 years ago

We still aren't at 80% yet, which is super annoying. We have aespca.Rd, coxTrain_fun.Rd, glmTrain_fun.Rd, lars.lsa.Rd, mysvd.Rd, olsTrain_fun.Rd, superpc.st.Rd, and superpc.train.Rd left. The only high-level function left without an example is aespca(), and that's a few layers down.

gabrielodom commented 5 years ago

WARNINGs

  1. WARNING: The following files are over 5MB in size: .git/objects/3a/e09e0fad7ca03f82015f7e69b4ea3f6395d0b4; .git/objects/pack/pack-0d60131b0c63dbfa6cb8cfcedbd69dc3cbdcd849.pack; inst/extdata/KIRP_Surv_RNAseq_inner.RDS
  2. WARNING: The following files call library or require on pathwayPCA. This is not necessary. R/aesPC_extract_OmicsPath_PCs.R, R/aesPC_permtest_CoxPH.R, R/aesPC_permtest_GLM.R, R/aesPC_permtest_LM.R, R/superPC_wrapper.R. NOTE: these five library() calls are required to set up the parallel workers, unless I'm completely mistaken.
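
On the `library()` WARNING, one possible alternative is to namespace-qualify the calls inside the worker function instead of attaching the package on each worker. This is only a sketch, and it assumes the workers are PSOCK clusters from the parallel package, which this thread does not confirm. It uses `tools::file_ext()` as a stand-in because tools is not attached in a fresh worker session. In the package, the calls would be to pathwayPCA's own exported functions. Whether this covers everything the five flagged files need (for example, non-exported helpers) is untested here.

```r
library(parallel)

# Two local PSOCK workers; each is a fresh R session.
cl <- makeCluster(2)

# No library() call on the workers: the namespace-qualified call
# loads the namespace on demand.
exts <- parLapply(
    cl,
    c("pathways.gmt", "assay.csv"),
    function(path) tools::file_ext(path)
)

stopCluster(cl)
```

If the worker function needs objects rather than functions, `parallel::clusterExport()` is the usual way to ship them over.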
gabrielodom commented 5 years ago

NOTEs

  1. NOTE: Consider adding unit tests. We strongly encourage them. See http://bioconductor.org/developers/how-to/unitTesting-guidelines/.
  2. NOTE: Avoid sapply(); use vapply() found in files: aesPC_permtest_CoxPH.R (line 207, column 30) aesPC_permtest_GLM.R (line 220, column 30) aesPC_permtest_LM.R (line 201, column 30) createOmics_CheckAssay.R (line 83, column 10) createOmics_CheckAssay.R (line 87, column 16) createOmics_TrimPathwayCollection.R (line 95, column 25) createOmics_TrimPathwayCollection.R (line 103, column 30) createOmics_TrimPathwayCollection.R (line 121, column 32) superPC_wrapper.R (line 295, column 30) superPC_wrapper.R (line 296, column 32) utils_adjust_and_sort_pValues.R (line 205, column 20) utils_load_test_data_onto_PCs.R (line 83, column 16) utils_read_gmt.R (line 63, column 20) utils_read_gmt.R (line 80, column 22) utils_write_gmt.R (line 80, column 15)
  3. NOTE: Consider adding runnable examples to the following man pages which document exported objects: coxTrain_fun.Rd, glmTrain_fun.Rd, lars.lsa.Rd, mysvd.Rd, olsTrain_fun.Rd, superpc.st.Rd, superpc.train.Rd
  4. NOTE: Consider shorter lines; 389 lines (3%) are > 80 characters long.
  5. NOTE: Consider multiples of 4 spaces for line indents, 3179 lines(26%) are not.
  6. NOTE: Cannot determine whether maintainer is subscribed to the bioc-devel mailing list (requires admin credentials). Subscribe here: https://stat.ethz.ch/mailman/listinfo/bioc-devel
lxw391 commented 5 years ago

For WARNING #2 above: instead of saving the entire dataset in the package, maybe the vignette could use the R package RTCGA to extract the expression values.

See an example at http://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/77-facilitating-exploratory-data-visualization-application-to-tcga-genomic-data/

gabrielodom commented 5 years ago

@lizhongliu1996 will work on WARNING 1. I will finish up my work on NOTE 2 and then work on WARNING 2, using @lxw391's suggestion.

gabrielodom commented 5 years ago

ERRORs from the Bioconductor bot:

  1. ERROR: System Files found that should not be git tracked: pathwayPCA.Rproj
  2. ERROR: Running examples in 'pathwayPCA-Ex.R' failed. The error most likely occurred in: Error in optim(par = initialVals, fn = gumbelMixture, gr = gumbelMix_score, : non-finite value supplied by optim Calls: SuperPCA_pVals -> SuperPCA_pVals -> OptimGumbelMixParams -> optim
  3. ERROR: Package Source tarball exceeds Bioconductor size requirement. Package Size: 17.8203 MB Size Requirement: 5.0000 MB
  4. ERROR: Maintainer must subscribe to the bioc-devel mailing list. Subscribe here: https://stat.ethz.ch/mailman/listinfo/bioc-devel

gabrielodom commented 5 years ago

WARNINGs from the bot:

  1. WARNING: Update R version dependency from 3.5 to 3.6.
  2. WARNING: The following files are over 5MB in size: 'inst/extdata/KIRP_RNAseq_WPsubset_20190207.RDS'
  3. WARNING: The following files call library or require on pathwayPCA. This is not necessary. R/aesPC_extract_OmicsPath_PCs.R, R/aesPC_permtest_CoxPH.R, R/aesPC_permtest_GLM.R, R/aesPC_permtest_LM.R, R/superPC_wrapper.R

gabrielodom commented 5 years ago

Link to the Bioconductor submission: https://github.com/Bioconductor/Contributions/issues/1000

gabrielodom commented 5 years ago

New Bioconductor build report: http://bioconductor.org/spb_reports/pathwayPCA_buildreport_20190214092513.html

gabrielodom commented 5 years ago

ERRORs:

  1. (all OS) ERROR: Maintainer must subscribe to the bioc-devel mailing list. Subscribe here: https://stat.ethz.ch/mailman/listinfo/bioc-devel
  2. (Mac & Linux) ERROR: Package Source tarball exceeds Bioconductor size requirement. Package Size: 5.2239 MB; Size Requirement: 5.0000 MB

@lizhongliu1996, is there anything else we can cut? The Supplement2-Importing_Data vignette is over 5MB as an html file, so maybe start here? There are only two images in this vignette, but maybe we can cut the resolution?
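
To see what else could be cut, a quick way to list oversized files from R is sketched below. `find_large_files()` is a hypothetical helper, not part of pathwayPCA or BiocCheck, and treating 1 MB as 1024^2 bytes is an assumption about how the 5MB limit is measured. The second half shows the `compress = "xz"` option of `saveRDS()`, which is slower to write but often produces smaller `.RDS` files than the gzip default. The data there is made up.

```r
# Return the files under `dir` larger than `limit_mb` megabytes,
# as a named numeric vector of sizes in MB.
find_large_files <- function(dir = ".", limit_mb = 5) {
    paths <- list.files(dir, recursive = TRUE, full.names = TRUE)
    size_mb <- file.size(paths) / 1024^2
    names(size_mb) <- paths
    size_mb[size_mb > limit_mb]
}

# Write the same object with the default (gzip) and with xz
# compression, then compare the sizes on disk.
demo_dir <- tempfile("sizecheck")
dir.create(demo_dir)
x <- rep(seq_len(1000), times = 100)
saveRDS(x, file.path(demo_dir, "gz.RDS"))
saveRDS(x, file.path(demo_dir, "xz.RDS"), compress = "xz")
sizes <- file.size(file.path(demo_dir, c("gz.RDS", "xz.RDS")))
```

Running `find_large_files("inst")` from the package root would list anything in `inst/` above the limit.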

gabrielodom commented 5 years ago

WARNINGs:

  1. (Windows & Linux) WARNING: The following files are over 5MB in size: 'inst/doc/Supplement2-Importing_Data.html'
  2. (all OS) WARNING: The following files call library or require on pathwayPCA. This is not necessary. R/aesPC_extract_OmicsPath_PCs.R, R/aesPC_permtest_CoxPH.R, R/aesPC_permtest_GLM.R, R/aesPC_permtest_LM.R, R/superPC_wrapper.R

lizhongliu1996 commented 5 years ago

BiocCheck results on Linux workstation: [screenshot from 2019-02-15 11-29-32]

gabrielodom commented 5 years ago

We received confirmation that we were added to the Bioc-Devel mailing list on Tuesday afternoon, 19 February, at 15:36 Eastern. We will make some additional edits to clean up the NOTEs, and trigger a re-build tomorrow.

gabrielodom commented 5 years ago

Updated BiocCheck output: [image]

gabrielodom commented 5 years ago

We are at WARNINGs only for Bioconductor: http://bioconductor.org/spb_reports/pathwayPCA_buildreport_20190220100348.html

One WARNING and four NOTEs:

  1. WARNING: The following files call library or require on pathwayPCA. This is not necessary. R/aesPC_extract_OmicsPath_PCs.R, R/aesPC_permtest_CoxPH.R, R/aesPC_permtest_GLM.R, R/aesPC_permtest_LM.R, R/superPC_wrapper.R
  2. NOTE: Recommended function length <= 50 lines. There are 23 functions > 50 lines.
  3. NOTE: Consider adding unit tests. We strongly encourage them.
  4. NOTE: Consider shorter lines; 396 lines (3%) are > 80 characters long.
  5. NOTE: Consider multiples of 4 spaces for line indents, 3552 lines (26%) are not.

gabrielodom commented 5 years ago

Now that we have the Bioconductor team review, we will migrate this issue to Issue #68.