Closed msweetlove closed 2 years ago
Thanks for submitting to rOpenSci, our editors and @ropensci-review-bot will reply soon. Type `@ropensci-review-bot help` for help.
:rocket:
Editor check started
:wave:
@msweetlove The editor check failed because your `DESCRIPTION` file has incorrectly formatted dependency lists. They must have a space after the ">=" symbols, like `worrms (>= 0.4.2)`, and not the current form, `worrms (>=0.4.2)`. The `RecordLinkage` entry also needs a space before the opening bracket. Please ping here once you've updated that and we'll re-run the checks. Thanks!
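The spacing rule described above can be sketched in a few lines of R. This is only an illustration: `normalise_dep()` is a hypothetical helper, not part of the package or of the rOpenSci checks, and the `RecordLinkage` version number is made up for the example.

```r
# Hypothetical helper: rewrite entries such as "pkg(>=1.0)" as "pkg (>= 1.0)".
normalise_dep <- function(dep) {
  dep <- trimws(dep)
  # exactly one space before the opening bracket
  dep <- sub("\\s*\\(", " (", dep)
  # exactly one space after the comparison operator
  dep <- sub("\\((>=|<=|==|>|<)\\s*", "(\\1 ", dep)
  dep
}

# The RecordLinkage version below is invented for illustration only.
deps <- c("worrms (>=0.4.2)", "RecordLinkage(>= 0.4)", "tidyr")
fixed <- normalise_dep(deps)
print(fixed)
```

Entries without a version constraint, such as `tidyr` here, pass through unchanged.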
@mpadge the spaces have been added to the DESCRIPTION file in the repo: https://github.com/biodiversity-aq/OmicsMetaData
git hash: 37089da9
Important: All failing checks above must be addressed prior to proceeding
Package License: GNU General Public License v3 (GLP-3.0) https://www.gnu.org/licenses/gpl-3.0.en.html
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
The package has:

- code in R (100% in 10 files) and
- 1 authors
- 5 vignettes
- 10 internal data files
- 5 imported packages
- 37 exported functions (median 60 lines of code)
- 44 non-exported functions in R (median 49 lines of code)

---

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages

The following terminology is used:

- `loc` = "Lines of Code"
- `fn` = "function"
- `exp`/`not_exp` = exported / not exported

The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

|measure                 | value| percentile|noteworthy |
|:-----------------------|-----:|----------:|:----------|
|files_R                 |    10|       55.4|           |
|files_vignettes         |     0|        0.0|TRUE       |
|files_tests             |     8|       85.7|           |
|loc_R                   |  2829|       89.3|           |
|loc_tests               |   465|       70.5|           |
|num_vignettes           |     5|       97.5|TRUE       |
|data_size_total         | 50800|       79.9|           |
|data_size_median        |  2304|       69.4|           |
|n_fns_r                 |    81|       64.4|           |
|n_fns_r_exported        |    37|       81.3|           |
|n_fns_r_not_exported    |    44|       55.1|           |
|n_fns_per_file_r        |     4|       56.5|           |
|num_params_per_fn       |     2|       10.7|           |
|loc_per_fn_r            |    53|       95.3|TRUE       |
|loc_per_fn_r_exp        |    60|       85.1|           |
|loc_per_fn_r_not_exp    |    50|       95.3|TRUE       |
|rel_whitespace_R        |    11|       80.4|           |
|rel_whitespace_tests    |    20|       87.8|           |
|doclines_per_fn_exp     |    35|       41.6|           |
|doclines_per_fn_not_exp |     0|        0.0|TRUE       |
|fn_call_network_size    |    66|       68.6|           |

---
Interactive network visualisation of calls between objects in package can be viewed by clicking here
`goodpractice` and other checks

---

#### 3b. `goodpractice` results

#### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/)

R CMD check generated the following error:

1. checking examples ... ERROR Running examples in ‘OmicsMetaData-Ex.R’ failed The error most likely occurred in: > ### Name: sync.metadata.sequenceFiles > ### Title: Check if all samples in a dataframe have sequence data > ### Aliases: sync.metadata.sequenceFiles > > ### ** Examples > > \donttrun{ Error: unexpected symbol in "\donttrun" Execution halted

R CMD check generated the following warnings:

1. checking whether package ‘OmicsMetaData’ can be installed ... WARNING Found the following significant warnings: Warning: /tmp/RtmpCIJwfC/file39e15a33ceb0/OmicsMetaData.Rcheck/00_pkg_src/OmicsMetaData/man/sync.metadata.sequenceFiles.Rd:30: unknown macro '\donttrun' See ‘/tmp/RtmpCIJwfC/file39e15a33ceb0/OmicsMetaData.Rcheck/00install.out’ for details.
2. checking R files for non-ASCII characters ... WARNING Found the following file with non-ASCII characters: General_Utils.R Portable packages must use only ASCII characters in their R code, except perhaps in comments. Use \uxxxx escapes for other characters.
3. checking dependencies in R code ... WARNING '::' or ':::' import not declared from: ‘RCurl’ Namespaces in Imports field not imported from: ‘Orcs’ ‘xml2’ All declared Imports should be used. Packages in Depends field not imported from: ‘mapview’ ‘RecordLinkage’ ‘rgbif’ ‘tidyr’ These packages need to be imported from (in the NAMESPACE file) for when this namespace is loaded but not attached. package 'methods' is used but not declared
4. checking Rd files ... WARNING prepare_Rd: man/sync.metadata.sequenceFiles.Rd:30: unknown macro '\donttrun'
5. checking for missing documentation entries ... WARNING Undocumented code objects: ‘ENA_allowed_terms’ ‘ENA_checklistAccession’ ‘ENA_geoloc’ ‘ENA_instrument’ ‘ENA_select’ ‘ENA_strat’ ‘TaxIDLib’ ‘TermsLib’ ‘TermsSyn’ ‘TermsSyn_DwC’ Undocumented data sets: ‘ENA_allowed_terms’ ‘ENA_checklistAccession’ ‘ENA_geoloc’ ‘ENA_instrument’ ‘ENA_select’ ‘ENA_strat’ ‘TaxIDLib’ ‘TermsLib’ ‘TermsSyn’ ‘TermsSyn_DwC’ All user-level objects in a package should have documentation entries. See chapter ‘Writing R documentation files’ in the ‘Writing R Extensions’ manual.
6. checking for code/documentation mismatches ... WARNING Functions or methods with usage in documentation object 'dataQC.TaxonListFromData' but not in code: ‘find.sampleTaxon’ Codoc mismatches from documentation object 'wideTable.to.eMoF': wideTable.to.eMoF Code: function(metadata.object, variables = NA) Docs: function(dataset) Argument names in code not in docs: metadata.object variables Argument names in docs not in code: dataset Mismatches in argument names: Position: 1 Code: metadata.object Docs: dataset
7. checking Rd \usage sections ... WARNING Objects in \usage without \alias in documentation object 'dataQC.TaxonListFromData': ‘find.sampleTaxon’ Documented arguments not in \usage in documentation object 'dataQC.completeTaxaNamesFromRegistery': ‘taxBackbone’ Undocumented arguments in documentation object 'prep.metadata.ENA' ‘library.layout’ ‘library.strategy’ ‘library.selection’ Documented arguments not in \usage in documentation object 'prep.metadata.ENA': ‘library_layout’ ‘library_strategy’ ‘library_selection’ Undocumented arguments in documentation object 'show,DwC.event-method' ‘object’ Undocumented arguments in documentation object 'show,DwC.occurrence-method' ‘object’ Undocumented arguments in documentation object 'show,MIxS.metadata-method' ‘object’ Undocumented arguments in documentation object 'wideTable.to.eMoF' ‘dataset’ Documented arguments not in \usage in documentation object 'wideTable.to.eMoF': ‘metadata.object’ ‘variables’ Bad \usage lines found in documentation object 'FileNames.to.Table': FileNames.to.Table (file.dir, paired=TRUE, seq.file.extension=".fastq.gz", pairedEnd.extension=c("_1", "_2") Bad \usage lines found in documentation object 'dataQC.DwC': DataQC.DwC(Event=NA, Occurrence=NA, eMoF=NA, EML.url=NA, out.type="event", ask.input=TRUE)) Bad \usage lines found in documentation object 'sync.metadata.sequenceFiles': sync.metadata.sequenceFiles <- function(Names, file.dir=NULL, paired=TRUE, seq.file.extension=".fastq.gz", pairedEnd.extension=c("_1", "_2")) Functions with \usage entries need to have the appropriate \alias entries, and all their arguments documented. The \usage entries must correspond to syntactically valid R code. See chapter ‘Writing R documentation files’ in the ‘Writing R Extensions’ manual.
8. checking for unstated dependencies in examples ... WARNING Warning: parse error in file 'lines': 2: unexpected symbol 578: 579: \donttrun ^
9. checking files in ‘vignettes’ ... WARNING Files in the 'vignettes' directory but no files in 'inst/doc': ‘Background.Rmd’, ‘General_Overview.Rmd’, ‘Metadata_standardization.Rmd’, ‘Perpare_data_for_archiving.Rmd’, ‘Retrieving_online_data.Rmd’ Package has no Sweave vignette sources and no VignetteBuilder field.

R CMD check generated the following notes:

1. checking DESCRIPTION meta-information ... NOTE Malformed Title field: should not end in a period. Malformed Description field: should contain one or more complete sentences. Non-standard license specification: GNU General Public License v3 (GLP-3.0) https://www.gnu.org/licenses/gpl-3.0.en.html Standardizable: FALSE
2. checking R code for possible problems ... NOTE combine.data: no visible global function definition for ‘new’ commonTax.to.NCBI.TaxID: no visible binding for global variable ‘TaxIDLib’ dataQC.DwC: no visible global function definition for ‘new’ dataQC.DwC_general: no visible binding for global variable ‘TermsLib’ dataQC.findNames: no visible binding for global variable ‘TermsSyn’ dataQC.MIxS: no visible binding for global variable ‘TermsSyn’ dataQC.MIxS: no visible binding for global variable ‘TermsLib’ dataQC.MIxS: no visible binding for global variable ‘ENA_checklistAccession’ dataQC.MIxS: no visible global function definition for ‘new’ dataQC.TermsCheck: no visible binding for global variable ‘TermsLib’ download.sequences.INSDC: no visible global function definition for ‘download.file’ download.sequences.INSDC: no visible global function definition for ‘read.table’ eMoF.to.wideTable: no visible binding for global variable ‘eventID’ eMoF.to.wideTable: no visible binding for global variable ‘measurementValue’ eMoF.to.wideTable: no visible binding for global variable ‘occurrenceID’ FileNames.to.Table: no visible binding for global variable ‘rv_out’ get.BioProject.metadata.INSDC: no visible global function definition for ‘download.file’ get.BioProject.metadata.INSDC: no visible global function definition for ‘read.csv’ get.ENAName: no visible binding for global variable ‘TermsLib’ get.sample.attributes.INSDC: no visible global function definition for ‘read_xml’ get.sample.attributes.INSDC: no visible global function definition for ‘xml_find_all’ get.sample.attributes.INSDC: no visible global function definition for ‘xml_attr’ get.sample.attributes.INSDC: no visible global function definition for ‘xml_text’ get.sample.attributes.INSDC: no visible global function definition for ‘as_list’ prep.metadata.ENA: no visible binding for global variable ‘ENA_checklistAccession’ prep.metadata.ENA: no visible binding for global variable ‘ENA_geoloc’ prep.metadata.ENA: no visible global function definition for ‘separate’ prep.metadata.ENA: no visible binding for global variable ‘ENA_instrument’ prep.metadata.ENA: no visible binding for global variable ‘ENA_select’ prep.metadata.ENA: no visible binding for global variable ‘ENA_strat’ prep.metadata.ENA: no visible global function definition for ‘write.table’ term.definition: no visible binding for global variable ‘TermsLib’ term.definition: no visible binding for global variable ‘TermsSyn’ wideTable.to.eMoF: no visible global function definition for ‘gather’ wideTable.to.eMoF: no visible binding for global variable ‘measurementType’ wideTable.to.eMoF: no visible binding for global variable ‘measurementValue’ write.MIxS: no visible global function definition for ‘write.csv’ show,DwC.event: no visible binding for global variable ‘EML.url’ show,DwC.occurrence: no visible binding for global variable ‘EML.url’ Undefined global functions or variables: as_list download.file EML.url ENA_checklistAccession ENA_geoloc ENA_instrument ENA_select ENA_strat eventID gather measurementType measurementValue new occurrenceID read_xml read.csv read.table rv_out separate TaxIDLib TermsLib TermsSyn write.csv write.table xml_attr xml_find_all xml_text Consider adding importFrom("methods", "new") importFrom("utils", "download.file", "read.csv", "read.table", "write.csv", "write.table") to your NAMESPACE file (and ensure that your DESCRIPTION Imports field contains 'methods').

R CMD check generated the following check_fails:

1. cyclocomp
2. no_description_depends
3. description_url
4. description_bugreports
5. rcmdcheck_malformed_title_or_description
6. rcmdcheck_r_files_are_ascii
7. rcmdcheck_undeclared_imports
8. rcmdcheck_undefined_globals
9. rcmdcheck_missing_docs
10. rcmdcheck_code_docs_mismatch
11. rcmdcheck_unstated_dependencies_in_examples
12. rcmdcheck_examples_run
13. rcmdcheck_examples_run_without_warnings
14. rcmdcheck_significant_compilation_warnings

#### Test coverage with [covr](https://covr.r-lib.org/)

Package coverage: 47.65

The following files are not completely covered by tests:

file | coverage
--- | ---
R/Classes_Libraries.R | 20.25%
R/DataFormat_Utils.R | 59.73%
R/DataQC_Main_DwCgeneral.R | 32.12%
R/DataQC_Main_MIxS.R | 50%
R/DataQC_Utils.R | 55.99%
R/Format_sequenceData_ENA.R | 43.8%
R/Get_SequenceData_INSDC.R | 29.29%
R/OmicsMetaData.R | 0%

#### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp)

The following functions have cyclocomplexity >= 15:

function | cyclocomplexity
--- | ---
prep.metadata.ENA | 144
dataQC.MIxS | 117
dataQC.DwC_general | 73
dataQC.TermsCheck | 37
dataQC.dateCheck | 35
dataQC.eventStructure | 35
dataQC.guess.env_package.from.data | 35
combine.data.frame | 30
combine.data | 26
dataQC.generate.footprintWKT | 26
dataQC.LatitudeLongitudeCheck | 25
dataQC.findNames | 23
download.sequences.INSDC | 22
dataQC.DwC | 20
sync.metadata.sequenceFiles | 16

#### Static code analyses with [lintr](https://github.com/jimhester/lintr)

[lintr](https://github.com/jimhester/lintr) found the following 875 potential issues:

message | number of times
--- | ---
Avoid 1:length(...) expressions, use seq_len. | 16
Avoid 1:ncol(...) expressions, use seq_len. | 9
Avoid 1:nrow(...) expressions, use seq_len. | 15
Avoid using sapply, consider vapply instead, that's type safe | 32
Lines should not be more than 80 characters. | 795
Use <-, not =, for assignment. | 8

|package  |version  |
|:--------|:--------|
|pkgstats |0.0.2.16 |
|pkgcheck |0.0.2.83 |
Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.
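The "no visible global function definition" notes in the report above are usually addressed with the `importFrom()` directives that the check itself suggests. An alternative is to namespace-qualify each call. The sketch below illustrates that alternative only; `DemoMetadata` and `read_demo_metadata()` are hypothetical names, not objects from OmicsMetaData, and the packages used would still need to be listed under Imports in `DESCRIPTION`.

```r
# Hypothetical S4 class, defined only to keep the sketch self-contained.
methods::setClass("DemoMetadata", slots = c(data = "data.frame"))

# Hypothetical reader: every call to utils and methods is namespace-qualified,
# so R CMD check can resolve read.csv() and new() without importFrom().
read_demo_metadata <- function(file) {
  tab <- utils::read.csv(file, stringsAsFactors = FALSE)
  methods::new("DemoMetadata", data = tab)
}

# Write a tiny made-up table to a temporary file and read it back.
tmp <- tempfile(fileext = ".csv")
utils::write.csv(data.frame(sample = c("a", "b"), depth = c(1, 2)),
                 tmp, row.names = FALSE)
obj <- read_demo_metadata(tmp)
```

Which of the two styles to use is a matter of preference; qualifying calls keeps the origin of each function visible at the call site.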
Thank you for your submission, @msweetlove! It looks like there are still a number of things needed to get this package ready for review. Please look at the report above. The first few are simple metadata components. However, we do need the package to have CI checks, >75% code coverage unless there are specific reasons, and a clean R CMD check.
Let us know when you've made these updates and we can proceed, and do ask any questions you have!
I note that with your spatial dependencies CI setup can be a little finicky, @mpadge can point you to resources if you need them.
Hi @noamross and @mpadge, I went through the list of issues, and they should be fixed and updated now. Code coverage is now 78.91%. I'm not completely sure about the CI issue though (my knowledge in that area is rather limited). I added GitHub Actions to the package, and I was wondering if this is enough? If not, I could use some help here. Cheers, Maxime
@ropensci-review-bot check package
Thanks, about to send the query.
:rocket:
Editor check started
:wave:
git hash: 853aeabe
Important: All failing checks above must be addressed prior to proceeding
Package License: GPL (>= 3)
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
The package has:

- code in R (100% in 11 files) and
- 1 authors
- 5 vignettes
- 10 internal data files
- 10 imported packages
- 37 exported functions (median 61 lines of code)
- 46 non-exported functions in R (median 48 lines of code)

---

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages

The following terminology is used:

- `loc` = "Lines of Code"
- `fn` = "function"
- `exp`/`not_exp` = exported / not exported

The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

|measure                 | value| percentile|noteworthy |
|:-----------------------|-----:|----------:|:----------|
|files_R                 |    11|       59.3|           |
|files_vignettes         |     0|        0.0|TRUE       |
|files_tests             |    10|       88.6|           |
|loc_R                   |  2877|       89.5|           |
|loc_tests               |  1281|       87.7|           |
|num_vignettes           |     5|       97.5|TRUE       |
|data_size_total         | 50800|       79.9|           |
|data_size_median        |  2304|       69.4|           |
|n_fns_r                 |    83|       65.1|           |
|n_fns_r_exported        |    37|       81.3|           |
|n_fns_r_not_exported    |    46|       56.5|           |
|n_fns_per_file_r        |     4|       53.4|           |
|num_params_per_fn       |     2|       10.7|           |
|loc_per_fn_r            |    50|       94.7|           |
|loc_per_fn_r_exp        |    61|       85.4|           |
|loc_per_fn_r_not_exp    |    48|       94.9|           |
|rel_whitespace_R        |    12|       81.7|           |
|rel_whitespace_tests    |    16|       93.7|           |
|doclines_per_fn_exp     |    37|       44.9|           |
|doclines_per_fn_not_exp |     0|        0.0|TRUE       |
|fn_call_network_size    |    66|       68.6|           |

---
Interactive network visualisation of calls between objects in package can be viewed by clicking here
`goodpractice` and other checks

#### 3a. Continuous Integration Badges

[![github](https://github.com/biodiversity-aq/OmicsMetaData/workflows/R-CMD-check/badge.svg)](https://github.com/biodiversity-aq/OmicsMetaData/actions)

**GitHub Workflow Results**

|name        |conclusion |sha    |date       |
|:-----------|:----------|:------|:----------|
|R-CMD-check |success    |853aea |2021-10-25 |

---

#### 3b. `goodpractice` results

#### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/)

R CMD check generated the following note:

1. checking dependencies in R code ... NOTE Namespace in Imports field not imported from: ‘Orcs’ All declared Imports should be used.

R CMD check generated the following check_fails:

1. cyclocomp
2. no_description_date
3. rcmdcheck_imports_not_imported_from

#### Test coverage with [covr](https://covr.r-lib.org/)

Package coverage: 78.91

#### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp)

The following functions have cyclocomplexity >= 15:

function | cyclocomplexity
--- | ---
prep.metadata.ENA | 145
dataQC.MIxS | 117
dataQC.DwC_general | 75
dataQC.TermsCheck | 37
dataQC.dateCheck | 35
dataQC.eventStructure | 35
dataQC.guess.env_package.from.data | 35
combine.data.frame | 30
combine.data | 26
dataQC.generate.footprintWKT | 26
dataQC.LatitudeLongitudeCheck | 25
dataQC.findNames | 23
download.sequences.INSDC | 23
dataQC.DwC | 20
sync.metadata.sequenceFiles | 16

#### Static code analyses with [lintr](https://github.com/jimhester/lintr)

[lintr](https://github.com/jimhester/lintr) found the following 1055 potential issues:

message | number of times
--- | ---
Avoid 1:length(...) expressions, use seq_len. | 16
Avoid 1:ncol(...) expressions, use seq_len. | 9
Avoid 1:nrow(...) expressions, use seq_len. | 17
Avoid using sapply, consider vapply instead, that's type safe | 32
Lines should not be more than 80 characters. | 973
Use <-, not =, for assignment. | 8

|package  |version  |
|:--------|:--------|
|pkgstats |0.0.2.16 |
|pkgcheck |0.0.2.86 |
Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.
@noamross and @mpadge: added an example for the function commonTax.to.NCBI.TaxID
@ropensci-review-bot assign @jooolia as editor
Assigned! @jooolia is now the editor
@ropensci-review-bot check package
Thanks, about to send the query.
:rocket:
Editor check started
:wave:
git hash: 0d8728ac
Important: All failing checks above must be addressed prior to proceeding
Package License: GPL (>= 3)
This package features some noteworthy statistical properties which may need to be clarified by a handling editor prior to progressing.
The package has:

- code in R (100% in 11 files) and
- 1 authors
- 5 vignettes
- 10 internal data files
- 10 imported packages
- 37 exported functions (median 61 lines of code)
- 46 non-exported functions in R (median 48 lines of code)

---

Statistical properties of package structure as distributional percentiles in relation to all current CRAN packages

The following terminology is used:

- `loc` = "Lines of Code"
- `fn` = "function"
- `exp`/`not_exp` = exported / not exported

The final measure (`fn_call_network_size`) is the total number of calls between functions (in R), or more abstract relationships between code objects in other languages. Values are flagged as "noteworthy" when they lie in the upper or lower 5th percentile.

|measure                 | value| percentile|noteworthy |
|:-----------------------|-----:|----------:|:----------|
|files_R                 |    11|       59.3|           |
|files_vignettes         |     0|        0.0|TRUE       |
|files_tests             |    10|       88.6|           |
|loc_R                   |  2877|       89.5|           |
|loc_tests               |  1281|       87.7|           |
|num_vignettes           |     5|       97.5|TRUE       |
|data_size_total         | 50800|       79.9|           |
|data_size_median        |  2304|       69.4|           |
|n_fns_r                 |    83|       65.1|           |
|n_fns_r_exported        |    37|       81.3|           |
|n_fns_r_not_exported    |    46|       56.5|           |
|n_fns_per_file_r        |     4|       53.4|           |
|num_params_per_fn       |     2|       10.7|           |
|loc_per_fn_r            |    50|       94.7|           |
|loc_per_fn_r_exp        |    61|       85.4|           |
|loc_per_fn_r_not_exp    |    48|       94.9|           |
|rel_whitespace_R        |    12|       81.7|           |
|rel_whitespace_tests    |    16|       93.7|           |
|doclines_per_fn_exp     |    37|       44.9|           |
|doclines_per_fn_not_exp |     0|        0.0|TRUE       |
|fn_call_network_size    |    66|       68.6|           |

---
Interactive network visualisation of calls between objects in package can be viewed by clicking here
`goodpractice` and other checks

#### 3a. Continuous Integration Badges

[![github](https://github.com/biodiversity-aq/OmicsMetaData/workflows/R-CMD-check/badge.svg)](https://github.com/biodiversity-aq/OmicsMetaData/actions)

**GitHub Workflow Results**

|name        |conclusion |sha    |date       |
|:-----------|:----------|:------|:----------|
|R-CMD-check |success    |0d8728 |2021-10-25 |

---

#### 3b. `goodpractice` results

#### `R CMD check` with [rcmdcheck](https://r-lib.github.io/rcmdcheck/)

R CMD check generated the following note:

1. checking dependencies in R code ... NOTE Namespace in Imports field not imported from: ‘Orcs’ All declared Imports should be used.

R CMD check generated the following check_fails:

1. cyclocomp
2. no_description_date
3. rcmdcheck_imports_not_imported_from

#### Test coverage with [covr](https://covr.r-lib.org/)

Package coverage: 78.91

#### Cyclocomplexity with [cyclocomp](https://github.com/MangoTheCat/cyclocomp)

The following functions have cyclocomplexity >= 15:

function | cyclocomplexity
--- | ---
prep.metadata.ENA | 145
dataQC.MIxS | 117
dataQC.DwC_general | 75
dataQC.TermsCheck | 37
dataQC.dateCheck | 35
dataQC.eventStructure | 35
dataQC.guess.env_package.from.data | 35
combine.data.frame | 30
combine.data | 26
dataQC.generate.footprintWKT | 26
dataQC.LatitudeLongitudeCheck | 25
dataQC.findNames | 23
download.sequences.INSDC | 23
dataQC.DwC | 20
sync.metadata.sequenceFiles | 16

#### Static code analyses with [lintr](https://github.com/jimhester/lintr)

[lintr](https://github.com/jimhester/lintr) found the following 1055 potential issues:

message | number of times
--- | ---
Avoid 1:length(...) expressions, use seq_len. | 16
Avoid 1:ncol(...) expressions, use seq_len. | 9
Avoid 1:nrow(...) expressions, use seq_len. | 17
Avoid using sapply, consider vapply instead, that's type safe | 32
Lines should not be more than 80 characters. | 973
Use <-, not =, for assignment. | 8

|package  |version  |
|:--------|:--------|
|pkgstats |0.0.2.16 |
|pkgcheck |0.0.2.86 |
Processing may not proceed until the items marked with :heavy_multiplication_x: have been resolved.
Hi @msweetlove, looks good to me. I see that you have added the example to commonTax.to.NCBI.TaxID.Rd but not updated the documentation. I will proceed with looking for reviewers but it would be great if you could update your docs. I think there are many of the {goodpractice} lintr comments that could be incorporated (e.g. regarding assignment, seq_len, vapply and long lines).
Thanks, Julia
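For readers unfamiliar with the lintr messages mentioned above, here is a minimal sketch of why `seq_len()` and `vapply()` are preferred. The vectors are made up for illustration and are not taken from the package.

```r
# With an empty input, 1:length(x) counts down from 1 to 0,
# so a for loop over it would run twice instead of not at all.
x <- character(0)
bad_idx <- 1:length(x)          # c(1L, 0L)
good_idx <- seq_len(length(x))  # integer(0); seq_along(x) is equivalent

# vapply() states the expected type and length of each result and
# fails loudly if a call returns anything else; sapply() would not.
vals <- list(a = 1:3, b = 4:6)
counts <- vapply(vals, length, FUN.VALUE = integer(1))
```

The assignment and line-length messages are purely stylistic and can typically be handled by an automatic formatter.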
Thanks, I was already wondering where that commonTax.to.NCBI.TaxID example had gone to... The documentation has been updated now! Cheers, Maxime
@ropensci-review-bot add @orchid00 to reviewers
@orchid00 added to the reviewers list. Review due date is 2021-12-01. Thanks @orchid00 for accepting to review! Please refer to our reviewer guide.
@orchid00: If you haven't done so, please fill this form for us to update our reviewers records.
Hello @orchid00! Thanks for agreeing to review. I am still looking for a second reviewer, and we agreed that your review would have a later due date of December 15th (though if it is done earlier, that is great).
Dear @msweetlove, I am still looking for a second reviewer. Thanks, Julia
@ropensci-review-bot add @ginberg to reviewers
@ginberg added to the reviewers list. Review due date is 2021-12-26. Thanks @ginberg for accepting to review! Please refer to our reviewer guide.
@ginberg: If you haven't done so, please fill this form for us to update our reviewers records.
Dear @ginberg thanks for agreeing to review!
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
`URL`, `BugReports` and `Maintainer` (which may be autogenerated via `Authors@R`).
Estimated hours spent reviewing: 4
README
Check:
Vignettes:
Automated tests:
Packaging guidelines:
Thanks very much @ginberg for the review. (sorry for my slow response)
Hi @orchid00, do you think it will be possible to submit your review soon? Thanks, Julia
I'm sorry, I was not able to do the review, and I prefer to step out. I was on leave first, then I got ill. Now I am back to too many things to accommodate.
Hi @orchid00, Thanks for letting me know. Hope you are feeling better and wishing you the best.
I will look for another reviewer @msweetlove. Thanks!
@ropensci-review-bot add @cpalmer718 to reviewers
Dear @cpalmer718, thank you for agreeing to review! The due date for your review is 2022-02-19. Please refer to our reviewer guide. If you have any questions feel free to ask here or via email. Thanks, Julia
Hi @msweetlove @jooolia please see below for my review. I've added commentary at the bottom to explain some of it. I'm trying to return this early because I wasn't able to clearly evaluate parts of the package on my system, so we may need to iterate a bit.
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
`URL`, `BugReports` and `Maintainer` (which may be autogenerated via `Authors@R`).
Estimated hours spent reviewing: 10
`enc2utf8` fixes the problem.

Error in download.file(ftp_url, destfile = file.path(destination.path, : scheme not supported in URL 'ftp.sra.ebi.ac.uk/vol1/fastq/SRR298/004/SRR2980684/SRR2980684_1.fastq.gz'.

Apparently the "auto" setting is causing Windows to choose a download method that doesn't support ftp. This is seemingly resolved by setting the method parameter in `download.file` to "libcurl" as it is available on my system. I'd recommend going through the `download.file` documentation and coming up with a logic chain that hopefully guarantees the download will succeed for any system (which may involve requiring non-R system dependencies), or otherwise exposing the method parameter to the user so they can override the system's choice without editing the library's source code.

`libudunits2`, `gdal`, etc.)?

`author: "{name}"` tags to YAML headers of vignette .Rmd files

`workingDir` is defined in `Metadata_standardization` but then not used, and in the final statement `write.MIxS` function emits relative to working directory

`Retrieving online data`, the example in block `download_sequences` emits the warning `the metadata will be retruned to the Console If you did assign the output of this function to an R-object (using "<-"): better abort and restart now`; since the result is assigned to a variable in the vignette example, this is really confusing as a new user.

`.DS_Store`), should be removed from the repo. The appropriate file patterns should be added to `.gitignore` at top level to prevent their addition to the repo.

`master` to `main` or `default` as preferred.

`R/sysdata.rda` has something called "MarsLib" in it; is that expected to be there? It seems like several data structures are being loaded directly from saved data files, which are provided as jagged data frames in the package namespace. It works, but it's somewhat unwieldy to work with. It would be nice to have some of the prestock library content available within the library as YAML files or something. Regardless, it really seems like the jagged data frames should be rather lists of vectors, so that when the user accesses one of the shorter vectors, they don't accidentally end up using the many empty string entries at the end of the vector.

`styler` and `lintr`. In particular:

`sapply` and some other similar things, and I think that would be good. In particular, there are some statements (e.g. `unlist(unname(sapply(paste)))`) that I think could be made much more straightforward by using, for example, paste's vectorized behavior. This doesn't seem to cause any bugs that I've noticed, but it makes it difficult to read and maintain, and depending on the size of your input datasets, some of the for loops might cause some issues. I don't think it really will cause performance degradation at realistic dataset sizes, but it would just make things a lot cleaner and consistent with packaging best practices.
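The point about paste's vectorized behavior can be illustrated with a small sketch. The nested form below is a guess at the general shape of the pattern the review describes, not code copied from the package, and the second accession is made up.

```r
# Made-up inputs; only the first accession appears in the thread.
ids <- c("SRR2980684", "SRR2980685")
ext <- "_1"

# Nested form: apply paste() element by element, then strip names and flatten.
nested <- unlist(unname(sapply(ids, function(id) {
  paste(id, ext, ".fastq.gz", sep = "")
})))

# Vectorised form: paste0() already recycles over the whole vector.
vectorised <- paste0(ids, ext, ".fastq.gz")
```

Both forms give the same character vector, so the second can replace the first without changing behavior.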
Thank you for inviting me to review! In particular with the Windows system issues, I'd really enjoy taking another look once things are working on the platform.
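The suggestion in the review to expose the `method` parameter of `download.file` could look roughly like the sketch below. Both function names are hypothetical and the fallback logic is an assumption for illustration, not a tested guarantee that downloads succeed on every system.

```r
# Hypothetical helper: pick a download method, letting the caller override it.
choose_download_method <- function(method = NULL,
                                   has_libcurl = unname(capabilities("libcurl"))) {
  if (!is.null(method)) {
    return(method)  # an explicit user choice always wins
  }
  # Assumption: prefer libcurl when this R build supports it, else defer to "auto".
  if (isTRUE(has_libcurl)) "libcurl" else "auto"
}

# Hypothetical wrapper around utils::download.file() that exposes `method`.
download_sequence_file <- function(url, destfile, method = NULL, ...) {
  utils::download.file(url,
                       destfile = destfile,
                       method = choose_download_method(method),
                       ...)
}
```

A user hitting the Windows problem described above could then call `download_sequence_file(url, destfile, method = "libcurl")` without editing the package source.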
Dear @msweetlove, have you had time to look at the reviews? Do you know when you will have time to respond? Thanks, Julia
Email sent to author at naturalsciences.be address.
Email bounced. Tried an address @ vlaanderen.be
Dear @ginberg and @cpalmer718, Thank you very much for your reviews. We appreciate the time and care you put into them. The author has switched jobs and cannot continue with the development of this package and unfortunately there is no-one at his former workplace who can carry on with the work so the package development will be stopped. This means that we will close this review now. Please let me know how many hours you each spent reviewing (you can estimate if you can't exactly remember as it was quite some time ago) and we can log this in our reviewer database.
Thanks again for your help and input, Julia
@jooolia okay, that's too bad. I spent around 4h and @cpalmer718 seems to have spent 10h (see the review)
Thanks @ginberg! Yes somehow I missed both of your hours. :dizzy_face: thanks for pointing out that the info was already there!
@jooolia I'm sorry to hear it, but thanks for inviting me regardless. Yes, that was about the number of hours I spent, thanks @ginberg
Submitting Author Name: Maxime Sweetlove
Submitting Author Github Handle: @msweetlove
Other Package Authors Github handles: @ymgan, @Antonarctica
Repository: https://github.com/biodiversity-aq/OmicsMetaData
Version submitted: 0.0.1
Submission type: Standard
Editor: @jooolia
Reviewers: @ginberg, @cpalmer718
Due date for @ginberg: 2021-12-26
Due date for @cpalmer718: 2022-02-19
Archive: TBD
Version accepted: TBD
Scope
Please indicate which category or categories from our package fit policies this package falls under:
Explain how and why the package falls under these categories (briefly, 1-2 sentences): data retrieval: The OmicsMetaData package allows users to download nucleotide sequences from INSDC alongside any associated metadata linked to the sequences, based on a BioProject identification number.
data munging: The OmicsMetaData package provides tools to format biodiversity 'omics metadata following the widely used data standards MIxS (for 'Omics data) and DarwinCore (for biodiversity data). Standardizing datasets makes them much more easily interoperable between researchers and institutions and across time, and makes it easier for users to archive the metadata alongside the sequences on the INSDC databases.
Who is the target audience and what are scientific applications of this package? Scientists that work with biodiversity 'omics datasets on a daily basis, and have a need to standardize the data to exchange it between colleagues or archive it online after the end of a project, or if they want to expand their dataset with online openly available sequence data.
Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category? At present I am not aware of any such packages.
(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research? Not applicable.
If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted. Not applicable.
Technical checks
Confirm each of the following by checking the box.
This package:
Publication options
[ ] Do you intend for this package to go on CRAN?
[ ] Do you intend for this package to go on Bioconductor?
[ ] Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
MEE Options
- [ ] The package is novel and will be of interest to the broad readership of the journal.
- [ ] The manuscript describing the package is no longer than 3000 words.
- [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html))
- (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*)
- (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*)
- (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct