Closed cboettig closed 5 years ago
Thanks for your submission, @cboettig. I have run the editor's checks below. Please address the requested changes below. I have begun looking for reviewers and I will update this thread with the reviewers and due date once all reviewers have agreed to take this on.
Changes for package authors:
[![](https://badges.ropensci.org/269_status.svg)](https://github.com/ropensci/software-review/issues/269)
LICENSE
file should be filled out or removed.test-template.R:11: failure: template knows about internal classes too
`print\(template\("ResponsibleParty"\)\)` does not match "individualName: \\{\\}".
Actual value: "\{\}"
devtools::check(run_dont_test = TRUE)
I get the following error.E checking examples (619ms)
Running examples in 'emld-Ex.R' failed
The error most likely occurred in:
> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: eml_locate_schema
> ### Title: eml_locate_schema
> ### Aliases: eml_locate_schema
>
> ### ** Examples
>
> ## No test:
> f <- system.file("examples", "example-eml-2.1.1.xml", package = "EML")
> eml <- xml2::read_xml(f)
Error: '' does not exist in current working directory ('/tmp/Rtmp2NUD8n/emld.Rcheck').
For reviewers:
[ ] Please note whether you have any difficulties with the chain of dependencies (jsonld package -> V8 package -> V8 system libraries) that leads back to V8. This is at least a potential pain point.
[ ] Check code that is not covered by the test suite. See the output of goodpractice:gp()
and covr::package_coverage()
.
✖ write unit tests for all functions, and all package code in general. 86%
of code lines are covered by test cases.
Two reviewers have agreed. Thanks to Kelly and Peter for being willing to take on this review. :tada: Typically we ask reviewers to complete their review in 3 weeks, but in this case that includes several holidays and most people's winter break. So I have set the due date with that in mind. Kelly and Peter, in addition to the typical checks listed in the reviewers' guide, I've noted a couple of things to pay attention to in my editorial comments above.
Reviewers: @khondula and @gothub
Due date: 2019-01-04
Thanks @lmullen !
natural
and intuitive
and be a bit more precise:
This is a utility for transforming Ecological Metadata Language
(EML) files into JSON-LD and back into EML. Doing so creates a
list-based representation of EML in R, so that EML data can easily
be manipulated using standard R tools. This makes this package an
effective backend for other R-based tools working with EML. By
abstracting away the complexity of EML's XML Schema, developers can
build around native R list objects and not have to worry about satisfying
many of the additional constraints of set by the schema (such as element
ordering, which is handled automatically). Additionally, the JSON-LD
representation enables the use of developer-friendly JSON parsing and
serialization that may faciliate the use of EML in contexts outside of R,
as well as the informatics-friendly serializations such as RDF and
SPARQL queries.
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
). (see comments below)paper.md
matching JOSS's requirements with:The emld
package provides a handy set of functions that will facilitate manipulating and analyzing ecological metadata in R. The functions work as described, and the readme provides both a clear explanation of how emld
differs from packages like EML
, and ways to use the output of its functions in with SPARQL or JQ queries in rdflib
or jqr
respectively. Users of this package will likely be using it in combination within this ecosystem of related packages, however the translation functionality of this package should lower the barriers to entry for any user that is not familiar with any of EML, json, XML, or R lists.
The main issues I noticed were in the documentation, especially to correct the SPARQL query in the README/vignette so that a blank dataframe is not returned. Otherwise, I found the package easy to use and understand, and I look forward to seeing it published.
From review checklist:
as_emld()
.eml_locate_schema()
requires having the EML
package installed. Consider adding this package to Suggests
in package documentation. Suggestions for improving package documentation:
Suggestions for improving function documentation:
as_emld
, consider providing information about the return value of the function.template
, consider providing more options for the object argument may be helpful.template
, the Value text refers to usage in vignettes however I was unable to locate that.Suggestions for improving README:
as_json("hf205.json")
line after pipe in SPARQL queries section for readability<eml://ecoinformatics.org/eml-2.2.0/>
.maggritr
. Is this required?object_verb
format suggested in rOpenSci packaging guidelines, however it is consistent with other conversion-based functions. My only concern is that while the function names are not in conflict, they are similar to non-eml specific conversion functions jsonlite::toJSON
and xml2::as_xml_document
. The emld::as_xml
details clarify that the funciton is specific to EML and emld objects -- perhaps it would be useful to clarify this in the as_json
documentation as well? as_emld
, as_json
, and as_xml
to keep with rOpenSci packaging guidelines (1.6).?emld
in keeping with rOpenSci packaging guidelines (1.6).as_eml_document.R
, validate_units.R
, etc. internal functions that should have #' @noRd
as per rOpenSci packaging guidelines (1.6)?devtools::test()
. devtools::check()
I encountered the following notes:❯ checking package dependencies ... NOTE
Package suggested but not available for checking: ‘spelling’
❯ checking top-level files ... NOTE
Non-standard file/directory found at top level:
‘LICENSE.md’
goodpractice:gp()
similarly suggested that spelling
was not available, however this message disappeared after I installed the spelling
package. ✖ fix this R CMD check ERROR: Package suggested but not available: ‘spelling’ The suggested
packages are required for a complete check. Checking can be attempted without them by setting the
environment variable _R_CHECK_FORCE_SUGGESTS_ to a false value. See section ‘The DESCRIPTION file’ in
the ‘Writing R Extensions’ manual.
goodpractice::gp()
indicates that 88% of code has test coverage, and that there are a few dozen code lines longer than 80 characters. The output of covr::package_coverage()
indicates that all functions have greater than 75% test coverage. Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).paper.md
matching JOSS's requirements with:Estimated hours spent reviewing: 5-6 hours
The case made for the necessity of the emld
package in converting EML to JSON-LD is clear and compelling. Providing the ability to inspect EML derived data as S3 lists, JSON and via SPARQL queries will facilitate creation of new tools and workflows to automate access and interpretation.
In the Motivations
section of README.md
the purpose and method of "extending EML with other semantic vocabularies" isn't clear (to me). Consider adding a section such as "Extending EML" with an example, if this is within the scope of the package.
The help text for the as_xml
Description
contains only the text as_xml
and does not describe what the function does. Also, the return type is not specified. The same is true for both as_emld
and as_json
, template
.
There is no documentation index is available for the package.
There is no documentation for ?emld'
or ?emld-package
.
This may be outside the scope of the package, but no functions, strategies or examples are presented in the documentation for simple editing of EML data, i.e. (reading, simple edit, write out). This would be very useful, but if this functionality presents to much overlap with the EML
package, pleas disregard this comment.
The as_json
, as_emld
, as_xml
functions have a clear purpose and work as expected.
In addition to testing using the provided code samples, the following checks for a complex EML document were performed for a couple of EML documents such as https://goa.nceas.ucsb.edu/#view/urn:uuid:3249ada0-afe3-4dd6-875e-0f7928a4c171:
as_json
produced valid JSON-LDx <- as_emld('metadata.xml') # using
as_xml(x, "new.xml")
The following potentially unresolved items exist, as show from a quick scan of the source code:
$ grep FIXME *
as_emld.R: ## FIXME technically this assumes only our context
as_xml.R:## FIXME drop NAs too?
eml_validate.R: ## FIXME shouldn't have to write to tempfile,
eml_validate.R: # FIXME technically must be unique inside parent system only...
emld-methods.R:## FIXME: Print method should drop context
devtools::check() reports non-standard file LICENSE.md
Thank you, @khondula and @gothub for your thorough reviews.
@cboettig: Please let us know if you have any questions about the reviews. Will you be able to complete the necessary changes in two weeks time, i.e., by January 25?
@khondula @gothub @lmullen Thanks for these very helpful and detailed reviews! I believe I have addressed all of these issues now. I've made the edits in a separate branch and created a PR, https://github.com/cboettig/emld/pull/30 which I will leave un-merged for now, hopefully that provides an easy way to see the differences. I've also summarized my replies to each of your point-by-point reviews (with a few clarifying questions and comments) in the two issues linked in that PR. Anchors aweigh :ship: ⚓️!
Thanks for making these changes, @cboettig.
@khondula and @gothub: Could you please review these changes and make any further comments within one week, by Feb 2? If the package is now satisfactory, you can check of any remaining relevant boxes in your main review.
@lmullen @cboettig I had a look at https://github.com/cboettig/emld/issues/29 and it looks great!
This package is a really useful contribution, thx!
@cboettig re: the template
function - I think if I were more familiar with the EML package I would agree that it might fit there better, but it seems like a nice function to have since it might help a user get more familiar with eml if they are starting from scratch. The fxn documentation seems to still refer to it being used in the readme though ("see vignettes"), which might be confusing( unless I am missing a different vignette).
otherwise everything looks great to me!
@khondula @gothub Great! Thanks to you both for the constructive reviews. I've gone ahead and merged my own PR now.
@khondula
I rebuilt the vignette, should have no mention of the template()
function there now and I didn't see anything in searching for template
in any of the main user facing functions. It's still in the package as a bit of a vestigial organ for the time being, may either be flushed our or removed in favor of functions from EML
package.
I think that's everything on my end for the moment; if you're happy with the little paper.md
for the JOSS side? Over to you, @lmullen
Thank you @gothub and @khondula.
@cboettig We can consider this accepted now. Here are some post-review things for you to do.
@cboettig Some other things pertaining to the JOSS submission.
Once you've submitted the paper, please let me know the URL to the submission and I will confirm that it has been peer reviewed by rOpenSci.
Thanks @lmullen ! I've now completed the transfer to rOpenSci, and created the submission to JOSS at http://joss.theoj.org/papers/7e9064114c9a8d6756ef500cb5bc5d2b
@cboettig Thanks, Carl. Looks like everything is wrapped up. Glad to see everything went through JOSS just fine.
@stefaniebutland This new package/paper might be a good candidate for a blog post at rOpenSci.
Yes indeed @lmullen. Carl knows he has a wide open invitation to blog about his packages :-)
Summary
Transform Ecological Metadata Language (EML) files into JSON-LD and back into EML. Doing so creates a natural list-based representation of EML in R, so that EML data can easily be manipulated using standard R tools. This makes this package an effective backend for other tools for working with EML. Additioally, the JSON-LD representation enables the use of developer-friendly JSON parsing and informatics-friendly serializations such as RDF and SPARQL queries. This package is targeted primarily at developers. By abstracting away the complexity of EML's XML Schema, developers can build against intuitive list or JSON objects and not have to worry about satisfying many of the additional constraints of set by the schema (such as element ordering, which is handled automatically).
In practice,
emld
will basically be the backend machinery for a new release of theEML
package, which will retain largely the same user-facing API, with several improvements. But both in terms of ease of reviewing and ease of package maintenance, I think it's important to have these as separate packages.URL for the package (the development repository, not a stylized html page): https://github.com/cboettig/emld
Please indicate which category or categories from our package fit policies this package falls under *and why(? (e.g., data retrieval, reproducibility. If you are unsure, we suggest you make a pre-submission inquiry.):
data publication, data retrieval, and reproducibility.
Researchers and ecological informatics teams (at NEON, LTER sites, NCEAS, etc) who are consuming or managing EML and data described by EML.
This overlaps with the
EML
package, and will soon be the backend to the newer version of said package. This package works very differently than the current CRAN version of EML: instead of a massive S4 class system, this uses a single S3 class by exploiting the magic of JSON-LD.Requirements
Confirm each of the following by checking the box. This package:
Publication options
paper.md
matching JOSS's requirements with a high-level description in the package root or ininst/
.Detail
[x] Does
R CMD check
(ordevtools::check()
) succeed? Paste and describe any errors or warnings:[x] Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:
If this is a resubmission following rejection, please explain the change in circumstances:
If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names: