pr-omethe-us / pyked-papers

Various papers published about PyKED
Creative Commons Attribution 4.0 International
0 stars 0 forks source link

IJCK reviewer 2 comments #3

Closed kyleniemeyer closed 6 years ago

kyleniemeyer commented 6 years ago

My main concern with the creation of a publically available database, formatted with ChemKED, is that it seems that the creation of the files has to be done manually, i.e. susceptible for errors. This is especially true if the experiment comprises a big set of datapoints and/or hundreds of measured species. Furthermore, the YAML data serialization format is indeed human-readable, but it does not provide an easy summary/overview of the experimental data, such as an excel or CSV file would give, and therefore, people might be not willing to put the extra effort.

Abstract and introduction

In my opinion the YAML format resembles the XML format strongly. XML format is human-readable, however, because of the html-tags and ‘<>’ characters, the YAML format might be easier to read. If XML would lack human readability as it is stated in the article, it would not be such a popular format, and I don’t think PrIMe and ReSpecTh would be using it.

Overview of ChemKED format

PyKED architecture

Usage examples

RCM modelling with varying reactor volume

Shock tube modelling with constant volume

  1. Python code snippet on page 13:
    • [x] a. Add more commentary text.
    • [x] b. Changing the example to enable multithreading might make it more appealing to start using ChemKED as a data format in addition with Cantera as a simulation tool.

Conclusions and future work

kyleniemeyer commented 6 years ago

OK, there's a lot here... helpfully, they numbered the comments for us!

Beginning a few thoughts:

  1. Any ideas for a figure? I agree that it would be nice to have one, though we do have a lot of code snippets and such. Maybe we could add plotting of calculated ignition delays to the example at the end, and actually show such a figure? Or, we could include some sort of workflow diagram? The first half of the paper is probably what needs one the most, just to break up text, but I don't have any good ideas.

  2. We do have the converters, but we don't (yet) have anything that helps create files from scratch... perhaps we should? The GUI work can be ongoing, but just something that prompts for entries or something (with the ability to specify a CSV file for volume history, e.g.).

  3. I can add mention of the future plans to the Conclusions section, and we should also add that to the roadmap here... wow this reviewer is really looking into everything!

  4. Both reviewers seem to be taking a very literal definition of "human-readable", in that XML is written in ASCII rather than binary... however this one does agree that YAML is more readable, which of course is our point. I think we can emphasize that we mean understandable rather than technically readable, and that in relative terms YAML is better here.

  5. I agree with the reviewer here—reference shouldn't be required. We already started discussing some related topics in https://github.com/pr-omethe-us/PyKED/issues/69 and privately; my suggestion is that we drop it as a required field, but validate it as we do now if given. I've started an issue on that here: https://github.com/pr-omethe-us/PyKED/issues/76

  6. Do we need both atomic-composition and elemental-composition, or could we just pick one?

  7. Regarding two composition blocks inside common-properties, I think this would be possible, as long as field are given different YAML tags. Perhaps we should add a test for this.

  8. While I agree that submitted PRs on GitHub is less efficient than just uploading files willy-nilly, I think that there still needs to be some sort of human curation involved. However, one major benefit is that we have a tool that checks for basic errors, and also the submission happens in the open.

  9. We've already talked about the less-than-useful error messages that Cerberus gives; I'm not sure if there's anything we can do here other than address the issue and say we plan to improve these in the future.

  10. We already somewhat-address backwards compatibility by putting ChemKED versions in the files, but don't actually do anything with this yet... right? Also, the older versions of PyKED are all archived and still available.

  11. I agree that we should be consistent about using hyphen, underscore, or nothing in fields with multiple words. I like hyphens.

  12. I'm not sure that we gain any readability by combining volume_history and the compressed conditions, other than grouping RCM-specific stuff. Any thoughts?

  13. Online lookup could slow things down a bit, if that's important—perhaps we should mention the "hidden" disable_validation feature we added, when performance is important? Also, we can mention that our validation steps that require online lookup fail gracefully with no internet.

  14. I think there are some valid use cases for creating another file, for example when you have manipulated a ChemKED object and want to write it. However, what do think about this idea of printing summary statistics?

23--26 I should be able to fix.

  1. (b) I agree we could improve the example here, or point to the additional examples in the documentation.
kyleniemeyer commented 6 years ago

Ohh, I think for

  1. I think it would be more clear if volume_history, compression_time, compressed_temperature, and compressed_pressure would be combined. The line: ‘if the ChemKED file encodes an RCM experiment’ is unnecessarily repeated.

the reviewer meant just in the text, since those are all conditional on the file being for an RCM experiment—I don't think it's a suggestion on combining in ChemKED itself.

Maybe we can separate the RCM-specific stuff into a separate list, with some explanatory text? Otherwise we could lump into a single bullet, but I don't think that improves clarity.

kyleniemeyer commented 6 years ago

Reviewer #2 done!