DBMS-Benchmarker is a Python-based application-level blackbox benchmark tool for Database Management Systems (DBMS). It connects to a given list of DBMS (via JDBC) and runs a given list of parametrized and randomized (SQL) benchmark queries. Evaluations are available via a Python interface and on an interactive multi-dimensional dashboard.
Hi, I'm Erik Whiting, I'm one of the reviewers for your submission to the Journal of Open Source Software (JOSS). The review is being coordinated here
This issue is to note my review of the paper portion of your submission. For your convenience, I have copied and pasted the criteria set forth by the JOSS editors and my notes on why I think your paper does or does not meet the requirements. Please note, this is for the paper review portion only (there are also "General Checks," "Functionality," and "Documentation" reviews, for which I will open separate issues).
The quoted parts are the criteria set forth by JOSS and the text under is why I think your paper does or does not meet the criteria at this time.
Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
Yes, the summary is clear enough that a non-specialist could read it and have at least a good idea about what you're trying to do with this software.
A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
While you do state what the software is designed to solve, I don’t see mention of the target audience here or anywhere else really. Can you elaborate on what kind of person uses this software (e.g., developers, DBAs, QA engineers)?
There are two other things about this section of the paper I think need addressing:
You said “we want to use Python as the common data science language” but who is “we”? Does “we” mean the authors here? If so, then is the “target audience” people like the authors? Or does “we” mean everyone involved in DBMS benchmarking research? If so, I’d disagree that everyone wants to use Python as the common data science language. It might be a majority of people but I’m sure there are people out there who would rather use R or SAS as the “common data science language.” Can you either specify who you mean by “we” here or provide a citation that indicates a general call to use Python as the common data science language?
I feel like the statement “there is a need for a tool to support the repetition and reproducibility of benchmarking situations” (line 26) needs a citation. I don’t disagree with you, but I don’t think we should just say “there’s a need for this” without providing some examples or proof of such a need.
State of the field: Do the authors describe how this software compares to other commonly-used packages?
This part is stated as clearly as it can be. I also concur: I don't know of any DBMS benchmarking tool that facilitates reproducible statistical analysis.
Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
Personally, I don't believe in critiquing grammar when conducting a review, so I won't. However, I want to ask if you can make the paragraph starting at line 26 a little clearer. I think the way the citations are formatted is making it hard to understand exactly what you're saying.
References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?
Citations look good to me but I did ask a question here. I might be misunderstanding the meaning but the usage of "cf" is new to me.
Hi @erik-whiting, thank you for your valuable hints!
I just revised the "Statement of need" section and incorporated your suggestions for improvement.
I will push a new version.