vsoch / gridtest

grid parameters and testing for Python modules and functions
https://vsoch.github.io/gridtest/
Mozilla Public License 2.0
Clarification request: Who is this software for? #28

Closed khinsen closed 4 years ago

khinsen commented 4 years ago

Background: I am looking at gridtest as a reviewer for JOSS (see here). A question I have been asking myself in this context is: what is the developer or user profile this package is aimed at? I haven't found the answer yet, and the section "Who is this software for?" in the README doesn't answer it either.

gridtest seems to be

  1. a lightweight testing framework for Python developers, with a focus on parameter scans
  2. a command-line tool to generate input parameter scans as JSON files, for use in testing or evaluation software that is not necessarily written in Python
  3. a Python library for generating grids of input parameters

The documentation is clearly written for Python programmers, which seems to exclude scenario 2, although other parts of the documentation refer to it. But the main problem I see is that the documentation mixes the three scenarios to the point that for most potential users, the first impression is "not sure this is really for me". And for me as a reviewer, that means I don't really know into whose shoes I am supposed to put myself.

vsoch commented 4 years ago

It’s a very generic library, so the user group can range from anyone who wants to package reproducible grids with their software (a research scientist) to someone who wants to use gridtest for testing (an RSE, or research software engineer)!

khinsen commented 4 years ago

Suggestion: remove the section "Who is this software for?" from the README, if the only possible answer is "it's for the people who have a use for this software".

vsoch commented 4 years ago

I think it's good to state clearly that it's not foremost a testing library, and I think the section is useful to have upfront. Could we perhaps work together to come up with a more specific paragraph than "it's for the people who have a use for this software"? Here is a starting suggestion:

Gridtest is intended for defining and saving grids for any need that you might have, or even for quickly generating tests to run. It is not intended to be a robust testing library like pytest or even unittest, but rather a quick way to generate reproducible grids for scientific analyses.

khinsen commented 4 years ago

I'd be happy to help with this, but I really have no idea of which user profile(s) gridtest targets. That's why I opened this issue!

Consider the "testing" use case. Most Python developers would go for the built-in unittest, and then perhaps complement it with pytest or nose. If they consider these frameworks too heavy, they'd write testing scripts of their own. So why should they use gridtest instead? It has some advantages, such as tests-as-yaml and, of course, support for grids. That has to be weighed against the cost of "one more dependency" (compared to home-grown scripts) or of "different from mainstream" (compared to unittest and friends). What are the use cases where gridtest represents the sweet spot? That's what needs to be in the README, because few people will read through the documentation to discover potential advantages of gridtest themselves.

BTW, speaking of grids, they are mentioned as a main selling point, but no detail is given. When I first read "grid", I thought of numpy.arange(...) along multiple dimensions. I had to delve deep into the documentation to discover that gridtest's concept of a grid is more flexible, and I still don't know how flexible it really is, even after having read most of the documentation.
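To make the distinction concrete, here is a minimal sketch in plain Python (standard library only; this is not gridtest's own API) of a grid in the more general sense: the Cartesian product of arbitrary per-parameter value lists, rather than evenly spaced numbers from numpy.arange. The names `expand_grid`, `learning_rate`, and `optimizer` are hypothetical and chosen only for illustration.

```python
from itertools import product


def expand_grid(params):
    """Return one dict per combination of the listed parameter values."""
    names = list(params)
    value_lists = (params[name] for name in names)
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Hypothetical parameters: values need not be numeric or evenly spaced.
grid = expand_grid({
    "learning_rate": [0.1, 0.01],
    "optimizer": ["sgd", "adam"],
})

for point in grid:
    print(point)
```

Each element of `grid` is one set of keyword arguments, so a grid built this way could also be serialized (for example to JSON) and consumed by software that is not written in Python.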

vsoch commented 4 years ago

I think perhaps a better solution would be to remove the extra docs from the README and direct the user to the rendered pages? I kept the README because I thought it was helpful, but it seems to have made for a negative experience for you. What do you think?

vsoch commented 4 years ago

Or maybe we could do a brief list of links that take people to the meaty sections of rendered documentation that are most helpful?

vsoch commented 4 years ago

Personally, I wouldn't jump to use it in a robust testing sense for a complex software library - I also really like pytest for that. If I was still a research scientist and I had a small script I might use it for testing, however, because it would help me to quickly generate some basic tests. Generally though I'm attracted to the reproducibility side of things, so I think of gridtest as a generator tool that would let me package reproducible grids that might also allow for measurement of metrics. For example, I might create a set of grids to be used to generate inputs for machine learning, and then anyone can edit a particular function that trains / tests a model, and then gridtest can collect metrics like time to train, sensitivity and specificity, etc. I can see having repos of just grids for people to grab and plug their functions into different generator contexts. Yes, there are definitely other ways to do this, as there are many ways to skin a cat, but this is a different approach that starts to get people thinking about separating the parameter grids or metrics collection from some primary set of functions.
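As a rough illustration of that separation, here is a sketch in plain Python (standard library only; it does not use gridtest's API) in which the parameter grid, the metric collection, and the function under study are three independent pieces. The names `run_over_grid` and `train`, and the parameters `epochs` and `scale`, are hypothetical stand-ins.

```python
import time
from itertools import product


def run_over_grid(func, params):
    """Call func once per parameter combination; record result and elapsed time."""
    names = list(params)
    records = []
    for combo in product(*(params[name] for name in names)):
        kwargs = dict(zip(names, combo))
        start = time.perf_counter()
        result = func(**kwargs)
        elapsed = time.perf_counter() - start
        records.append({"args": kwargs, "result": result, "seconds": elapsed})
    return records


def train(epochs, scale):
    # Hypothetical stand-in for a user-supplied train/test function.
    return sum(i * scale for i in range(epochs))


# The grid is plain data, so it can be shared and reused with other functions.
records = run_over_grid(train, {"epochs": [10, 100], "scale": [1, 2]})

for record in records:
    print(record["args"], record["result"])
```

Swapping in a different function leaves the grid and the timing logic untouched, which is the point of keeping them apart.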

I'm not claiming this to be earth-shattering software, but I think there is a place for it in the research software universe, even if it is a much smaller niche than that of some other Python libraries. As a developer, the generality of the tool is exactly what excites me, because I want to wait and see what creative use cases people come up with! That includes myself - I do so much with testing and yaml and definitely see this being useful down the line.

khinsen commented 4 years ago

@vsoch Lots of good ideas there! A README is always good to have, especially on GitHub where it figures so prominently. What I expect to find there is 1) a description of the project that lets me quickly see whether it's worth (for me!) taking a closer look and 2) information on getting started, either in-line or as pointers to the main documentation. My first impression from the current README is "this is a piece of software that does many exciting things, but you have to figure them out for yourself".

I'd summarize your last comment as two use case scenarios:

I hope I got that right - but it certainly is the kind of information I was looking for!

vsoch commented 4 years ago

Gotcha! I'll put in a PR for your review to fix this up!

vsoch commented 4 years ago

All set - I've updated the README and have a PR for your review: https://github.com/vsoch/gridtest/pull/30

vsoch commented 4 years ago

Closed with #30