The current user interface, while better than anything else I've found, is still way too complex.
What happens now
A user provides a query string and chooses an API to search. The format of the query string differs between the APIs, which means users have to read a different set of documentation for each API they want to query (although we do try to help by linking to those docs and providing some commonly used examples). If a user re-runs a previous command, or runs a command that returns some results they already have, the old results are overwritten and downloaded again.
What should happen
The user can provide one or more queries to match any particular metadata field, multiple fields, or all fields.
They can apply the search to any of the APIs, or any combination of them, without having to change the query syntax.
Behind the scenes we standardise the interface across the different sources.
If the user requests an action that isn't available at one of the sources, we print a prominent, helpful warning before proceeding with the analysis.
We should globally (i.e. for each user) store an index of documents already retrieved. Any result that is already present on the machine should be taken from the local copy via this index rather than downloaded again.
There should be a way to override the behaviour described in the previous item (i.e. to force redownloading) so that users can get updated items.
A query should not overwrite data in the output directory unless the user explicitly asks for that behaviour.
Users should be able to iteratively expand or refine their results.
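To make the points above more concrete, here is a minimal sketch of two of them: one standard query translated for each source (with a warning when a source can't search a field), and a per-user index that skips documents already on disk unless redownloading is forced. Everything named here is an assumption for illustration only: the source names (`source_a`, `source_b`), the field mappings, the `field:"value"` syntax joined with `AND`, and the JSON index file are all hypothetical and don't describe any real API or the current code.

```python
import json
import warnings
from dataclasses import dataclass
from pathlib import Path

# Hypothetical per-source field names. None marks a field the source cannot search.
FIELD_MAP = {
    "source_a": {"title": "TITLE", "author": "AUTH", "abstract": "ABSTRACT"},
    "source_b": {"title": "ti", "author": "au", "abstract": None},
}


@dataclass(frozen=True)
class Term:
    field: str  # a standard field name, or "all" to match every field
    value: str


def translate(terms, source):
    """Render standard terms in one source's (assumed) query syntax."""
    mapping = FIELD_MAP[source]
    parts = []
    for term in terms:
        if term.field == "all":
            parts.append(f'"{term.value}"')
            continue
        native = mapping.get(term.field)
        if native is None:
            # Warn prominently, then carry on with the terms that are supported.
            warnings.warn(
                f"{source} cannot search field '{term.field}'; "
                "this term is skipped for that source"
            )
            continue
        parts.append(f'{native}:"{term.value}"')
    return " AND ".join(parts)


class DocumentIndex:
    """Per-user record of documents already on disk, kept in a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            self.entries = json.loads(self.path.read_text())
        else:
            self.entries = {}

    def needs_download(self, doc_id, force=False):
        # force=True is the override that allows fetching updated items.
        return force or doc_id not in self.entries

    def record(self, doc_id, location):
        self.entries[doc_id] = str(location)
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(self.entries, indent=2))
```

With these assumed mappings, `translate([Term("title", "malaria"), Term("abstract", "vaccine")], "source_a")` gives `TITLE:"malaria" AND ABSTRACT:"vaccine"`, while the same call for `source_b` warns about the abstract field and gives `ti:"malaria"`. Keeping the index in one per-user file, rather than in each output directory, is what would let separate queries share downloads.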
Personas / routes
(tbc...)