kernelci / kernelci-api

KernelCI API - Database - Pub/Sub
GNU Lesser General Public License v2.1
10 stars 16 forks source link

User-friendly operators #356

Open gctucker opened 10 months ago

gctucker commented 10 months ago

The current syntax for operators on the command line is the same as in the API URLs, with a __ separator between the attribute name and an operator. For example, to count the number of kernel revisions (checkout nodes) since the beginning of the Early Access phase:

$ ./kci node count created__gt=2023-09-04 name=checkout
38

This works, however there are a few things that could be improved:

Combining operators doesn't work

This should return the number of nodes for the 1-day period, but instead it only takes into account the 2nd query parameter:

$ kci node count created__gt=2023-09-04 created__lt=2023-09-05 name=checkout
166

Deal with attributes that have __ in their name

Some test suites might well have __ in their name or particular jobs might generate custom data including it in their attributes. While we could make it invalid and the API would be rejecting such data, we could also look for other ways to encode the operators in the URLs. At the very least, we could just make it invalid to name attributes that end with an operator name. For example, foo_bar would still be valid, and even foo__bar__baz, but not foo_gte or foo_bar_lt as gte and lt as known operators.

Consider additional operators

The current 4 operators were added out of necessity for the initial use-cases we had. A fully-fledged system would need to be able to deal with a broader range of use cases and as such more operators are likely to be required. See the list of MongoDB operators as a reference, they could all be implemented very easily since internally the API uses MongoDB (at least the comparison ones).

Probably we could already just add the ne (not equal) operator to the list, in fact it's quite surprising we got this far without needing it.

Command-line syntax

While some improvements can be made on the API side to provide features and define any constraints with attribute names, the CLI syntax offered by kci could also be made simpler. Right now, it's quite easy to forget one underscore and doing created_lt=2024 will not match anything as that's looking for an attribute called created_lt.

If we wanted to keep it as-is and just pass the search parameters to the URL, we could have a check in kci if the attribute name ends with a known operator and print a warning to the user.

On top of this, we could also provide some symbols such as <, <=, == etc. as equivalents to the operator acronyms. A sample command line would then look like this:

$ kci node count "created > 2023-09-04" "created <= 2023-09-05" name=checkout

Please note that this would involve lots of characters that aren't very usable on the command line: < and > are used for redirecting streams, ! is a magic character in most shells to do things like running a previous command and = is used by parsers for arguments that take a value. So while making the queries more human-readable on the CLI is a good idea, it doesn't seem obviously better to just use these symbols. One option would be to create an interactive parser e.g. kci shell and then any characters could be used without interfering the main OS shell.

nuclearcat commented 10 months ago

I sent basic PR that will address at least undocumented features and make them more easy to document. Also, i might follow up later with PR to add ne operator, as it might be useful for example to search nodes that have owner not "admin", for example. And to handle remaining issues, i think we will need serious rework of query parameters on both cli and api side.