flux-framework / flux-core

core services for the Flux resource management framework
GNU Lesser General Public License v3.0

need support for custom job output format #1693

Closed: garlick closed this issue 4 years ago

garlick commented 6 years ago

The tools that list active jobs should allow the output for each job to be customized, like

I like the dpkg-query design, where ${attribute-name[;field-width]} expressions are substituted.

Perhaps this support could be built as a reusable module with interface similar to:

/* Look up 'attr' and return its value.
 * Return NULL if value is unavailable.
 */
typedef const char * (*format_lookup_f)(const char *attr, void *arg);

/* Build a \0 terminated line of output in 'buf' based on 'format' string.
 * For each ${attr} macro encountered, call 'cb' to obtain value.
 * 'arg' is passed to 'cb' (presumably a dictionary of some sort).
 * Return number of bytes copied on success, -1 on failure with errno set
 */
int format_string (char *buf, int size, const char *format, format_lookup_f cb, void *arg);
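
For comparison, the same idea can be sketched in a few lines of Python. This is only an illustration of the `${attribute-name[;field-width]}` substitution described above, not an existing Flux API: the Python `format_string` and its `lookup` callback simply mirror the C prototype, and treating a negative width as left alignment follows dpkg-query's documented behavior.

```python
import re

# Matches ${name} or ${name;width}. A negative width means left alignment,
# as in dpkg-query; a positive width right-aligns.
_MACRO = re.compile(r"\$\{([^};]+)(?:;(-?\d+))?\}")


def format_string(fmt, lookup, arg):
    """Expand each ${attr[;width]} macro in fmt using lookup(attr, arg)."""

    def expand(match):
        attr, width = match.group(1), match.group(2)
        value = lookup(attr, arg)
        if value is None:  # mirrors the NULL return in the C sketch
            value = ""
        if width is None:
            return value
        width = int(width)
        # Pad only; values longer than the width are left untouched.
        return value.ljust(-width) if width < 0 else value.rjust(width)

    return _MACRO.sub(expand, fmt)


# Made-up job record, used here as the 'arg' dictionary.
job = {"id": "1234", "user": "stephen"}
line = format_string("${id;6} ${user;-10}|${missing}", lambda a, d: d.get(a), job)
print(line)
```

An attribute the lookup cannot resolve expands to an empty string here; returning an error instead would be an equally reasonable choice.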

grondo commented 6 years ago

We could perhaps look into some of the templating tools that might be able to take json directly and arbitrarily format the fields, instead of coming up with our own new api.

Also, if the tools that list active jobs are written in a higher-level language, it might be easier to just handle the formatting in those languages (e.g. python), rather than making python binding authors write to a less capable C api.

The key for this issue might be the underlying interface that allows a query for job information to be sent to the job manager and have the job manager return arbitrary attributes (perhaps even attributes that core doesn't know about, i.e. those provided by scheduler?)

garlick commented 6 years ago

Good insights!

I wasn't proposing this as a public API, more like a libutil helper for a C flux job list subcommand (rather than hardwiring something there). I agree this is a lot easier in a high level language though, so maybe not worth a big effort if the queue listing program ends up being in python or lua.

> The key for this issue might be the underlying interface that allows a query for job information to be sent to the job manager and have the job manager return arbitrary attributes (perhaps even attributes that core doesn't know about, i.e. those provided by scheduler?)

Agreed! The protocol I was going to propose for the job manager was to send a list of attributes in a request and get back an array of dictionaries, where there is an array entry for each job, and the dictionaries contain values for the requested attributes. We would probably want to put some API support in front of that, but we could leave the JSON encoding/decoding to the caller.
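
A minimal sketch of that request/response shape, assuming a JSON payload with a hypothetical `attrs` key (the real job manager protocol is not defined in this thread). `handle_request` only stands in for the job manager, and the job records are made up for illustration.

```python
import json


def build_request(attrs):
    """Encode the list of attribute names the caller wants for each job."""
    return json.dumps({"attrs": list(attrs)})


def handle_request(payload, jobs):
    """Stand-in for the job manager: one dictionary per job, holding only
    the requested attributes that the job actually has."""
    attrs = json.loads(payload)["attrs"]
    return json.dumps([{a: job[a] for a in attrs if a in job} for job in jobs])


# Made-up job records, for illustration only.
JOBS = [
    {"id": 1, "user": "stephen", "priority": 1000000, "state": "run"},
    {"id": 2, "user": "bob", "priority": 100000, "state": "pending"},
]

reply = json.loads(handle_request(build_request(["id", "user"]), JOBS))
for entry in reply:
    print(entry)
```

Because the server side never interprets the attribute names beyond a dictionary lookup, the same shape would also work for attributes that core does not know about.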

I like the idea of arbitrary attributes - maybe we should just directly map these attributes to keys in the KVS so that neither the tool nor the job manager needs to know what it's fetching.

grondo commented 6 years ago

Oh yeah, sorry. Reading back my comment I didn't mean it to sound so dismissive.

And, looking at the existing template definition libraries (e.g. mustache), they are really targeted to simple forms and don't allow format specifiers, so I'm not sure there is an existing templating system to do this for us.

SteVwonder commented 6 years ago

Strawman example based on python's formatting syntax that would allow for setting the fill, alignment and width on strings/ints and the precision on floats. I also found out in the python course that python classes can implement a __format__ method to override the default format behavior. The datetime class, for example, implements this __format__ method to accept the same format string as strftime and strptime.

flux queue -o "{id:>15} | {user:<10} | {priority:>10,d} | {submit_time:%Y/%m/%d-%H:%M}" could output:

             id | user      |  priority |      submit_time
----------------------------------------------------------
 1234-4567-2345 | stephen   | 1,000,000 | 2018/09/20-10:15
 6798-3874-1342 | bob       |   100,000 | 2018/10/01-14:30
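
As a rough check that the format string above is valid Python formatting syntax, the sketch below applies it to one made-up job record. The `JobState` class is hypothetical and only illustrates how a `__format__` method could give an attribute its own format spec; exact column padding depends on the widths given in the format string.

```python
from datetime import datetime

FMT = "{id:>15} | {user:<10} | {priority:>10,d} | {submit_time:%Y/%m/%d-%H:%M}"

# Made-up job record; datetime handles the strftime-style spec itself.
job = {
    "id": "1234-4567-2345",
    "user": "stephen",
    "priority": 1000000,
    "submit_time": datetime(2018, 9, 20, 10, 15),
}

row = FMT.format(**job)
print(row)


class JobState:
    """Hypothetical attribute type with its own format spec."""

    def __init__(self, name):
        self.name = name

    def __format__(self, spec):
        # "short" picks a one-letter code; anything else is treated as an
        # ordinary string format spec (fill, alignment, width).
        if spec == "short":
            return self.name[0].upper()
        return format(self.name, spec)


state_demo = "{0:short} {0:>8}".format(JobState("running"))
print(state_demo)
```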

The other nice part about using this syntax is that Python exposes a parser for it, which would allow us to easily iterate over the attributes specified by the user and then request only those attributes.
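
The parser in question is available in the standard library as `string.Formatter().parse`. A small sketch of how the requested attributes could be pulled out of a user-supplied format string (the helper name `requested_attrs` and the job record are made up for illustration):

```python
from string import Formatter


def requested_attrs(fmt):
    """Return the field names used in fmt, in order, without duplicates."""
    names = []
    # parse() yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literal text.
    for _literal, field, _spec, _conv in Formatter().parse(fmt):
        if field and field not in names:
            names.append(field)
    return names


fmt = "{id:>15} | {user:<10} | {priority:>10,d}"
attrs = requested_attrs(fmt)
print(attrs)

# Only the attributes named in the format string need to be fetched.
job = {"id": "1234-4567-2345", "user": "stephen", "priority": 1000000, "state": "run"}
print(fmt.format(**{name: job[name] for name in attrs}))
```

Here `state` is never looked up, since the format string does not mention it.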

garlick commented 6 years ago

That looks pretty nice!

I wonder if we should relegate flux job to a test directory and go with a python flux job? In addition to this nice formatting support, python also seems to have advantages in parsing/generating YAML jobspec.

Plus, if we work on a nice Python API for submitting, listing, and monitoring jobs, it would be nice if the installed flux-job served as a ready example for people who want to do those things programmatically (e.g. for workflows)...

SteVwonder commented 6 years ago

> In addition to this nice formatting support, python also seems to have advantages in parsing/generating YAML jobspec.

I am always for more Python :smile:

I agree that Python is a really good choice for the front-end CLI tools. In addition to the benefits you listed, there are lots of nice libraries for almost everything you might want to do, and the python packaging ecosystem makes dependencies relatively painless. Plus the performance requirements at the CLI-level should be minimal enough that Python won't be a bottleneck.

dongahn commented 6 years ago

I agree -- BTW, my recent experience with the resource testing front-end command was that dealing with YAML jobspec was just a matter of a few methods from the yaml python module.