Open mtwest2718 opened 1 year ago
Yes of course! We would love to have that contribution. The table was added recently but is a bit old (it was previously in a PDF that we dug up) and it's likely just an oversight that it's not there.
Also, is there a bit more detail on each of the categories, so I could suggest how to fill in each?
Let us know which categories you would like clarification on and we can do our best! And for some that are a bit opaque we can definitely add a note to that page.
I am just starting with the multi-user mode piece by piece and TBH, I am having a hard time parsing any of the terms. I can guess what you mean but also have suspicions that the meanings may be very specific.
If you have specific questions please post them here and we would be happy to clarify any points.
Pinging @grondo and @garlick but I'll do my best to give these a first shot.
Multi-user workload management
Multi-user vs. single-user is exactly what it sounds like: akin to Nix, Flux can be installed to serve an entire cluster of users, OR it can be run and controlled by one user in, say, a Docker container. On a multi-user instance, a single user can also spin up a Flux instance that they own entirely. HTCondor is definitely multi-user, and I am not sure about single-user.
Full hierarchical resource management
This means the scheduler understands its resources as a graph from the top level node down to a core or socket - this is a no for HTCondor.
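To make the "graph from the top down to a core" idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration: the `Resource` class, the `build_cluster` helper, and the cluster/node/socket/core layout are hypothetical names I made up, not Fluxion's actual data model or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """One vertex in a hypothetical resource hierarchy (not Fluxion's data model)."""
    kind: str
    name: str
    children: List["Resource"] = field(default_factory=list)

    def count(self, kind: str) -> int:
        """Count vertices of the given kind at or below this vertex."""
        own = 1 if self.kind == kind else 0
        return own + sum(child.count(kind) for child in self.children)


def build_cluster(nodes: int, sockets: int, cores: int) -> Resource:
    """Build cluster -> node -> socket -> core with a uniform fan-out (assumed layout)."""
    cluster = Resource("cluster", "cluster0")
    for n in range(nodes):
        node = Resource("node", f"node{n}")
        for s in range(sockets):
            socket = Resource("socket", f"node{n}.socket{s}")
            socket.children = [
                Resource("core", f"node{n}.socket{s}.core{c}") for c in range(cores)
            ]
            node.children.append(socket)
        cluster.children.append(node)
    return cluster


cluster = build_cluster(nodes=2, sockets=2, cores=4)
print(cluster.count("core"))  # 2 nodes * 2 sockets * 4 cores = 16
```

The point is just that a request like "4 cores on one socket" can be answered by walking the tree, instead of treating a node as a flat bag of slots.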
Graph-based advanced resource management
It's more than workflow parent-child dependencies. This video gives a good visual: https://youtu.be/YIwt51dyXOE, and the flux-sched repository has more detail: https://github.com/flux-framework/flux-sched. This is probably a no for HTCondor but others can chime in.
Scheduling specialization
I'm not totally sure on this one - I'll ask my colleagues! But I think this generally means you can customize the policies and the algorithm, e.g.:
sched-fluxion-qmanager, which manages one or more prioritized job queues with configurable queuing policies (fcfs, easy, conservative, or hybrid). sched-fluxion-resource, which matches resource requests to available resources using Fluxion's graph-based matching algorithm.
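As a rough illustration of why a configurable queuing policy matters, here is a toy sketch in Python. Big caveat: this is not Fluxion code, and the `schedule` function and its `"first-fit"` policy are hypothetical names for this example. In particular, `"first-fit"` is a naive skip-ahead with no reservation for the blocked job, so it is not the easy, conservative, or hybrid backfill mentioned above.

```python
from typing import List, Tuple


def schedule(queue: List[Tuple[str, int]], free_cores: int, policy: str = "fcfs") -> List[str]:
    """Return the names of jobs that can start right now.

    queue is a list of (job_name, cores_needed) in priority order.
    "fcfs": stop at the first job that does not fit.
    "first-fit": skip jobs that do not fit and keep scanning (toy policy,
    no reservation for the skipped job).
    """
    if policy not in ("fcfs", "first-fit"):
        raise ValueError(f"unknown policy: {policy}")
    started = []
    for name, cores in queue:
        if cores <= free_cores:
            started.append(name)
            free_cores -= cores
        elif policy == "fcfs":
            break  # strict ordering: nothing may jump ahead of a blocked job
    return started


queue = [("a", 4), ("b", 8), ("c", 2)]
print(schedule(queue, free_cores=8, policy="fcfs"))
print(schedule(queue, free_cores=8, policy="first-fit"))
```

With 8 free cores, strict fcfs starts only job a because b blocks the queue, while the skip-ahead policy also starts c. Real backfill policies add reservations so the blocked job is not starved.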
Security: only a small isolated layer running in privileged mode for tighter security
https://flux-framework.readthedocs.io/en/latest/guides/admin-guide.html?h=security#security
And I'll refer to my colleagues.
I would imagine most projects would dispute this.
Why?
Modern command-line interface (cli) design
We have a design that is more similar to what you might see for Go / Python / Rust command-line clients, e.g.:
$ flux <options> <subcommand>
E.g., flux submit or flux resource list. This is in comparison to, for example, Slurm, which has separate binaries for each command (srun, squeue, etc.).
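For anyone who has not seen the single-binary-plus-subcommands pattern, here is a minimal sketch using Python's standard `argparse`. The program name `mytool` and its arguments are made up for illustration; this is not how the flux command itself is implemented, it only mirrors the shape of `flux submit` and `flux resource list`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """One entry point with nested subcommands (hypothetical tool, not flux)."""
    parser = argparse.ArgumentParser(prog="mytool")
    parser.add_argument("--verbose", action="store_true")
    sub = parser.add_subparsers(dest="command", required=True)

    # mytool submit <cmd...>
    submit = sub.add_parser("submit", help="submit a job")
    submit.add_argument("cmd", nargs="+", help="command to run")

    # mytool resource list
    resource = sub.add_parser("resource", help="resource queries")
    resource_sub = resource.add_subparsers(dest="resource_command", required=True)
    resource_sub.add_parser("list", help="list resources")
    return parser


args = build_parser().parse_args(["resource", "list"])
print(args.command, args.resource_command)
```

Everything hangs off one executable, so global options and help output live in one place instead of being spread over many separate binaries.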
Application programming interface (APIs) for job management, job monitoring, resource monitoring, low-level messaging
We could break into categories, but for now they are grouped. I think low level messaging is referring to https://flux-framework.readthedocs.io/projects/flux-rfc/en/latest/spec_3.html.
Language bindings
Why aren't bindings beyond C/C++ sufficient for green?
Sorry, there is more than C/C++; the list has:
C, C++, Python, Lua, Rust, Julia, REST (and we also have Go under development)
Bulk job submission
This means submitting jobs in bulk.
High-speed streaming job submission
I know this means what it says - submitting thousands (millions?) of jobs quickly - I'm not sure about how it's implemented.
Hi All,
Someone on the CNCF Slack channel pointed me at this page and I was disappointed to see HTCondor not included. Can I request that it be added?
Also, is there a bit more detail on each of the categories, so I could suggest how to fill in each?
Cheers, Matt