flux-framework / flux-sched

Fluxion Graph-based Scheduler

Jobspecs with flexible resource types #1259

Open zekemorton opened 1 month ago

zekemorton commented 1 month ago

This feature would allow jobspecs to specify multiple possible resource configurations and allow the scheduler to select a configuration based on what's available or what is best.

I've tested implementing this feature in the traverser by adding a new slot type, an or_slot. The intention is that a jobspec can list multiple possible configurations and the scheduler can select any combination of them based on what's available.

Implementing a prototype of this in the traverser consisted of adding handling for or_slot much like how slot is handled in dfu_impl_t::match and dfu_impl_t::test, except allowing multiple resources of type or_slot to be specified. I also added a new function, dfu_impl_t::dom_or_slot, that behaves similarly to dfu_impl_t::dom_slot. A few key differences: it traverses the rest of the resource graph and jobspec on the union of all resources specified under all of the or_slots, to ensure we get proper counts; it then determines the best configuration of the or_slot options; and finally it creates the edge groups for those or_slot options. You can find the branch of this prototype here: https://github.com/zekemorton/flux-sched/tree/resource-or. The latest commit shows an example where the optimal configuration is selected using dynamic programming, while earlier commits show how it was implemented with greedy selection.
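To make the selection step concrete, here is a minimal standalone sketch of that last phase, assuming the union traversal has already produced per-type availability counts. This is not code from the prototype branch: slot_option, pick_mix, and the scoring rule are invented for illustration, and where the prototype uses dynamic programming (or greedy selection in earlier commits), this sketch simply enumerates mixes of two options.

```cpp
// Hypothetical sketch, not code from the prototype branch: given two
// or_slot options and the per-type resource counts gathered by the union
// traversal, pick how many slots of each option to place.  The names
// slot_option and pick_mix are invented here; the prototype's selection
// lives in dfu_impl_t::dom_or_slot and uses dynamic programming (or, in
// earlier commits, greedy selection) rather than this plain enumeration.
#include <cstdio>
#include <map>
#include <string>

struct slot_option {
    std::map<std::string, int> demand;  // resource type -> count per slot
};

// Find the best split of nslots between options a and b that fits within
// avail.  Assumes avail lists every resource type the options demand.
// The scoring rule (prefer the mix that consumes the fewest resources)
// is a stand-in for whatever policy the scheduler would actually apply.
static bool pick_mix (const slot_option &a, const slot_option &b,
                      const std::map<std::string, int> &avail, int nslots,
                      int &best_a, int &best_b)
{
    bool found = false;
    int best_score = 0;
    for (int na = 0; na <= nslots; na++) {
        int nb = nslots - na;
        bool fits = true;
        int used = 0;
        for (const auto &[type, have] : avail) {
            int need = 0;
            if (auto it = a.demand.find (type); it != a.demand.end ())
                need += na * it->second;
            if (auto it = b.demand.find (type); it != b.demand.end ())
                need += nb * it->second;
            if (need > have) {
                fits = false;
                break;
            }
            used += need;
        }
        if (!fits)
            continue;
        int score = -used;  // fewer resources consumed -> higher score
        if (!found || score > best_score) {
            found = true;
            best_score = score;
            best_a = na;
            best_b = nb;
        }
    }
    return found;
}

int main (void)
{
    slot_option cores_only{{{"core", 12}}};            // 12 cores per slot
    slot_option cores_gpu{{{"core", 6}, {"gpu", 1}}};  // 6 cores + 1 gpu
    std::map<std::string, int> avail{{"core", 72}, {"gpu", 4}};
    int na = 0, nb = 0;
    if (pick_mix (cores_only, cores_gpu, avail, 8, na, nb))
        std::printf ("%d x (12 cores) + %d x (6 cores, 1 gpu)\n", na, nb);
    return 0;
}
```

Against the counts in tiny.graphml (72 cores, 4 GPUs), the only feasible split of 8 slots is 4 of each option, which is what shows up in the match output in the example comment below.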

This leaves me with some open questions that I would like to discuss with the larger group:

- Is the traverser an appropriate place to implement this kind of functionality? What are some other options?
- What other kinds of logical operators would be useful for this kind of flexibility?
- By what different means could we select the desired configuration? Will this require a whole new set of policies?

zekemorton commented 1 month ago

An example jobspec with or_slots:

```yaml
resources:
  - type: or_slot
    count: 8
    label: default
    with:
      - type: core
        count: 12
  - type: or_slot
    count: 8
    label: default
    with:
      - type: core
        count: 6
      - type: gpu
        count: 1
```
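Both or_slot entries here share the label default, which is meant to mark them as alternative shapes for the same slot: the scheduler can satisfy the count of 8 with any mix of the two configurations.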

When running a match allocate on this against tiny.graphml, we get:

```
m allocate t/data/resource/jobspecs/basics/test004.yaml
      ---------------core0[1:x]
      ---------------core1[1:x]
      ---------------core2[1:x]
      ---------------core3[1:x]
      ---------------core4[1:x]
      ---------------core5[1:x]
      ---------------core6[1:x]
      ---------------core7[1:x]
      ---------------core8[1:x]
      ---------------core9[1:x]
      ---------------core10[1:x]
      ---------------core11[1:x]
      ---------------core12[1:x]
      ---------------core13[1:x]
      ---------------core14[1:x]
      ---------------core15[1:x]
      ---------------core16[1:x]
      ---------------core17[1:x]
      ---------------gpu0[1:x]
      ------------socket0[1:s]
      ---------------core18[1:x]
      ---------------core19[1:x]
      ---------------core20[1:x]
      ---------------core21[1:x]
      ---------------core22[1:x]
      ---------------core23[1:x]
      ---------------core24[1:x]
      ---------------core25[1:x]
      ---------------core26[1:x]
      ---------------core27[1:x]
      ---------------core28[1:x]
      ---------------core29[1:x]
      ---------------core30[1:x]
      ---------------core31[1:x]
      ---------------core32[1:x]
      ---------------core33[1:x]
      ---------------core34[1:x]
      ---------------core35[1:x]
      ---------------gpu1[1:x]
      ------------socket1[1:s]
      ---------node0[1:s]
      ---------------core0[1:x]
      ---------------core1[1:x]
      ---------------core2[1:x]
      ---------------core3[1:x]
      ---------------core4[1:x]
      ---------------core5[1:x]
      ---------------core6[1:x]
      ---------------core7[1:x]
      ---------------core8[1:x]
      ---------------core9[1:x]
      ---------------core10[1:x]
      ---------------core11[1:x]
      ---------------core12[1:x]
      ---------------core13[1:x]
      ---------------core14[1:x]
      ---------------core15[1:x]
      ---------------core16[1:x]
      ---------------core17[1:x]
      ---------------gpu0[1:x]
      ------------socket0[1:s]
      ---------------core18[1:x]
      ---------------core19[1:x]
      ---------------core20[1:x]
      ---------------core21[1:x]
      ---------------core22[1:x]
      ---------------core23[1:x]
      ---------------core24[1:x]
      ---------------core25[1:x]
      ---------------core26[1:x]
      ---------------core27[1:x]
      ---------------core28[1:x]
      ---------------core29[1:x]
      ---------------core30[1:x]
      ---------------core31[1:x]
      ---------------core32[1:x]
      ---------------core33[1:x]
      ---------------core34[1:x]
      ---------------core35[1:x]
      ---------------gpu1[1:x]
      ------------socket1[1:s]
      ---------node1[1:s]
      ------rack0[1:s]
      ---tiny0[1:s]
INFO: =============================
INFO: JOBID=1
INFO: RESOURCES=ALLOCATED
INFO: SCHEDULED AT=Now
INFO: =============================
```
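Note that tiny.graphml contains 72 cores and 4 GPUs in total (18 cores and 1 GPU per socket, 2 sockets per node, 2 nodes), so the allocation above is consistent with the scheduler selecting 4 slots of 12 cores plus 4 slots of 6 cores + 1 GPU.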