DARMA-tasking / LB-analysis-framework

Analysis framework for exploring, testing, and comparing load balancing strategies

Make output JSON files compatible with vt's OfflineLB #367

Closed nlslatt closed 1 year ago

nlslatt commented 1 year ago

The JSON files output by LBAF will be consumed by vt's LBDataRestartReader and replayed (in terms of object locations) by vt's OfflineLB. There are certain conditions that need to be met by LBAF's output files in order for the files to be acted upon as expected. I'll start with two common use cases before explaining more generally.

An input file containing only phase 0: LBAF should write the input phase 0 data verbatim into the output file, still labeled as phase 0. The phase 1 data written to the output file should contain loads from phase 0 but use the post-LB object locations.

An input file containing phases 0 and 1, where LBAF only load balances on phase 0: LBAF should write the input phase 0 data verbatim into the output file, still labeled as phase 0. The phase 1 data written to the output file should contain loads from the phase 1 inputs but use the post-LB object locations.
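The two use cases above can be sketched as follows. This is only an illustration, not LBAF's writer: the function name `build_offline_lb_output`, the in-memory layout `{rank: {phase_id: [task, ...]}}`, and the `new_rank_of` mapping are hypothetical, and the sketch assumes that the `node` field of a task records the rank the object lives on.

```python
import copy

def build_offline_lb_output(ranks, new_rank_of):
    """Hypothetical sketch of the two common use cases.

    ranks: {rank: {phase_id: [task, ...]}} as read from the input files.
    new_rank_of: {object_id: rank} decided by load balancing phase 0.
    """
    # Phase 0 is copied verbatim; phase 1 is rebuilt below.
    out = {r: {0: copy.deepcopy(phases.get(0, [])), 1: []}
           for r, phases in ranks.items()}
    for r, phases in ranks.items():
        # Loads come from input phase 1 when present, otherwise from phase 0.
        source = phases[1] if 1 in phases else phases.get(0, [])
        for task in source:
            moved = copy.deepcopy(task)
            # Objects without a new location stay on their current rank.
            dest = new_rank_of.get(task["entity"]["id"], r)
            moved["node"] = dest  # assumption: "node" is the hosting rank
            out.setdefault(dest, {0: [], 1: []})[1].append(moved)
    return out

# Use case 1: only phase 0 in the input, object 2 migrates to rank 1.
only_phase_0 = {
    0: {0: [{"entity": {"id": 1, "home": 0}, "node": 0, "time": 2.0},
            {"entity": {"id": 2, "home": 0}, "node": 0, "time": 1.0}]},
    1: {0: []},
}
result = build_offline_lb_output(only_phase_0, {2: 1})
```

With an input that also has phase 1, the same call keeps the phase 1 loads and only changes where each object is placed.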

For cases where there are more phases of data in the input file, LBAF balances more than one phase, or LBAF starts load balancing not on phase 0:

Does that all sound correct to you @lifflander ?

lifflander commented 1 year ago

Yes, that sounds exactly correct to me

ppebay commented 1 year ago

@nlslatt now that the writer is doing what it's supposed to do per phase, I am getting to the execution logic you described.

However, regarding this requirement:

> For cases where there are more phases of data in the input file, LBAF balances more than one phase, or LBAF starts load balancing not on phase 0:
>
> All phases before and including the first one LBAF attempts to load balance should be copied verbatim from the input file into the output file.

I have the following question: LBAF is written so that it loads only the phases whose IDs are contained in the list (possibly reduced to a singleton) specified by phase_ids in the configuration file. This is for efficiency's sake, especially with large JSON files where only a single phase is to be load-balanced (or a few phases, not necessarily consecutive ones, are to be stepped through). However, to meet the specification above, all phases before the first load-balanced phase (call it phase J) would have to be read and kept in LBAF's internal state so that they can be written out later, which would be very inefficient in general.

I therefore submit that this modus operandi should be optional, and probably turned off by default. Do you agree?
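To make the trade-off concrete, here is a minimal sketch of how such an opt-in switch could decide which phases get read. Only `phase_ids` comes from the discussion; the function `phases_to_load` and the flag `copy_preceding` are hypothetical names, not existing LBAF options.

```python
def phases_to_load(available_ids, phase_ids, copy_preceding=False):
    """Return the sorted phase IDs to read from the input files.

    available_ids: phase IDs present in the input.
    phase_ids: phases selected for load balancing in the configuration.
    copy_preceding: hypothetical opt-in flag, off by default.
    """
    wanted = set(phase_ids)
    if copy_preceding and wanted:
        # Also read every phase up to the first load-balanced one,
        # so it can be copied verbatim into the output.
        first = min(wanted)
        wanted |= {p for p in available_ids if p <= first}
    return sorted(p for p in available_ids if p in wanted)

default_load = phases_to_load(range(6), [3, 5])
full_load = phases_to_load(range(6), [3, 5], copy_preceding=True)
```

With the flag off only the requested phases are read; with it on, phases 0 through 3 are read as well, which is where the extra cost comes from.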

nlslatt commented 1 year ago

> I therefore submit that this modus operandi should be optional, and probably turned off by default. Do you agree?

@ppebay That sounds reasonable.

ppebay commented 1 year ago

More on this @lifflander, @nlslatt

Currently, the JSON schema validator considers this rank entry as valid:

{"type":"LBDatafile","phases":[]}

(this is the case of rank 3 in the synthetic_lb_data example, which indeed has no objects assigned to it).

This makes it difficult to support the stated goals of #367, because the empty phase list is inconsistent in that it carries no phase ID. I would argue that it is inconsistent anyway with the other ranks, e.g. rank 2:

{"type":"LBDatafile","phases":[{"id":0,"tasks":[{"entity":{"id":8,"home":2,"type":"object","migratable":true},"node":2,"resource":"cpu","time":1.5}],"communications":[{"type":"SendRecv","to":{"type":"object","id":6},"messages":1,"from":{"type":"object","id":8},"bytes":1.5}]}]}

because this gives the impression that rank 3 is in a phase-less state.

My proposal would be to modify our schema so that, in such a case (no tasks assigned to a rank), the JSON file for rank 3 instead contains the following:

{"type":"LBDatafile","phases":[{"id":0,"tasks":[]}]}

If my proposal is accepted, that would probably mean modifying the vt JSON writer, and assuredly the schema validator.

What do you think?
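The stricter rule proposed here can be sketched as a small stand-alone check. This is not the actual LBAF schema validator; the helper `phase_id_problems` is a hypothetical name and it only looks at the presence of phase IDs.

```python
import json

def phase_id_problems(rank_file_text):
    """Hypothetical check: every rank file must name at least one phase ID."""
    data = json.loads(rank_file_text)
    phases = data.get("phases", [])
    if not phases:
        return ["no phases: the rank carries no phase ID at all"]
    # Report each phase entry that lacks an "id" key.
    return [f"phase at index {i} has no id"
            for i, phase in enumerate(phases) if "id" not in phase]

current = phase_id_problems('{"type":"LBDatafile","phases":[]}')
proposed = phase_id_problems('{"type":"LBDatafile","phases":[{"id":0,"tasks":[]}]}')
```

Under this check the currently accepted empty entry is reported, while the proposed form with an explicit phase 0 and an empty task list passes.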

nlslatt commented 1 year ago

@ppebay @lifflander I see the need for a change here. My question is why {"type":"LBDatafile","phases":[{"id":0,"tasks":[]}]} as opposed to {"type":"LBDatafile","phases":[{"id":0}]} or {"type":"LBDatafile","phases":[{"id":0,"tasks":[],"communications":[]}]}?
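One way to see that the three candidates carry the same information is a reader-side normalization, sketched below. This is an assumption about how a consumer could treat missing keys, not a description of what vt or LBAF currently do, and `normalize_phase` is a hypothetical helper.

```python
def normalize_phase(phase):
    """Hypothetical reader-side helper: fill in missing empty lists."""
    if "id" not in phase:
        raise ValueError("phase entry has no id")
    # Missing "tasks" or "communications" are treated as empty lists.
    return {
        **phase,
        "tasks": phase.get("tasks", []),
        "communications": phase.get("communications", []),
    }

candidates = [
    {"id": 0, "tasks": []},
    {"id": 0},
    {"id": 0, "tasks": [], "communications": []},
]
normalized = [normalize_phase(p) for p in candidates]
```

If readers normalize like this, all three forms are equivalent and the choice comes down to compactness versus explicitness in the written files.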

ppebay commented 1 year ago

@lifflander @nlslatt

Good question, I don't have any strong opinion regarding the answer to it :)

As long as the phase id is in there, I am fine with it.

Maybe the second one is best thanks to its compactness?