envelope-project / laik

Check the malleability API #174

Open amir-raoofy opened 1 year ago

amir-raoofy commented 1 year ago

Comments of @weidendo:

Look at jac1d.c at some point and check whether the malleability API makes any sense. I feel it is more complex than it needs to be.

Part of the complexity is that all LAIK objects (spaces, containers) are re-created in each LAIK process redundantly, and we need to do this initialization before data can be moved around. For that reason, moving data to joining processes is done explicitly via code, and not automatically as part of LAIK_Init.

We really should allow LAIK objects to be created just once (e.g. in master) and synced automatically to other processes, with access to existing objects via names. Some objects obviously need to be versioned with the epoch, because they will change on malleability actions. Perhaps we should do this for all objects, e.g. to enable index space changes? This needs serialization/deserialization of objects to work, as well as KV syncs...
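To make the "access via names, versioned with the epoch" idea concrete, here is a minimal sketch of what such a lookup table could look like. Everything in it (`Registry`, `reg_publish`, `reg_lookup`) is hypothetical and not part of the LAIK API; it only models the local side and leaves out serialization and the KV sync.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch, not LAIK API: a local table of named objects,
 * where each entry records the epoch in which it was published. */
#define REG_MAX      32
#define REG_NAME_LEN 32

typedef struct {
    char  name[REG_NAME_LEN];
    int   epoch;   /* epoch in which this version became valid */
    void *obj;     /* opaque handle to a space, container, ... */
} RegEntry;

typedef struct {
    RegEntry entries[REG_MAX];
    size_t   count;
} Registry;

static void reg_init(Registry *r)
{
    assert(r != NULL);
    r->count = 0;
}

/* Publish a new version of a named object.
 * Returns 0 on success, -1 if the table is full or the name is too long. */
static int reg_publish(Registry *r, const char *name, int epoch, void *obj)
{
    assert(r != NULL && name != NULL);
    if (r->count >= REG_MAX || strlen(name) >= REG_NAME_LEN) return -1;
    RegEntry *e = &r->entries[r->count++];
    strcpy(e->name, name);
    e->epoch = epoch;
    e->obj = obj;
    return 0;
}

/* Return the newest version of `name` whose epoch is <= `epoch`,
 * or NULL if no such version exists. */
static void *reg_lookup(const Registry *r, const char *name, int epoch)
{
    assert(r != NULL && name != NULL);
    const RegEntry *best = NULL;
    for (size_t i = 0; i < r->count; i++) {
        const RegEntry *e = &r->entries[i];
        if (strcmp(e->name, name) != 0 || e->epoch > epoch) continue;
        if (best == NULL || e->epoch > best->epoch) best = e;
    }
    return best ? best->obj : NULL;
}
```

With this shape, an object that never changes is published once at epoch 0, while one that changes on a resize gets a new entry per epoch, and a lookup for an older epoch still finds the older version.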

==============================

The current API looks reasonable to me (using laik_group_parent/ to get the objects and laik_set_iteration to set a global progression state). I actually think it is not the end of the world if a user needs to repeat the partitioning steps when creating the objects for the newcomers; it is just boilerplate code. It might even be intuitive from the user's perspective, since it makes explicit what is happening to the newcomer and its partitioning. But I also agree that this is a bit complex and too verbose for a user!

One question, though: is it called a phase or an iteration in the end? The names seem inconsistent to me: laik_set_iteration vs. get_phase.

So with KV, newcomers would not need to manually query the phase and perform the distribution. This means that upon a call to create an object, i.e., laiknew, we would return a , and one would do a lookup to get the object. Upon shrink and expand, a newcomer would look up the (old) objects in the KV instead of manually running the space creation or partitioner code. Yes, I also find this much neater. I am unsure about the implications, though (object serialization and complex runtime synchronization).
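As a rough illustration of the serialization concern, the sketch below round-trips a small partitioning descriptor through a string, as a KV value might carry it, and lets a newcomer derive its own range from the result. The `BlockDesc` type, the text encoding, and the even block split are all assumptions made for this example; they are not how LAIK represents or partitions objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical descriptor of a 1d block partitioning (not a LAIK type). */
typedef struct {
    long size;    /* number of indexes in the 1d space */
    int  nprocs;  /* number of processes sharing the space */
    int  epoch;   /* epoch in which this descriptor is valid */
} BlockDesc;

/* Encode the descriptor as text, e.g. for storing it as a KV value.
 * Returns the string length, or -1 if the buffer is too small. */
static int desc_serialize(const BlockDesc *d, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%ld %d %d", d->size, d->nprocs, d->epoch);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Decode a descriptor; returns 0 on success, -1 on malformed input. */
static int desc_deserialize(const char *buf, BlockDesc *d)
{
    if (sscanf(buf, "%ld %d %d", &d->size, &d->nprocs, &d->epoch) != 3)
        return -1;
    return (d->size >= 0 && d->nprocs > 0) ? 0 : -1;
}

/* Half-open range [from, to) owned by `rank` under an even block split. */
static void desc_my_range(const BlockDesc *d, int rank, long *from, long *to)
{
    assert(rank >= 0 && rank < d->nprocs);
    *from = d->size * rank / d->nprocs;
    *to   = d->size * (rank + 1) / d->nprocs;
}
```

A newcomer that receives the encoded string can call `desc_deserialize` and `desc_my_range` without re-running any creation code itself. The harder part raised above, keeping such values consistent across processes at runtime, is not addressed by this sketch.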