pnnl / lamellar-runtime

Lamellar is an asynchronous tasking runtime for HPC systems, developed in Rust

Data structure design/implementation help #7

Closed wjhorne closed 2 years ago

wjhorne commented 2 years ago

Hi,

I am trying to implement a distributed data structure using Lamellar that has some desirable properties for a problem I am working on and have hit a bit of a wall. I just wanted to check in and see if something like it was possible with what Lamellar has or plans to have in the future. No worries at all if it is out of scope or otherwise.

Essentially, I would like to have a distributed array of Vec<(i64,f64)> where PEs can get/put Vec<(i64,f64)> values from/to other PEs. It is also highly desirable to be able to send a single (i64,f64) value and have it pushed onto a specific Vec<> within the array, potentially on another PE. I was able to do this with upcxx in another code base using serialization and remote procedure calls (rpc), but was unsure how to mimic it with Lamellar.

I have attempted to use active messages to do this, but I typically have gotten stuck attempting to pass (i64,f64) to be pushed to a given Vec in the distributed array. This is likely just due to my unfamiliarity with active messages vs rpc.

Best

rdfriese commented 2 years ago

Hi,

Thanks for asking! Is it more appropriate to think of this data structure as something like a matrix, where each Vec<(i64,f64)> is a fixed size? Or is it the case that the Vecs are dynamic? (I think it's the second case, given that you mention wanting to push onto a given remote vector.)

My first thought is that it would be nice to do this using the LamellarArray interface, but unfortunately it does not support Vec as an element type. (We would like to expose an N-dimensional array abstraction, although that still probably won't work for you if you need dynamic sizing.)

My second thought is to instead use a Darc (Distributed Atomic Reference Counted pointer), which allows us to pass around pointers to local data on each PE while ensuring that the data stays alive for as long as any handle to it exists. I've come up with a quick example of how I might implement this data structure in the attached file, vec_of_vecs.rs.txt.

Essentially, what I do is create a LocalRwDarc (think Darc<RwLock<...>>) which points to a Vec<Vec<(i64,f64)>>. I then create some enums which define the operations I want to perform. I differentiate between VecOps (i.e. get, put, ...), which operate on entire vectors on the remote PEs, and ElemOps (i.e. push, pop, insert, ...), which operate on individual elements within a given vector.

I then create two active messages, where one contains a VecOp (and the LocalRwDarc handle) and the other contains an ElemOp (and the LocalRwDarc handle).
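For readers without the attachment, here is a minimal single-process sketch of that design. It is not the attached file and it does not use the Lamellar API: a plain `Arc<RwLock<...>>` stands in for the LocalRwDarc, and the two `exec_*` functions stand in for what each active message might do when it runs on the destination PE. All names (`Row`, `LocalRows`, the enum variants, `exec_vec_op`, `exec_elem_op`, `demo`) are hypothetical and chosen only for illustration.

```rust
use std::sync::{Arc, RwLock};

/// One dynamically sized row of (index, value) pairs.
type Row = Vec<(i64, f64)>;
/// Assumption: Arc<RwLock<..>> stands in for a LocalRwDarc on one PE.
type LocalRows = Arc<RwLock<Vec<Row>>>;

/// Operations on a whole row (hypothetical variants).
#[derive(Debug, Clone)]
enum VecOp {
    Get { row: usize },
    Put { row: usize, data: Row },
}

/// Operations on a single element of a row (hypothetical variants).
#[derive(Debug, Clone)]
enum ElemOp {
    Push { row: usize, elem: (i64, f64) },
    Pop { row: usize },
    Insert { row: usize, pos: usize, elem: (i64, f64) },
}

/// Stand-in for the body of the VecOp active message.
/// Get returns a copy of the row, Put returns the row's previous contents,
/// and None means the row index was out of range.
fn exec_vec_op(rows: &LocalRows, op: VecOp) -> Option<Row> {
    match op {
        VecOp::Get { row } => {
            let guard = rows.read().unwrap();
            guard.get(row).cloned()
        }
        VecOp::Put { row, data } => {
            let mut guard = rows.write().unwrap();
            let slot = guard.get_mut(row)?;
            Some(std::mem::replace(slot, data))
        }
    }
}

/// Stand-in for the body of the ElemOp active message.
/// Pop returns the removed element (if any); Push and Insert return Ok(None).
fn exec_elem_op(rows: &LocalRows, op: ElemOp) -> Result<Option<(i64, f64)>, String> {
    let mut guard = rows.write().unwrap();
    match op {
        ElemOp::Push { row, elem } => {
            let r = guard.get_mut(row).ok_or("row out of range")?;
            r.push(elem);
            Ok(None)
        }
        ElemOp::Pop { row } => {
            let r = guard.get_mut(row).ok_or("row out of range")?;
            Ok(r.pop())
        }
        ElemOp::Insert { row, pos, elem } => {
            let r = guard.get_mut(row).ok_or("row out of range")?;
            if pos > r.len() {
                return Err("insert position out of range".to_string());
            }
            r.insert(pos, elem);
            Ok(None)
        }
    }
}

/// Small usage example: two pushes and an insert on row 1, then a whole-row get.
fn demo() -> Row {
    let rows: LocalRows = Arc::new(RwLock::new(vec![Vec::new(); 2]));
    // A cloned handle plays the role of the Darc carried inside a message.
    let handle = Arc::clone(&rows);
    exec_elem_op(&handle, ElemOp::Push { row: 1, elem: (3, 0.5) }).unwrap();
    exec_elem_op(&handle, ElemOp::Push { row: 1, elem: (7, 2.0) }).unwrap();
    exec_elem_op(&handle, ElemOp::Insert { row: 1, pos: 0, elem: (1, -1.0) }).unwrap();
    exec_vec_op(&rows, VecOp::Get { row: 1 }).unwrap()
}
```

Calling `demo()` from a `main` returns row 1 with the inserted element first. In a real Lamellar program the enum and the handle would presumably travel together inside an active message struct and the `exec_*` logic would run on the target PE; the details of that wiring are in the attached example and are not reproduced here.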

I am not sure if this is what you had in mind so please let me know and I will be happy to iterate back and forth!

wjhorne commented 2 years ago

It is akin to a sparse matrix that also needs to be dynamically created/adjusted over time. To make things more complicated, the information needed to fill in a row might be distributed across other PEs.

I'll take a look at the vec_of_vecs example you have put together and spend some more time digging into the Darc capability. Thank you for the help.

rdfriese commented 2 years ago

Ahh, makes sense. I think there would be quite a bit of interest in higher-level sparse data structure support. It is something we have in the back of our minds, but it is probably a little ways off.

wjhorne commented 2 years ago

I was able to put something together that I believe will work. Thanks for the example!