Added PVector IO - Githubissues

zjwegert commented 9 months ago

A small PR to add PVector IO. @JordiManyer, let me know if you're happy with this or want to adjust.

JordiManyer commented 9 months ago

Is there a reason we are using CSV instead of an HDF5-compatible format like JLD2? With the later we might be able to save both the array itself and the inidices, which would allow for the read to be bang-less. Would we prefer that?

zjwegert commented 9 months ago

I hadn't considered going with JLD2 (only used it a few times), I think this might be a nice solution. I assume that we'd still have to generate several files?

JordiManyer commented 9 months ago

I assume that we'd still have to generate several files?

Yes, the MPI communicator cannot be saved, I'm afraid.

I've also been thinking we might not want to save the indices... Not because we cannot, but rather because a lot of times we are using the === to check if indices are the same. I am not sure the object id is the same if you save then read multiple copies of it, even in binary format. I have to think about it.

In any case HDF5 might still be a better format for saving/loading data fast.

zjwegert commented 9 months ago

@JordiManyer, I've adjusted to use JLD2.jl and added load_object! to copy file data to a PVector or PSparseMatrix. I haven't had a chance to test PSparseMatrix, but functions as expected for PVector.

Let me know what you decide with indices etc.

JordiManyer commented 9 months ago

@zjwegert I've kinda started implemented what I would imagine the IO should look like. The idea is that the functions psave, pload and pload! are totally generic, and one can implement specific versions of to_local_storage and from_local_storage for any distributed structure we might want to save.

Two main benefits of this approach:

It is robust with respect with of loading into PVectors that have a different dof layout. If the ghosts don´t match (or even if the owned dofs are reordered), everything should work just fine.
It potentially allows for more complex structures to be saved, such as FESpaces, FEFunctions, meshes, etc...

This is also general enough that it could potentially be moved to GridapDistributed and even PartitionedArrays.

The interface is based on the FileIO.jl one, with a p in front (which is similar to PartitionedArrays's naming scheme). I prefer not to overwrite any specific package's API, since we are going to change some of the signatures. Also, this would allow some dispatching on the file extension (like FileIO.jl does), so that the user can choose the file format he wants. For instance, I believe other formats like BSON/JSON can easily extended from what we have. We might even be able to leverage the machinery currently available on FileIO.jl. I'll have to think about it.

zjwegert commented 9 months ago

Awesome @JordiManyer, I think this would be a great addition to GridapDistributed and PartitionedArrays.

zjwegert commented 9 months ago

I've added a serial test as well. I think we should be able to do this as well. Not sure how it fits in with the current implementation.

JordiManyer commented 9 months ago

I think in serial we just rely on the regular functions... It's a tiny change, I don't think it matters too much. Also in serial you don't have ranks...

zjwegert commented 9 months ago

I think in serial we just rely on the regular functions... It's a tiny change, I don't think it matters too much. Also in serial you don't have ranks...

Agreed and also agree that serial doesn't need ranks of course. My point was more about people having to switch between psave/pload/pload! and functions save/load/load!. I don't think it's a big issue.

Also, I haven't read up on FileIO.jl but save and load seem to require some extra care.

JordiManyer commented 9 months ago

I believe they work with Dict{String,Any}, which is a bit annoying... I wish they worked for structs, but I guess JDL2 has it's own function for that.

zjwegert commented 9 months ago

Yeah... We could just use save_object and load_object for a serial implementation, and implement a load_object!. Then we could rename psave, pload, and pload! to save_pobject, load_pobject, and load_pobject!.

JordiManyer commented 9 months ago

If we could also remove FileIO from our dependencies and just create our own save and load functions, that default to JDL2. Then if someone wants to use other libraries they just explicitly call XXX.save

zjwegert commented 9 months ago

FileIO actually isn't in our dependencies JDL2 actually exports save and load from FileIO... Anyway, that is fine, I think we can just do as you say and call XXX.save etc.

JordiManyer commented 9 months ago

Mmm ok... Anyway, I would say both options are fine.

zjwegert commented 9 months ago

@JordiManyer, see that last commit. I'll add some documentation and then merge if you're happy.

JordiManyer commented 9 months ago

Does the LSTO_Distributed.XXX compile? I don't think it's needed (it's implied) and I'm not sure it compiles (because it doesn't know what that is, since it's itself). I could be wrong...

JordiManyer commented 9 months ago

Ok it does work, but it is indeed not necessary. I did some cleanup. We now only import save_object and load_object from JLD2. I don´t think it matters, but it is a little cleaner since it's the only things we use (I think?) All good from my end.

zjwegert commented 9 months ago

All looks good to me!

I'll review, add some docs, then merge.

zjwegert / GridapTopOpt.jl

Added PVector IO #37