Closed PeerHerholz closed 5 years ago
Hi. Would Joblib's load and dump work for you? https://joblib.readthedocs.io/en/latest/persistence.html
Hi,
sorry for the delayed reply.
Yeah, this looks like what I was thinking about.
Is there any reason to not "simply" use `json`?
In case you think it would be worth adding to `nistats` and no one else is on it, I would be happy to give it a try.
> Is there any reason to not "simply" use `json`?
It would not be practical to represent Nistats objects using only json or any text format. Remember that they reference numpy arrays, Nifti images (the masks), and many other Python objects. To store Nistats objects in a reusable way, one would have to decide how all this information should be organized and stored on disk. It could be useful but would require a bit of thought and work.
Serializing them with joblib is a good solution for temporary storage, caching, or sending them over a network. However, it may be risky for long-term storage (for example, you may have difficulties loading them again with future versions of Nistats). Therefore it may be worthwhile to also store the information you are interested in, for example computed contrasts, in standard formats such as .nii.
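To make the trade-off concrete, here is a minimal sketch using only the standard library. It is not Nistats code: `model_state` is a hypothetical stand-in for a fitted model, with a typed `array` playing the role of a numpy array and raw `bytes` playing the role of a mask image. `pickle` is used in place of joblib, whose `dump` and `load` build on pickle, so the same caveats about long-term storage apply.

```python
import json
import pickle
import tempfile
from array import array
from pathlib import Path

# Hypothetical stand-in for a fitted model's state (not a Nistats object):
# a typed array plays the role of a numpy array, raw bytes the role of a mask.
model_state = {
    "smoothing_fwhm": 5.0,
    "contrast_values": array("d", [2.0, 4.0]),
    "mask": bytes([1, 0, 1, 1]),
}

# 1. JSON only handles plain types (strings, numbers, booleans, None,
#    lists, dicts), so the raw state cannot be dumped as-is.
try:
    json.dumps(model_state)
    json_ok = True
except TypeError:
    json_ok = False

with tempfile.TemporaryDirectory() as tmp_dir:
    tmp_dir = Path(tmp_dir)

    # 2. Pickle-based persistence round-trips the whole object in one call.
    model_path = tmp_dir / "model.pkl"
    model_path.write_bytes(pickle.dumps(model_state))
    restored = pickle.loads(model_path.read_bytes())

    # 3. For long-term storage, also export the results of interest in a
    #    plain standard format (JSON here for numbers; .nii for images).
    results_path = tmp_dir / "contrasts.json"
    results_path.write_text(json.dumps({
        "smoothing_fwhm": model_state["smoothing_fwhm"],
        "contrast_values": list(model_state["contrast_values"]),
    }))
    exported = json.loads(results_path.read_text())

print("JSON can store the raw state:", json_ok)
print("Pickle round trip matches:", restored == model_state)
print("Exported results:", exported)
```

The pickled file is convenient but only loadable while the pickled types keep a compatible definition, whereas the exported JSON stays readable independently of the code that produced it.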
Oh I see, sorry for not thinking this through. You're completely right in that `nistats` objects are rather complex. Would `hdf5` be a possible option re long-term storage/support? Sure, it won't overcome the problem re future `nistats` versions, but it could be worth checking out.
I am not sure that is a very common use case. We haven't received requests or feedback for this feature. Building model storage using HDF5 into Nistats will also mean maintaining it for a while.
Unless there's a significant demand for the feature or growing consensus and demonstration that the feature will improve the science (for example reproducibility), that's a lot of resources to commit.
I don't expect us to implement this as an integral feature in the foreseeable future.
Ok, thank you very much for your thoughts and input on this. If it's not something that a lot of folks are interested in or demanding, then the effort won't be worth it at this point.
Do you mind if I play around with it in an independent repo?
You don't even need to ask us what you do in your fork :smile:. Please go ahead.
True that, but I just wanted to be sure, as I don't want to start anything you, as the main developers, are completely against.
I'm closing this issue now. Thanks again for all the feedback, input and ideas.
Ahoi hoi everyone,
I was just redoing some previous analyses that were done in `nistats`. While doing so, I thought about a potential new feature that could be added.
What would you like changed/added and why?
I would like to add the possibility to save `nistats` models as e.g. `dict` or `json`, because this would enhance reproducibility (e.g., models could be shared, regenerated from file, and rerun) and improve documentation.
What would be the benefit? Does the change make something easier to use?
Models could be saved for later inspection and evaluation, as well as shared. As `nistats` is already very straightforward and easy to use (thanks for that!), I don't think this feature would make things easier to use, but it would add a new layer/level of usage (e.g., generate models from files, rerun models, etc.).
Clarifies something?
No.
If it is a new feature, what is the benefit?
As outlined above: increase reproducibility and documentation through model inspection and evaluation, sharing, generation from file and rerunning.
It would be cool to hear your thoughts on this.
Best regards, Peer