Open sebaB003 opened 1 month ago
Hello, is this a feature that should also be partially implemented in aiida-core? I see that the error originates from here: https://github.com/aiidateam/aiida-core/blob/98ffc331d154cc7717861fde6c908ec510003926/src/aiida/tools/archive/imports.py#L1161
Does that mean that solving the issue requires implementing a hash -> UUID translation mechanism?
Is it better to try to generalize the aiida-core routine, or to build a custom command in aiida-s3 to do the import/export?
Thank you
The problem is indeed that the archiving implementation in aiida-core currently assumes that all repository implementations use the same format for their object keys. Since the archive is a special case of the StorageBackend, it is just another storage, and the archiving mechanism uses this to simply stream data from one storage to another, but in doing so it requires that the repositories' keys are compatible. In the archive storage, the keys are the SHA of the objects themselves, but for aiida-s3 the key is just a UUID4. I am not sure how difficult it would be to generalize the archiving implementation in aiida-core so that aiida-s3 backends can also be used. I am now working in industry, though, and it is unlikely that I will have time for it anytime soon.
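To make the key mismatch concrete, here is a minimal sketch of what a hash -> UUID translation step could look like. It does not use the aiida-core or aiida-s3 APIs: the two repository classes, `transfer_objects` and `translate_references` are hypothetical stand-ins, and SHA-256 is an assumption for the hash used by the archive storage.

```python
import hashlib
import uuid


class HashKeyedRepository:
    """Hypothetical toy repository: the key is the SHA-256 hex digest of the content."""

    def __init__(self):
        self._objects = {}

    def put(self, content: bytes) -> str:
        key = hashlib.sha256(content).hexdigest()
        self._objects[key] = content
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]


class UuidKeyedRepository:
    """Hypothetical toy repository: every stored object gets a fresh UUID4 key."""

    def __init__(self):
        self._objects = {}

    def put(self, content: bytes) -> str:
        key = str(uuid.uuid4())
        self._objects[key] = content
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]


def transfer_objects(source, target, keys):
    """Copy objects from source to target and return {source_key: target_key}."""
    key_map = {}
    for key in keys:
        if key not in key_map:  # copy each distinct source object only once
            key_map[key] = target.put(source.get(key))
    return key_map


def translate_references(file_index, key_map):
    """Rewrite a {filename: source_key} index so it points at the target's keys."""
    return {name: key_map[key] for name, key in file_index.items()}


# Usage: two files share the same content, so the hash-keyed side stores it once.
archive = HashKeyedRepository()
s3_like = UuidKeyedRepository()
index = {
    "input.txt": archive.put(b"hello"),
    "copy.txt": archive.put(b"hello"),
    "out.txt": archive.put(b"world"),
}
key_map = transfer_objects(archive, s3_like, index.values())
new_index = translate_references(index, key_map)
```

The point of the sketch is that the importer cannot reuse the source keys as-is: it has to let the target repository assign its own keys and then rewrite every reference through the resulting mapping. The same two functions would work in the opposite direction in this toy setup, since the hash-keyed repository derives its key from the content.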
Thank you! We will work on this, then. We already have a proof-of-concept that can import archives generated with a standard AiiDA installation into a deployment that uses aiida-s3. We still need to do some work on the aiida-core side, so we will start a discussion there.
Reference to the proof-of-concept commits:
https://github.com/rikigigi/aiida-core/commit/bf04cb832f5288a366bbf52ea2b309075e7087ee
https://github.com/rikigigi/aiida-core/commit/454701999825133e3cb9ea075e3eae07f31703dc
https://github.com/rikigigi/aiida-s3/commit/4870c065a5bb0c762b5ac22ff73c1ee01aba962f
I was trying to import an exported archive, but I found out that this isn't implemented in aiida-s3. The ability to import data from the standard object storage into aiida-s3 and vice versa is a must-have for sharing data between different types of deployments.
Are there any plans to implement this feature? If not, could you explain how to do it, or point us to useful resources?
Thank you!