datalad / datalad-remake


Define API for recording/setting compute instructions in dataset #4

Open mih opened 4 months ago

mih commented 4 months ago

From a user POV, we want to present compute-on-demand like download-on-demand, and wrap everything in a git-annex special remote. This means that we are bound to the special remote protocol, which translates to an API whose main entrypoint is a request for a particular key.

So at the start of an operation, we only know which key is requested. Therefore, the instruction for computing a key needs to be recorded (discoverably) in association with that particular key.
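To make the constraint concrete, here is a minimal sketch of what a key-driven entrypoint sees. It assumes the `TRANSFER RETRIEVE <key> <file>` request line of the git-annex external special remote protocol; the `INSTRUCTIONS` table, the example key, and the `handle_request` helper are hypothetical stand-ins for whatever lookup mechanism ends up being chosen, and the actual computation is left out.

```python
# Hypothetical lookup table: key -> recorded compute instruction.
# In a real remote this would be discovered from the dataset, not hard-coded.
INSTRUCTIONS = {
    "SHA256E-s12--0a1b2c.dat": "method=extract&input=raw.csv",
}


def parse_retrieve_request(line: str) -> tuple[str, str]:
    """Split a 'TRANSFER RETRIEVE <key> <file>' line into (key, file).

    Keys contain no spaces; the file path may, so split at most 3 times.
    """
    parts = line.rstrip("\n").split(" ", 3)
    if len(parts) != 4 or parts[:2] != ["TRANSFER", "RETRIEVE"]:
        raise ValueError(f"not a retrieve request: {line!r}")
    return parts[2], parts[3]


def handle_request(line: str) -> str:
    """Answer a retrieve request knowing nothing but the requested key."""
    key, _target = parse_retrieve_request(line)
    instruction = INSTRUCTIONS.get(key)
    if instruction is None:
        return f"TRANSFER-FAILURE RETRIEVE {key} no compute instruction recorded"
    # Placeholder: a real remote would run the instruction here and write
    # the result to the target file before reporting success.
    return f"TRANSFER-SUCCESS RETRIEVE {key}"
```

The point of the sketch is only that everything the handler does has to start from `key`; nothing else about the request identifies what to compute.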

Three established patterns for storing key-based information are known:

Challenges:

Candidate solutions:

christian-monch commented 1 week ago

In a first implementation (https://github.com/christian-monch/datalad-compute), a POC that will turn into an MVP, the first option for key-based information storage was chosen, i.e. "URL-encoded parameter list via an added 'availability URL', as done in https://github.com/matrss/datalad-getexec".
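As an illustration of the general idea (not the actual format used by datalad-compute or datalad-getexec), a URL-encoded parameter list could be round-tripped roughly like this. The `compute-example` scheme, the `method` field, and both function names are assumptions made up for this sketch.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical URL scheme; the real implementations define their own.
SCHEME = "compute-example"


def encode_instruction(method: str, params: dict[str, str]) -> str:
    """Pack a method name and its string parameters into one URL.

    Assumes `params` does not itself contain a 'method' entry.
    """
    query = urlencode({"method": method, **params})
    return f"{SCHEME}:///?{query}"


def decode_instruction(url: str) -> tuple[str, dict[str, str]]:
    """Recover (method, params) from a URL built by encode_instruction."""
    parts = urlsplit(url)
    if parts.scheme != SCHEME:
        raise ValueError(f"unsupported URL scheme: {parts.scheme!r}")
    # parse_qs returns a list per field; this sketch keeps the first value.
    fields = {k: v[0] for k, v in parse_qs(parts.query, strict_parsing=True).items()}
    method = fields.pop("method")
    return method, fields


url = encode_instruction("extract", {"input": "raw/data.csv", "column": "3"})
method, params = decode_instruction(url)
```

Such a URL would then be registered for the key (for example with `git annex registerurl`), so that a request for the key can be resolved back to the method and parameters needed to compute it.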