Closed: nfzd closed this issue 8 months ago
@nfzd I think this is exactly the scenario for which path substitution was introduced, is it not?
It should simply be configured for each consumer for which the original registered URL is inadequate.
@ainoam Ah, nice. I was not aware of that.
Proposal Summary
Make the base `output_uri` of Dataset artifacts configurable somehow if the storage location changes.
One possibility would be to add a kwarg to `Dataset.finalize()` and `Dataset.get()` which can rename the artifact URIs. I could send a PR for that if you agree this is a good solution?
Motivation
We need to store our datasets on a network drive. We also have Linux workers and users with Windows.
The network drive has some location, say, `/mnt/data` on the workers. This path cannot be used on Windows, where it will be something like `Z:\`. (We tried some hacks with network paths on Windows, but did not find a working solution.)
Windows users should be able to both create datasets and use them locally. The Linux agents also need to be able to load them.
Proposal
The clean solution IMHO would be to store on the server the path that the agents will use. This would require something like:
- `output_uri='Z:\'`
- an extended version of `Dataset.finalize()` which can rename `Z:\` to `/mnt/data`
- an extended version of `Dataset.get()` which can rename `/mnt/data` back to `Z:\`
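To make the renaming in the steps above concrete, here is a minimal sketch of a prefix-rewriting helper. The function name `rename_prefix` and the example file path are made up for illustration. This is not part of the ClearML API, and it does not show how such a rename would be wired into `Dataset.finalize()` or `Dataset.get()`.

```python
def rename_prefix(uri: str, old_prefix: str, new_prefix: str) -> str:
    """Swap old_prefix for new_prefix at the start of uri.

    Hypothetical helper for illustration only. URIs that do not start
    with old_prefix are returned unchanged.
    """
    seps = "/\\"
    prefix = old_prefix.rstrip(seps)
    if not uri.startswith(prefix):
        return uri
    rest = uri[len(prefix):]
    # Only match on a path boundary, so "/mnt/data" does not match "/mnt/database".
    if rest and rest[0] not in seps:
        return uri
    # Use the separator style of the target prefix for the remainder.
    if "/" in new_prefix:
        sep = "/"
        rest = rest.replace("\\", "/")
    else:
        sep = "\\"
        rest = rest.replace("/", "\\")
    return new_prefix.rstrip(seps) + sep + rest.lstrip(seps)


# Note: a Python string literal cannot end in a single backslash,
# so the Windows drive root is written as "Z:\\" here.

# Windows path -> path the Linux agents will use (before saving)
print(rename_prefix("Z:\\datasets\\v1\\a.zip", "Z:\\", "/mnt/data"))
# /mnt/data/datasets/v1/a.zip

# Path stored on the server -> local Windows path (before loading)
print(rename_prefix("/mnt/data/datasets/v1/a.zip", "/mnt/data", "Z:\\"))
# Z:\datasets\v1\a.zip
```

Matching only on a path boundary and leaving non-matching URIs untouched means the same call is harmless for artifacts stored elsewhere.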
The extension would in both cases be something like a kwarg which (if passed) can rename the artifact paths before saving in `finalize()` and before loading in `get()`. You would call it, in our case, with:
Related: #747