sigstore / model-transparency

Supply chain security for ML
Apache License 2.0

Support cloud filesystems #148

Open mihaimaruseac opened 8 months ago

mihaimaruseac commented 8 months ago

We need to support cloud filesystems to sign models stored in cloud buckets without downloading them locally. This is especially useful if we want to sign large models.

laurentsimon commented 8 months ago

I'd like it if we could have an API that takes a file system interface as input. If we only need to support simple operations like read, write, and list dir, do you think that would be viable?
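To make the idea concrete, here is a minimal sketch of such an interface in Python. The names `FileSystem`, `InMemoryFileSystem`, and `digest_model` are hypothetical and not part of this repository; the in-memory class only stands in for a real local or cloud backend, and `list_dir` is assumed to return every file under a prefix.

```python
import hashlib
from typing import Dict, Iterable, Protocol


class FileSystem(Protocol):
    """Hypothetical minimal interface: only the operations named above."""

    def read(self, path: str) -> bytes: ...
    def write(self, path: str, data: bytes) -> None: ...
    def list_dir(self, path: str) -> Iterable[str]: ...


class InMemoryFileSystem:
    """Toy backend standing in for a local or cloud implementation."""

    def __init__(self) -> None:
        self._files: Dict[str, bytes] = {}

    def read(self, path: str) -> bytes:
        return self._files[path]

    def write(self, path: str, data: bytes) -> None:
        self._files[path] = data

    def list_dir(self, path: str) -> Iterable[str]:
        # Assumption: returns all files under the prefix, in sorted order.
        prefix = path.rstrip("/") + "/"
        return sorted(p for p in self._files if p.startswith(prefix))


def digest_model(fs: FileSystem, root: str) -> str:
    """Hash every file under root using only the interface methods."""
    h = hashlib.sha256()
    for name in fs.list_dir(root):
        h.update(name.encode("utf-8"))
        h.update(fs.read(name))
    return h.hexdigest()
```

Because `digest_model` depends only on the three methods, a cloud-backed class with the same methods could be passed in without changing the hashing code.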

mihaimaruseac commented 8 months ago

I was thinking of doing something similar to TF's GFile API, so that our API would just have code like

with gfile.open(filepath) as f:
  do_something_with(f)

If the filepath is a local file, the local filesystem would be used. Otherwise, the URI scheme prefix would redirect to the corresponding cloud implementation.

But we don't want a dependency on TF, so I'm looking for another library we can use instead.
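The scheme-based dispatch described in this comment can be sketched with the standard library alone. This is only an illustration: `register_scheme`, `open_file`, and the registry are hypothetical names, and no real cloud backend is wired in, so a `gs://` path fails until someone registers an opener for it.

```python
from typing import BinaryIO, Callable, Dict

# Hypothetical registry mapping a URI scheme to a function that opens a path.
Opener = Callable[[str, str], BinaryIO]
_OPENERS: Dict[str, Opener] = {}


def register_scheme(scheme: str, opener: Opener) -> None:
    """Associate a URI scheme (e.g. "gs") with an opener function."""
    _OPENERS[scheme] = opener


def _open_local(path: str, mode: str) -> BinaryIO:
    # Strip an explicit "file://" prefix; bare paths pass through unchanged.
    if path.startswith("file://"):
        path = path[len("file://"):]
    return open(path, mode)


def open_file(path: str, mode: str = "rb") -> BinaryIO:
    """Open a path, choosing the backend from its URI scheme prefix."""
    # Paths without "://" are treated as local files.
    scheme = path.split("://", 1)[0] if "://" in path else "file"
    try:
        opener = _OPENERS[scheme]
    except KeyError:
        raise ValueError(f"no filesystem registered for scheme {scheme!r}") from None
    return opener(path, mode)


register_scheme("file", _open_local)
```

Calling code would then look like `with open_file(filepath) as f: ...` regardless of where the file lives, which is the property the GFile-style API is meant to provide.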

mihaimaruseac commented 6 months ago

I think we should make this a low priority, given that there doesn't seem to be an OSS dependency that avoids bringing in TF at this point.

mihaimaruseac commented 2 months ago

We can use etils[epath] for this. It implements the API without also needing to depend on TF.
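As a rough sketch of how this might look in practice: `epath.Path` is designed to mirror the `pathlib.Path` interface, so hashing code written against `path.open` should work with either. The example below uses `pathlib.Path` so that it runs without extra dependencies; swapping in `epath.Path` for a bucket URI is an assumption here and has not been tested against this repository. The function name `sha256_streaming` is hypothetical.

```python
import hashlib
import pathlib

# Assumption: with etils installed, `from etils import epath` and
# `epath.Path("gs://some-bucket/model.bin")` could be passed in place of
# the pathlib.Path used in the usage note below.


def sha256_streaming(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file chunk by chunk through a pathlib-style `open`.

    Reading in fixed-size chunks keeps memory use bounded, which matters
    for large model files.
    """
    h = hashlib.sha256()
    with path.open("rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

For a local file, usage would be `sha256_streaming(pathlib.Path("model.bin"))`; only the path object would need to change for a cloud-hosted file.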