Open zeryx opened 8 years ago
I like the idea. Checking that .parent()
is a directory is sufficient and straight-forward, but there are a few caveats:`
data://.my/bar.dat
even though the parent exists as a dirdata://.algo
is weird. Do we just whitelist .algo/temp/*
, .algo/perm/*
, .algo/*/*/perm/*
, and .algo/*/*/temp/*
Alternatively, we could try and create a zero-byte file before uploading. If the zero-byte file succeeds, then upload the larger file. I like the simplicity, but there is a question of correctness to this approach: What if creating the 0-byte file succeeds, but larger copy fails, now you have an empty file you didn't expect. We could try to delete it, but that could also fail. Also, what if some connector API doesn't allow 0-byte files?
You could also extend HEAD
with an additional response header specifying the ACL on the file path (irrespective of if the file exists) that way you can validate both path sanity and write permissions
That would address all the path and write permissions that WE manage. It won't help the case of IAM role restricting write access for s3://
data, but that may be a reasonable trade-off.
I also would really love HEAD to return ACLs (specifically: is a given directory listable?) for this AlgoFS issue: https://github.com/anowell/algorithmia-fuse/issues/1
FWIW I think this is an issue with all of our clients - not sure if it's something that should go into spec?
Originally, I thought this really only applied to the CLI as a convenience for algo cp
under the premise that users of the language clients could easily check before writing to a file. But given the complexity of checking, I'm increasingly willing to agree that clients could use some help making this determination before uploading large files. Some ideas:
file.isWriteable()
method. (client users could explicitly make this check if they want it)put
and putFile
auto-magically check and fail early (1 extra HEAD request for every write)put
and putFile
check and fail early, but only for large filesOpinions? Any of these would merit a change to the client spec. (I lean toward isWriteable
because I tend to prefer explicit over magic that has overhead, and because I think we'll want to expose ACL data anyway over time)
Both 1 and 3 :+1: (and a file.getAcl()
/ file.isReadable()
) and for the connector hidden ACL issue you 'could' enable users to add that info to Algorithmia's DB as informational so those calls also have full ACL information ... but that's more a longer term thing to prioritize
When uploading a file to a collection using
algo cp
, it doesn't check to make sure the path exists first, it would be incredibly nice if it did.This is a problem when uploading large files; as in that kind of situation you'd normally prefer the upload attempt to fail early if there are any simple bugs like path name invalidity.