WLCG-AuthZ-WG / common-jwt-profile

A repo for the WLCG Common JWT profile document
3 stars 8 forks source link

Poorly specified renaming/moving authorisation #33

Open paulmillar opened 11 months ago

paulmillar commented 11 months ago

On page 12, the storage.create description says:

storage.create​: Upload data. This includes renaming files if the destination file does not already exist. [...]

Three questions:

  1. The description talk about "renaming files". Does this also apply to renaming directories?
  2. Is "renaming" limited to operations in which the resource stays in the same directory, or can it also involve moving a file to a new directory?
  3. A move/rename operation are typically describe by two paths (a source and a destination). How do these relate to the resource path ($PATH) in storage scopes? For example, does the token need to grant storage.create authorisation to both the source and destination paths?

The document should be updated to make the answers to these three questions clear.

maarten-litmaath commented 6 months ago

My take on the answers at this time:

  1. The idea was that only files can be renamed with storage.create, while anything can be renamed with storage.modify. The rationale is that a file may be uploaded with a temporary name, to be subsequently renamed using the same token, but only if the upload was successful, and to be removed otherwise. The reason for that is to commit to the proper name in the SE namespace only when the corresponding file supposedly is in good shape on that SE. By contrast, a directory can be created, or fail to be created, in a single operation. One could also envisage a use case for temporary directories, though, which would only get renamed once their contents are deemed to be in good shape. I doubt the importance of that use case in practice, though. There is a case for limiting the rename functionality to files only: reducing potential abuse.
  2. The idea was for a temporary file name typically to be the proper file name with an extension (e.g. like ".$TIMESTAMP") signaling the temporary nature of the name. The temporary file would thus be in the same directory as the renamed file. In principle we could permit files to be renamed across directories, as it would allow uploads to be done into a directory created just for that purpose, but also here we have a case for limiting the functionality: reducing potential abuse.
  3. A rename operation has to work within the file tree permitted by the corresponding storage.create capability. For example, storage.create:/foo/bar permits a file to be uploaded into /foo/bar/name.txt.1704651392 and subsequently renamed to /foo/bar/name.txt, but not to /foo/qux/name.txt, even when the same token is also equipped with storage.create:/foo/qux as a capability. For the latter use case, a token would need storage.modify:/foo/bar as well as storage.modify:/foo/qux to be provided, possibly implicitly e.g. through a single storage.modify:/foo capability.

What do people think?

paulmillar commented 6 months ago

For comparison, the xroot protocol supports a kXR_POSC flag: "persist on successful close". This does something rather similar to what you describe: it updates the namespace only if the upload was successful. Therefore, I think I think you are describing a reasonable use-case, as xrootd developers has already baked support for this into their protocol.

More generally, rename is a somewhat awkward operation, as it has two paths: a source path and a destination path, so one way of defining rename might be to define the authorisation necessary on the source path, and again what is needed on the destination path.

Something that might be worth considering is if the client has storage.read on the source path and storage.create on the destination path then that client can simply download the data and upload it. This would have much the same effect as renaming the file (the data is available under a new path).

The idea for supporting renaming of files with storage.create seems reasonable (to support the above use-case), but I haven't given it too much thought. Limiting source and destination to files in the same directory seems a bit arbitrary, but might be OK.

norealroots commented 6 months ago

Something that might be worth considering is if the client has storage.read on the source path and storage.create on the destination path then that client can simply download the data and upload it. This would have much the same effect as renaming the file (the data is available under a new path).

I would note that to properly mirror this, you are missing the storage.delete which would be involved in a full rename, as opposed to a copy where you make no changed to the original data.

enricovianello commented 6 months ago

Agreeing with @norealroots I try to summarize a bit:

operation source path needed scopes destination path needed scopes
COPY storage.read storage.create
MOVE/RENAME storage.modify storage.create

If the aim of this discussion is to better explain what storage.create scope allows to do, surely "renaming files" is not enough clear. Renaming/moving/copying are more complex actions that probably can be better explained with examples. I'd propose to change: This includes renaming files if the destination file does not already exist. This capability includes the creation of directories and subdirectories at the specified path to: This capability includes the creation of files, directories and subdirectories at the specified path and then add a "renaming/moving resources" section dedicated to that kind of operations.

What do you think?

maarten-litmaath commented 6 months ago

I think it can be defended that for a rename operation not covered by storage.create, which exists to support data upload, storage.modify is needed also for the destination, because the file (or directory) is not uploaded.

That said, we will need to spell out the results from the discussion in this issue, probably in a small subsection as suggested, to which the items on storage.create and storage.modify would then refer.