Closed tmcqueen-materials closed 1 month ago
This method was obviously created for Box. Is there any system I should be targeting for my validation method? The first thing that comes to mind is Girder. If not, should I treat paths as if they are on my local system?
For the function highlighted there, if the boolean "allow_box" is set to false, then it does exactly what you'd hope would be done here for name checking: (1) normalize the path; (2) get the absolute path; (3) ensure that the path is the same after renormalization; (4) ensure that the input file/directory exists (allow_nonexistent_leaf is False, as it should be for a file-based input to project chameleon functions); and (5) ensure that the output file/directory parent exists (allow_nonexistent_leaf is True, as it should be for a file-based output to project chameleon functions).
So basically, that function, lines 35-55, can be used wholesale here for project chameleon path validation. It should only be applied to an input if that input is a user-specified input file, and only applied to an output if that output is a user-specified output file.
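For reference, a minimal sketch of the five steps described above, assuming local-filesystem paths and no Box support. The name `validate_path` and the exact error handling are my own placeholders, not the code from `api.py`; only the `allow_nonexistent_leaf` flag comes from the discussion above.

```python
import os


def validate_path(path: str, allow_nonexistent_leaf: bool = False) -> str:
    """Return the normalized absolute path, or raise ValueError.

    Hypothetical helper: it mirrors the steps described in this thread,
    not the actual implementation in file_copy_service.
    """
    if not isinstance(path, str) or not path or "\x00" in path:
        raise ValueError("path must be a non-empty string without NUL bytes")

    # (1) normalize the path, (2) get the absolute path
    absolute = os.path.abspath(os.path.normpath(path))

    # (3) the path must be unchanged by another normalization pass.
    # abspath already normalizes, so this is a sanity guard, not a filter.
    if os.path.abspath(os.path.normpath(absolute)) != absolute:
        raise ValueError("path is not stable under normalization")

    # (4) inputs: the file or directory itself must exist
    if os.path.exists(absolute):
        return absolute

    # (5) outputs: the leaf may be missing, but its parent must exist
    if allow_nonexistent_leaf and os.path.isdir(os.path.dirname(absolute)):
        return absolute

    raise ValueError("path does not exist: " + absolute)
```

With this shape, a user-specified input would be checked with `validate_path(p)` and a user-specified output with `validate_path(p, allow_nonexistent_leaf=True)`.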
User-specified input files are validated before being processed. User-specified output files are validated before being returned.
The project chameleon API is planned to be ~world accessible. While it won't take action without passing the authentication check (which right now requires a login), it is still important to have defense in depth measures in place to avoid trivial attacks from an endpoint that happens to have access.
All input and output filenames received from the caller should go through a validation process similar to:
https://github.com/paradimdata/file_copy_service/blob/4634c61accc3d349bb2387b91bafecd236275c1d/api.py#L27-L57
The validated filenames should then also be included (as "read_path" and "write_path" params) in the authorization call. Each of "read_path" and "write_path" can be an empty string (if the source is bytes provided by the caller, or the destination is returned directly to the caller, respectively), a directory string (for calls that work on directories of entries), or an array of strings if multiple files/directories need to be referenced.
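A rough sketch of how the three accepted shapes for these params could be checked before the authorization call. The function name `build_auth_params` and the returned dict layout are assumptions for illustration; the real authorization call's signature is not shown in this issue.

```python
from typing import Dict, List, Union

PathParam = Union[str, List[str]]


def build_auth_params(read_path: PathParam = "",
                      write_path: PathParam = "") -> Dict[str, PathParam]:
    """Hypothetical helper: check the shape of the two path params.

    Each param may be an empty string (bytes in / direct return out),
    a single file or directory string, or a list of such strings.
    """
    def check(name: str, value: PathParam) -> PathParam:
        if isinstance(value, str):
            return value
        if isinstance(value, list) and all(isinstance(v, str) for v in value):
            return list(value)  # copy so later caller mutation has no effect
        raise TypeError(name + " must be a string or a list of strings")

    return {
        "read_path": check("read_path", read_path),
        "write_path": check("write_path", write_path),
    }
```

Any non-empty path passed in here would be expected to have gone through the filename validation first, so that the authorization check sees the same path the function will actually touch.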