Whole Tale Globus Transfer Handler plugin

Globus Transfer Handler Plugin for Girder/WholeTale

The Globus provider uses the infrastructure provided by Globus to transfer files. Globus is a managed transfer service that initiates and supervises transfers between two GridFTP servers. This implies that the WholeTale deployment must run some form of GridFTP server. The Globus provider automatically manages a Globus Connect Personal Server for this purpose.

Unlike the HTTP provider, where the code directly controls what happens to the bits of data that come from the server, the Globus provider requires more work to ensure proper data isolation and authorization. Specifically, users should not be able to see or modify other users' data through the GridFTP server used on the WholeTale deployment.

A rough overview of the way the implementation works is as follows:

The implementation uses one thread per transfer. This is not ideal since most of the time is spent polling Globus for the transfer status while holding resources associated with the thread. A more efficient (but possibly more complex and error-prone) implementation would use a single thread to handle all transfers.
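The single-thread alternative mentioned above could look roughly like the following. This is a minimal sketch, not the plugin's code: `TransferPoller`, `get_status`, and the `on_done` callback are hypothetical names, and the set of terminal states is an assumption about how Globus reports finished tasks.

```python
import time
from typing import Callable, Dict

# Assumed terminal task states; check the Globus documentation for the real set.
DONE_STATES = {"SUCCEEDED", "FAILED"}


class TransferPoller:
    """Polls many transfers from one loop instead of one thread per transfer."""

    def __init__(self, get_status: Callable[[str], str],
                 interval: float = 5.0,
                 sleep: Callable[[float], None] = time.sleep):
        # get_status is a hypothetical wrapper around the Globus status query.
        self._get_status = get_status
        self._interval = interval
        self._sleep = sleep
        self._pending: Dict[str, Callable[[str, str], None]] = {}

    def add(self, task_id: str, on_done: Callable[[str, str], None]) -> None:
        """Register a transfer and the callback to run when it finishes."""
        self._pending[task_id] = on_done

    def poll_once(self) -> None:
        """Check every pending transfer once and fire callbacks for finished ones."""
        for task_id in list(self._pending):
            status = self._get_status(task_id)
            if status in DONE_STATES:
                on_done = self._pending.pop(task_id)
                on_done(task_id, status)

    def run_until_idle(self) -> None:
        """Keep polling until no transfers remain, sleeping between rounds."""
        while self._pending:
            self.poll_once()
            if self._pending:
                self._sleep(self._interval)
```

The trade-off is the one noted above: only one thread is held while waiting, but a slow status query or a slow callback delays every other transfer, so error handling around `get_status` and `on_done` would need more care than in the thread-per-transfer design.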

Refresh tokens are used wherever possible to deal with the limited validity of normal access tokens. Because token lifetimes are relatively long, testing (automated or manual) is unlikely to immediately catch problems arising from improper handling of token validity, so confidence in the correctness of this aspect is lower than for other parts.
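One way to make expiry handling testable without waiting for real tokens to expire is to inject the clock. The sketch below illustrates that idea only; `RefreshingToken`, its `refresh` callable, and the `margin` parameter are hypothetical and do not describe how the plugin or the Globus SDK manages tokens.

```python
import time
from typing import Callable, Optional, Tuple


class RefreshingToken:
    """Caches an access token and renews it shortly before it expires."""

    def __init__(self,
                 refresh: Callable[[], Tuple[str, float]],
                 margin: float = 60.0,
                 clock: Callable[[], float] = time.time):
        # refresh is a hypothetical callable that exchanges the refresh token
        # for (access_token, expires_at_epoch_seconds).
        self._refresh = refresh
        self._margin = margin      # renew this many seconds before expiry
        self._clock = clock        # injectable so tests can fake the time
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a token that is valid for at least `margin` more seconds."""
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, self._expires_at = self._refresh()
        return self._token
```

With a fake `clock`, a test can jump past the expiry time in one step and assert that `refresh` was called, which covers the case that wall-clock testing tends to miss.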

The correctness of the solution rests on the assumption that data on a shared endpoint is accessible only to the user granted RW rights by the ACL rule. The Globus documentation appears vague on this point: it clarifies what the explicit rule does, but not what the implicit rules are. Testing suggests that the assumption holds: only the user with the explicitly granted rights can see and write to the shared endpoint. Further checks would be necessary to ensure that the main endpoint cannot be used to circumvent the rules of the shared endpoints.
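For illustration, an explicit read/write rule for a single user might be built as below. This is a sketch under assumptions: `make_rw_acl_rule` is a hypothetical helper, the field names follow the access-rule document format of the Globus Transfer API as far as known here, and whether the plugin constructs its rules this way is not confirmed by this README.

```python
def make_rw_acl_rule(identity_id: str, path: str = "/") -> dict:
    """Build an access-rule document granting one identity RW on a directory."""
    # Assumption: rule paths are absolute directory paths that begin and end
    # with a slash; verify against the Globus Transfer API documentation.
    if not (path.startswith("/") and path.endswith("/")):
        raise ValueError("path must begin and end with '/'")
    return {
        "DATA_TYPE": "access",
        "principal_type": "identity",   # the rule applies to a single user
        "principal": identity_id,       # that user's Globus identity ID
        "path": path,
        "permissions": "rw",
    }


# With the Globus Python SDK, such a document would typically be passed to
# TransferClient.add_endpoint_acl_rule(shared_endpoint_id, rule); that call is
# not executed here because it needs credentials and a real shared endpoint.
```

The rule only states what is granted explicitly. It says nothing about implicit access, which is why the checks described above are still needed.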

Configuration

globus_handler has the following configuration options, exposed through the standard Girder plugin configuration interface:

dm.globus_root_path

A directory that the Globus provider can use as a temporary drop location for files. Once Globus Online finishes transferring a file to <dm.globus_root_path>, the file is moved to its final destination (<dm.private_storage_path>) using the operating system's default move operation. It is therefore recommended that <dm.globus_root_path> be located on the same filesystem as <dm.private_storage_path>, so that the move does not degrade into a copy followed by a delete.
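A deployment could verify the same-filesystem recommendation with a check along these lines. This is a minimal sketch: `same_filesystem` and `move_to_final` are hypothetical helpers, and using `shutil.move` for the move is an assumption, not a statement about the plugin's code.

```python
import os
import shutil


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if both existing paths live on the same device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def move_to_final(src: str, dst_dir: str) -> str:
    """Move a finished transfer from the drop location to its destination."""
    os.makedirs(dst_dir, exist_ok=True)
    # shutil.move renames when source and destination share a filesystem and
    # falls back to copy-then-delete otherwise, which is slower for large files.
    return shutil.move(src, os.path.join(dst_dir, os.path.basename(src)))
```

Running `same_filesystem` on the two configured directories at startup would surface a misconfiguration early instead of through slow moves later.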

dm.globus_gc_dir

A directory containing an unpacked Globus Connect Personal Server (i.e., it must contain the globusconnectpersonal executable).

oauth.globus_client_id

oauth.globus_client_secret

The Globus provider requires that users be logged in to WholeTale using OAuth through Globus. It therefore reuses some of the settings that the OAuth plugin already requires. Specifically, the Globus client ID and client secret from OAuth are used.