kbase / dts

A data transfer service
https://kbase.github.io/dts/
MIT License

Create an end-to-end demonstration of a file transfer #31

Closed by jeff-cohere 6 months ago

jeff-cohere commented 9 months ago

This issue tracks all activity related to creating a demo for our DTS prototype that isn't captured elsewhere. It also documents the internal stages of the demo process in a way that encourages discussion and relates to technical issues.

File Transfer Procedure

  1. An authenticated user (logged into KBase with an Orcid ID, for now) selects a set of files to transfer from a DTS search of the JGI Data Portal. These files are identified by their unique (JAMO) IDs.
  2. The user makes a transfer request to DTS, including all relevant information:
    • source and destination databases (JGI Data Portal and KBase, respectively)
    • a list of file IDs identifying files to be transferred
    • the Orcid ID of the user
  3. The DTS forwards the transfer request to its task manager and begins processing it (described below).
  4. The DTS forwards all transfer status requests to the task manager and relays each response back to the requestor. A transfer ultimately ends in either success or failure (not counting "suspension", a state Globus offers so that a transfer can be paused).
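
The request in step 2 could be modeled roughly as follows. This is only a sketch: the type name, field names, JSON keys, and the database identifiers "jdp" and "kbase" are assumptions made for illustration, not the actual DTS API.

```go
package main

import (
	"encoding/json"
	"errors"
)

// TransferRequest is a hypothetical shape for the transfer request described
// in step 2. Field names and JSON keys are illustrative only.
type TransferRequest struct {
	Source      string   `json:"source"`      // source database (e.g. the JGI Data Portal)
	Destination string   `json:"destination"` // destination database (e.g. KBase)
	FileIDs     []string `json:"file_ids"`    // IDs of the files to transfer
	Orcid       string   `json:"orcid"`       // Orcid ID of the requesting user
}

// Validate rejects a request that lacks any of the required information.
func (r TransferRequest) Validate() error {
	if r.Source == "" || r.Destination == "" {
		return errors.New("source and destination databases are required")
	}
	if len(r.FileIDs) == 0 {
		return errors.New("at least one file ID is required")
	}
	if r.Orcid == "" {
		return errors.New("the user's Orcid ID is required")
	}
	return nil
}

// ParseTransferRequest decodes a JSON request body and validates it.
func ParseTransferRequest(body []byte) (TransferRequest, error) {
	var req TransferRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return req, err
	}
	return req, req.Validate()
}
```

Validating up front means a request with missing information can be refused before it ever reaches the task manager.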

DTS Task Manager Flow

The DTS task manager is a program that runs in a single execution thread and handles requests taken from thread-safe queues (implemented with Go channels). The task manager tracks and updates all file transfers, which are stored as entries in a single table (the DTS's "source of truth"). Each file transfer is represented as a "transfer task" that holds all related transfer and status information.
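
A minimal sketch of this pattern, assuming one goroutine that owns the task table and unbuffered channels as the request queues. All names here (`taskManager`, `statusRequest`, the status strings) are hypothetical and are not taken from the DTS source.

```go
package main

// statusRequest asks for the status of one task; the answer arrives on reply.
type statusRequest struct {
	id    string
	reply chan string
}

// taskManager exposes channels only; the task table itself is private to run().
type taskManager struct {
	create chan string        // IDs of tasks to add
	status chan statusRequest // status queries
	stop   chan struct{}      // closed to shut the manager down
}

func newTaskManager() *taskManager {
	tm := &taskManager{
		create: make(chan string),
		status: make(chan statusRequest),
		stop:   make(chan struct{}),
	}
	go tm.run()
	return tm
}

// run is the single thread of execution that reads and writes the table,
// so the table needs no locking.
func (tm *taskManager) run() {
	tasks := make(map[string]string) // task ID -> status
	for {
		select {
		case id := <-tm.create:
			tasks[id] = "staging"
		case req := <-tm.status:
			s, ok := tasks[req.id]
			if !ok {
				s = "unknown"
			}
			req.reply <- s
		case <-tm.stop:
			return
		}
	}
}

// Create adds a task; it returns once the manager has accepted the request.
func (tm *taskManager) Create(id string) { tm.create <- id }

// Status returns the status of a task, or "unknown" if there is no such task.
func (tm *taskManager) Status(id string) string {
	reply := make(chan string)
	tm.status <- statusRequest{id: id, reply: reply}
	return <-reply
}

// Stop shuts the manager down.
func (tm *taskManager) Stop() { close(tm.stop) }
```

Because only `run` touches the map, callers on any goroutine can use `Create` and `Status` safely; the channels serialize every access.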

Below, we describe the behavior of the task manager as it responds to various requests from the DTS service itself.

Creating a new transfer task

  1. The DTS requests that a new file transfer task be added to the list managed by the task manager.
  2. The task manager queries the source database for information on the requested file IDs.
  3. The source database responds with a list of Frictionless DataResources corresponding to the given file IDs.
  4. The task manager then requests that the source database begin staging the files with the requested IDs on the filesystem attached to its transfer endpoint, and obtains a UUID for tracking the status of the staging operation.
  5. The task manager then records the staging status in the task, creates a UUID for the file transfer itself, and stores the task in its table of managed tasks using this UUID.
  6. The task manager returns the file transfer UUID to the DTS service, which passes it back to the user. The user can then use this UUID to request status information.
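
The six steps above can be sketched as a single function. This is a sketch under stated assumptions: the `Database` interface, the `Task` and `DataResource` types, the "staging" status string, and the `fakeDB` stand-in are all hypothetical, and `DataResource` is pared down to two fields rather than the full Frictionless structure. The UUID is a version 4 UUID built by hand from the standard library's `crypto/rand`.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// DataResource is a pared-down stand-in for a Frictionless DataResource.
type DataResource struct {
	ID   string
	Path string
}

// Database is a hypothetical interface to a source database.
type Database interface {
	Resources(fileIDs []string) ([]DataResource, error)
	StageFiles(fileIDs []string) (stagingID string, err error)
}

// Task holds the information recorded for one file transfer.
type Task struct {
	ID        string
	Resources []DataResource
	StagingID string
	Status    string
}

// newUUID generates a random (version 4) UUID.
func newUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// CreateTask follows steps 2 through 6 and returns the new transfer UUID.
func CreateTask(db Database, tasks map[string]*Task, fileIDs []string) (string, error) {
	resources, err := db.Resources(fileIDs) // steps 2 and 3
	if err != nil {
		return "", err
	}
	stagingID, err := db.StageFiles(fileIDs) // step 4
	if err != nil {
		return "", err
	}
	id, err := newUUID() // step 5
	if err != nil {
		return "", err
	}
	tasks[id] = &Task{ID: id, Resources: resources, StagingID: stagingID, Status: "staging"}
	return id, nil // step 6
}

// fakeDB is an in-memory stand-in used only to exercise CreateTask.
type fakeDB struct{}

func (fakeDB) Resources(fileIDs []string) ([]DataResource, error) {
	out := make([]DataResource, len(fileIDs))
	for i, id := range fileIDs {
		out[i] = DataResource{ID: id, Path: "files/" + id}
	}
	return out, nil
}

func (fakeDB) StageFiles(fileIDs []string) (string, error) { return newUUID() }
```

Calling `CreateTask(fakeDB{}, tasks, ids)` stores one entry in `tasks`, keyed by the returned UUID, with its status set to "staging".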

Handling transfer task status requests

  1. The DTS forwards a status request for a task with a given UUID to the task manager.
  2. The task manager uses the UUID to look up the corresponding task. If the task is not found, the task manager logs an error and notifies the DTS that no such task exists. The DTS reports this to the user.
  3. The task manager checks to see whether the files have yet been staged. If not, it notifies the DTS that the transfer is still staging.
  4. If the files have been staged, the task manager checks to see whether the file transfer has completed. If not, it notifies the DTS that the file transfer is in progress (or in some other state indicated by the underlying endpoint logic [e.g. Globus]).
  5. If the files have been successfully transferred but no manifest has been generated and transferred, the task manager notifies the DTS that the file transfer is being "finalized", meaning that the DTS is generating and transferring the manifest.
  6. If the manifest has already been successfully transferred, the task manager notifies the DTS that the file transfer operation has completed. Information about the file transfer is retained by the DTS for a configurable time period.
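
The checks in steps 2 through 6 amount to an ordered cascade, sketched below. The `Task` fields, the error value, and the status strings are assumptions made for illustration; failure and the other endpoint-specific states mentioned in step 4 are left out to keep the sketch short.

```go
package main

import "errors"

// Task records how far a transfer has progressed (hypothetical fields).
type Task struct {
	Staged       bool // files are staged at the source endpoint
	TransferDone bool // files have arrived at the destination
	ManifestDone bool // manifest has arrived at the destination
}

// ErrNoSuchTask is returned when no task has the requested UUID.
var ErrNoSuchTask = errors.New("no such transfer task")

// StatusOf reports the status of the task with the given UUID, checking the
// stages in the order in which a transfer passes through them.
func StatusOf(tasks map[string]*Task, id string) (string, error) {
	task, found := tasks[id]
	if !found {
		return "", ErrNoSuchTask
	}
	switch {
	case !task.Staged:
		return "staging", nil
	case !task.TransferDone:
		return "active", nil
	case !task.ManifestDone:
		return "finalizing", nil
	default:
		return "succeeded", nil
	}
}
```

Ordering the cases this way means each status is reported only after every earlier stage has finished.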

Monitoring and advancing task progress

The DTS pings the task manager once per polling interval (configurable, 1 minute by default). Each ping tells the task manager to check and update the state of every task in its table.
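
One way to produce such a ping in Go is a `time.Ticker` feeding a channel, as in the sketch below. The function name `startPinger` and the channel-based design are assumptions, not the DTS implementation.

```go
package main

import "time"

// startPinger sends an empty value on the returned channel once per interval
// until stop is closed. A ping is held until the receiver takes it, so a busy
// receiver delays later pings rather than losing the pending one.
func startPinger(interval time.Duration, stop <-chan struct{}) <-chan struct{} {
	ping := make(chan struct{})
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				select {
				case ping <- struct{}{}:
				case <-stop:
					return
				}
			case <-stop:
				return
			}
		}
	}()
	return ping
}
```

With the default described above, the caller would pass `time.Minute` as the interval and update its tasks each time a value arrives on the returned channel.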

For each unfinished transfer task:

  1. If the task's status indicates that its files are still being staged, the task manager contacts the source endpoint to query whether the requested files have finished staging. If so, the task manager updates its state information and requests that the source endpoint initiate a transfer to the destination endpoint. For each file in the transfer, its path on the source endpoint is prepended with a prefix that identifies the user on the destination endpoint and the UUID of the file transfer (e.g. johnson/62a4ead3-86ca-459d-8a0c-53123a4e46a6). The source endpoint returns a UUID that the task manager uses to check the status of the file transfer.
  2. If the task's status indicates that its files are being transferred, the task manager queries the source endpoint for the status of the transfer. If the transfer has completed, the task manager generates a manifest file for the transfer locally and initiates the transfer of this manifest to the destination endpoint from its own local endpoint (all endpoints are currently implemented using Globus). This manifest transfer has an associated UUID that the task manager uses to check its status.
  3. If the task's status indicates that the manifest is being generated and transferred, the task manager queries its local endpoint for the status of the manifest transfer. If the transfer has succeeded, the task manager marks the transfer task as completed.
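
The three cases above form a small state machine, sketched here. The `Endpoint` interface, the `Task` fields, the status strings, the manifest filename `manifest.json`, and the `fakeEndpoint` stand-in are all hypothetical. The sketch also assumes that the user/UUID prefix is applied to each source path to form the destination path, and it omits manifest generation and failure handling.

```go
package main

import (
	"fmt"
	"path"
)

// Endpoint is a hypothetical view of a transfer endpoint (e.g. one backed by Globus).
type Endpoint interface {
	FilesStaged(stagingID string) (bool, error)
	StartTransfer(paths map[string]string) (transferID string, err error) // source path -> destination path
	TransferSucceeded(transferID string) (bool, error)
}

// Task carries the state the task manager advances on each ping.
type Task struct {
	ID, User                          string
	Paths                             []string
	Status                            string // "staging", "transferring", "finalizing", or "succeeded"
	StagingID, TransferID, ManifestID string
}

// destinationPath prepends the user name and transfer UUID to a source path.
func destinationPath(user, taskID, sourcePath string) string {
	return path.Join(user, taskID, sourcePath)
}

// Advance moves a task forward by at most one stage.
func Advance(task *Task, source, local Endpoint) error {
	switch task.Status {
	case "staging":
		staged, err := source.FilesStaged(task.StagingID)
		if err != nil || !staged {
			return err
		}
		paths := make(map[string]string, len(task.Paths))
		for _, p := range task.Paths {
			paths[p] = destinationPath(task.User, task.ID, p)
		}
		if task.TransferID, err = source.StartTransfer(paths); err != nil {
			return err
		}
		task.Status = "transferring"
	case "transferring":
		done, err := source.TransferSucceeded(task.TransferID)
		if err != nil || !done {
			return err
		}
		// Manifest generation is omitted; only its transfer is started here.
		manifest := map[string]string{"manifest.json": destinationPath(task.User, task.ID, "manifest.json")}
		if task.ManifestID, err = local.StartTransfer(manifest); err != nil {
			return err
		}
		task.Status = "finalizing"
	case "finalizing":
		done, err := local.TransferSucceeded(task.ManifestID)
		if err != nil || !done {
			return err
		}
		task.Status = "succeeded"
	}
	return nil
}

// fakeEndpoint is a stand-in on which every operation succeeds immediately.
type fakeEndpoint struct{ started int }

func (f *fakeEndpoint) FilesStaged(string) (bool, error) { return true, nil }
func (f *fakeEndpoint) StartTransfer(map[string]string) (string, error) {
	f.started++
	return fmt.Sprintf("transfer-%d", f.started), nil
}
func (f *fakeEndpoint) TransferSucceeded(string) (bool, error) { return true, nil }
```

Each call to `Advance` performs at most one transition, so a task that is ready at every check still needs three pings to go from "staging" to "succeeded".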

Next Steps

jeff-cohere commented 7 months ago

Note: looks like the "root folder" for the NERSC DTN endpoint should be set to /global/dna/dm_archive/.