Background
In the Fluid project, data operations (DataProcess, DataLoad, DataMigrate) are core features that cover data processing, preheating, and migration. To make these operations easier to monitor and manage, they need a self-reporting progress feature that exposes each operation's work progress in its status.
Objectives
Design a general mechanism that presents progress uniformly across Fluid's data operation CRDs. The mechanism should resemble Argo's Self-Reporting Progress: the workload writes its progress updates to a file inside the container, and Fluid's controller copies them into the CRD status.
Implement a proof-of-concept solution using DataProcess.
Feature Requirements
Progress Reporting Mechanism:
Each data operation task should be capable of generating progress reports during execution.
The progress reports should follow an N/M format, where N is the completed amount of work and M is the total amount of work.
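The N/M format described above could be parsed and validated roughly as follows. This is a minimal sketch, not Fluid's implementation: the `Progress` type, the `ParseProgress` function, and the validation rules (non-negative N, positive M, N no larger than M) are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Progress holds one "N/M" report: Completed is N, Total is M.
// (Hypothetical type, introduced only for this sketch.)
type Progress struct {
	Completed int64
	Total     int64
}

// ParseProgress parses a report such as "42/100". Surrounding whitespace is
// ignored. The range checks below are assumptions of this sketch.
func ParseProgress(s string) (Progress, error) {
	n, m, ok := strings.Cut(strings.TrimSpace(s), "/")
	if !ok {
		return Progress{}, fmt.Errorf("progress %q is not in N/M format", s)
	}
	completed, err := strconv.ParseInt(n, 10, 64)
	if err != nil {
		return Progress{}, fmt.Errorf("invalid completed amount %q: %w", n, err)
	}
	total, err := strconv.ParseInt(m, 10, 64)
	if err != nil {
		return Progress{}, fmt.Errorf("invalid total amount %q: %w", m, err)
	}
	if completed < 0 || total <= 0 || completed > total {
		return Progress{}, fmt.Errorf("progress %d/%d is out of range", completed, total)
	}
	return Progress{Completed: completed, Total: total}, nil
}

// String renders the progress back into the N/M form.
func (p Progress) String() string {
	return fmt.Sprintf("%d/%d", p.Completed, p.Total)
}

func main() {
	p, err := ParseProgress("42/100\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p)
}
```

Rejecting malformed reports at parse time keeps invalid values out of the status that users eventually see.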
Environment Variable Configuration:
Define an environment variable FLUID_PROGRESS_FILE, which specifies the location of the progress report file.
Progress Report File:
The data operation task must periodically rewrite the file named by FLUID_PROGRESS_FILE during execution to report its current progress.
Executor Reading Mechanism:
The executor should periodically (e.g., every 3 seconds) check the FLUID_PROGRESS_FILE to get the latest progress information.
Progress Annotation:
When a task starts, its metadata should carry an initial progress annotation, such as fluid.io/data-progress: 0/100.
Progress Update:
When the content of the FLUID_PROGRESS_FILE changes, the executor should update the task's annotation to reflect the latest progress.
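The annotation update itself can be illustrated on a plain map, leaving out the Kubernetes client call that would persist it. In this sketch `SetProgressAnnotation` is a hypothetical helper. Only the annotation key `fluid.io/data-progress` comes from the requirements. Returning a "changed" flag is an assumption that lets a caller skip the API write when nothing is new.

```go
package main

import "fmt"

// ProgressAnnotation is the annotation key named in the requirements.
const ProgressAnnotation = "fluid.io/data-progress"

// SetProgressAnnotation stores progress under ProgressAnnotation and reports
// whether the stored value changed, so the caller can skip the API update
// when the report is unchanged. A nil map is replaced with a new one.
func SetProgressAnnotation(annotations map[string]string, progress string) (map[string]string, bool) {
	if annotations == nil {
		annotations = map[string]string{}
	}
	if annotations[ProgressAnnotation] == progress {
		return annotations, false
	}
	annotations[ProgressAnnotation] = progress
	return annotations, true
}

func main() {
	// Initial annotation set when the task starts.
	annotations, _ := SetProgressAnnotation(nil, "0/100")
	// Simulated sequence of reports read from the progress file.
	for _, report := range []string{"0/100", "40/100", "40/100", "100/100"} {
		var changed bool
		annotations, changed = SetProgressAnnotation(annotations, report)
		fmt.Printf("%s changed=%v\n", annotations[ProgressAnnotation], changed)
	}
}
```

In a real controller the map would be the object's annotations, and a `changed` result of true would be followed by an update or patch request to the API server.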
Progress Display:
The monitoring system should be able to read the task's annotations and display the real-time progress of each data operation task on the user interface.
Example Code
This example provides a basic framework.