epam / badgerdoc

Apache License 2.0

Add tab to select pipeline engine #847

Closed: khyurri closed this issue 3 months ago

khyurri commented 3 months ago

Currently, BadgerDoc supports only one pipeline manager. After the 1.8.0 release, users will need to be able to configure multiple pipeline managers in a single BadgerDoc installation. Post-release, we plan to support two pipeline managers: Airflow and Databricks. Users can enable a pipeline manager using the AIRFLOW_ENABLED and DATABRICKS_ENABLED environment variables. The front-end application can check which pipeline managers are enabled by calling the GET /jobs/pipelines/support endpoint, which returns all available pipeline managers with their respective endpoints.

Depending on the number of pipeline managers returned, the front-end application must display a tab or a radio button in the interface for selecting one. To get the list of pipelines for a specific engine, the front-end must call either GET /jobs/pipelines/airflow or GET /jobs/pipelines/databricks. These endpoints are returned by the GET /jobs/pipelines/support call.
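A minimal TypeScript sketch of the selection logic described above. The response shape (`name` and `endpoint` fields), the `PipelineEngine` and `EngineSelector` types, and both function names are assumptions made for illustration; the issue does not specify the schema of GET /jobs/pipelines/support. Hiding the selector when only one engine is enabled is likewise only one possible reading of "depending on the number".

```typescript
// Assumed shape of one entry returned by GET /jobs/pipelines/support.
// The real response schema is not specified in this issue.
interface PipelineEngine {
  name: string; // e.g. "airflow" or "databricks"
  endpoint: string; // e.g. "/jobs/pipelines/airflow"
}

// What the UI should render for engine selection (hypothetical type).
type EngineSelector =
  | { kind: "none" } // no engine enabled: nothing to select
  | { kind: "single"; engine: PipelineEngine } // one engine: no selector needed
  | { kind: "choice"; engines: PipelineEngine[] }; // several: show tabs or radio buttons

// Decide which selector to show from the number of enabled engines.
function chooseEngineSelector(engines: PipelineEngine[]): EngineSelector {
  if (engines.length === 0) {
    return { kind: "none" };
  }
  if (engines.length === 1) {
    return { kind: "single", engine: engines[0] };
  }
  return { kind: "choice", engines: engines };
}

// Look up the pipeline-list endpoint for the engine the user selected,
// using the endpoints returned by the support call rather than hardcoded URLs.
function pipelineListEndpoint(
  engines: PipelineEngine[],
  selectedName: string
): string | undefined {
  for (const engine of engines) {
    if (engine.name === selectedName) {
      return engine.endpoint;
    }
  }
  return undefined;
}
```

Taking the endpoint from the support response means the front-end would not need to change if another engine were added later.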

The list of pipelines for the selected engine should be displayed in a dropdown menu. The version dropdown should be removed from the code, as we won't be supporting pipeline versioning.

When a user starts a pipeline, the front-end must pass the selected engine as an argument when starting the job. This value is currently hardcoded in the EditJobConnector.
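As a rough illustration of replacing the hardcoded value, the sketch below builds the start-job request from the engine the user selected. The `StartJobRequest` type, its field names (`pipeline_name`, `engine`), and the `buildStartJobRequest` helper are hypothetical; the actual payload used by EditJobConnector is not shown in this issue.

```typescript
// Hypothetical request body for starting a job; field names are assumptions.
interface StartJobRequest {
  pipeline_name: string;
  engine: string;
}

// Build the request from the user's selection instead of a hardcoded engine.
// enabledEngines would come from the GET /jobs/pipelines/support response.
function buildStartJobRequest(
  pipelineName: string,
  selectedEngine: string,
  enabledEngines: string[]
): StartJobRequest {
  // Reject an engine that is not enabled in this installation.
  if (enabledEngines.indexOf(selectedEngine) === -1) {
    throw new Error("Pipeline engine is not enabled: " + selectedEngine);
  }
  return { pipeline_name: pipelineName, engine: selectedEngine };
}
```

Validating against the enabled engines on the client is optional; the back-end would presumably reject a disabled engine as well.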

Interfaces that need to be changed

Extraction

Screenshot 2024-05-13 at 12 10 06

Extraction and Annotation

Screenshot 2024-05-13 at 12 10 40