Currently, BadgerDoc supports only one pipeline manager. Starting with the 1.8.0 release, users must be able to configure multiple pipeline managers in a single BadgerDoc installation. Initially, two pipeline managers will be supported: Airflow and Databricks.
Users can enable a pipeline manager using the AIRFLOW_ENABLED and DATABRICKS_ENABLED environment variables. The front-end application can check which pipelines are enabled by calling the GET /jobs/pipelines/support endpoint. This will return all available pipelines with their respective endpoints.
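The exact response schema of GET /jobs/pipelines/support is not specified here, so the shape below is an assumption; a minimal front-end helper for filtering it might look like this:

```typescript
// Assumed response shape for GET /jobs/pipelines/support;
// the real schema may differ.
interface PipelineSupport {
    name: string;     // e.g. "airflow" or "databricks"
    enabled: boolean; // driven by AIRFLOW_ENABLED / DATABRICKS_ENABLED
    endpoint: string; // e.g. "/jobs/pipelines/airflow"
}

// Keep only the pipeline managers enabled on this installation.
function enabledPipelines(support: PipelineSupport[]): PipelineSupport[] {
    return support.filter((p) => p.enabled);
}
```

The length of the filtered array is what tells the front-end how many managers are available, which drives the tab/radio decision described below.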
Depending on the number of pipelines returned, the front-end application must display a tab or a radio button on the interface for pipeline selection. To get a list of pipelines for a specific engine, the front-end must call either GET /jobs/pipelines/airflow or GET /jobs/pipelines/databricks. These endpoints will be returned by the call to the GET /jobs/pipelines/support endpoint.
This list should be displayed in a dropdown menu. The version dropdown should be removed from the code, as we won't be supporting pipeline versioning.
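Since both engine-specific endpoints share a common prefix, the front-end could derive the path from the engine name as a fallback; the union type below is an assumption, and in practice the endpoints returned by GET /jobs/pipelines/support should be preferred:

```typescript
// The two engines planned for the 1.8.0 release.
type Engine = "airflow" | "databricks";

// Build the engine-specific pipeline-list endpoint,
// e.g. "airflow" -> "/jobs/pipelines/airflow".
function pipelinesEndpoint(engine: Engine): string {
    return `/jobs/pipelines/${engine}`;
}
```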
When a user starts a pipeline, the front-end must pass engine as an argument to start the job. This value is currently hardcoded in the EditJobConnector.
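A minimal sketch of the start-job payload with the engine made explicit; only the engine field is mandated by this design, and every other field name is an illustrative placeholder, since the real job payload is not specified here:

```typescript
type Engine = "airflow" | "databricks";

// Hypothetical start-job payload; field names other than `engine`
// are placeholders.
interface StartJobRequest {
    jobName: string;
    pipelineName: string;
    engine: Engine; // previously hardcoded in EditJobConnector
}

function buildStartJobRequest(
    jobName: string,
    pipelineName: string,
    engine: Engine
): StartJobRequest {
    return { jobName, pipelineName, engine };
}
```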
The following interfaces need to be changed:
- Extraction
- Extraction and Annotation