Open cisaacstern opened 5 days ago
I've thrown the diagram below together mostly to help me think about this:
sequenceDiagram
actor User
participant WebUI as Web UI
participant EcoscopeServer as Ecoscope Server
participant EcoscopeWorkflows as Ecoscope Workflows CLI Service (Process, Serverless)
participant SecureStore as GSM?
participant WorkflowExecutor as Workflows Executor API (Airflow, Serverless)
participant ThirdPartyDataService as 3rd Party Data Service (EarthRanger)
User->>WebUI: Fill Form for First Config Block
WebUI->>EcoscopeServer: Post First Config Block
EcoscopeServer-->>WebUI: Show Next Config Block
WebUI-->>User: Show Next Config Block
User->>WebUI: Fill Form for Data Connection
WebUI->>EcoscopeServer: Post Config
EcoscopeServer->>EcoscopeWorkflows: Create Data Connection
EcoscopeWorkflows->>SecureStore: Store data connection config
SecureStore-->>EcoscopeWorkflows: Config stored
EcoscopeWorkflows-->>EcoscopeServer: Data Connection created
EcoscopeWorkflows->>ThirdPartyDataService: Get Table Schema
ThirdPartyDataService-->>EcoscopeWorkflows: Table Schema
EcoscopeWorkflows-->>EcoscopeServer: Table Schema
EcoscopeServer-->>WebUI: Show Next Config Block
WebUI-->>User: Show Next Config Block
Loop Continue config
User->>WebUI: Config continues
WebUI-->>User:
end
User->>WebUI: Run "Patrols Example"
WebUI->>EcoscopeServer: Run "Patrols Example" (knows executor)
EcoscopeServer->>WorkflowExecutor: Run "Patrols Example"
WorkflowExecutor-->>EcoscopeServer: Run started
EcoscopeServer-->>WebUI: Status "Pending"
WebUI-->>User: Status "Pending"
WorkflowExecutor->WorkflowExecutor: Execute tasks
WorkflowExecutor->>EcoscopeServer: Get Data Connection config
EcoscopeServer-->>WorkflowExecutor: Return data connection with location of credentials
WorkflowExecutor->>SecureStore: Get Data Connection credentials
SecureStore-->>WorkflowExecutor: credentials
WorkflowExecutor->>ThirdPartyDataService: get_patrol_observations
ThirdPartyDataService-->>WorkflowExecutor: patrol observations
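To make the last few steps of the diagram concrete, here is a minimal sketch of the split between the data connection record (which the server can hand back freely) and the credentials (which stay in the secure store until the executor resolves them). Every name here (`SecretStore`, `InMemorySecretStore`, `DataConnection`, `credentials_ref`, `create_data_connection`, `materialize`) is hypothetical and not taken from #31; the in-memory store only stands in for GSM or whatever backend we end up with.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


class SecretStore(Protocol):
    """Hypothetical interface; GSM or an encrypted database could sit behind it."""

    def put(self, ref: str, payload: dict[str, str]) -> None: ...

    def get(self, ref: str) -> dict[str, str]: ...


class InMemorySecretStore:
    """Stand-in for the real secure store, for illustration only."""

    def __init__(self) -> None:
        self._secrets: dict[str, dict[str, str]] = {}

    def put(self, ref: str, payload: dict[str, str]) -> None:
        self._secrets[ref] = dict(payload)

    def get(self, ref: str) -> dict[str, str]:
        return dict(self._secrets[ref])


@dataclass(frozen=True)
class DataConnection:
    """What the server returns: metadata plus the *location* of the credentials."""

    name: str
    server_url: str
    credentials_ref: str


def create_data_connection(
    store: SecretStore, name: str, server_url: str, credentials: dict[str, str]
) -> DataConnection:
    # "Store data connection config": the secrets go to the store, not the record.
    ref = f"data-connections/{name}"
    store.put(ref, credentials)
    return DataConnection(name=name, server_url=server_url, credentials_ref=ref)


def materialize(store: SecretStore, conn: DataConnection) -> dict[str, str]:
    # "Get Data Connection credentials": the executor resolves the reference
    # only at run time, just before calling the third-party service.
    return {"server": conn.server_url, **store.get(conn.credentials_ref)}


store = InMemorySecretStore()
conn = create_data_connection(
    store,
    name="patrols-er",
    server_url="https://er.example.invalid",
    credentials={"username": "u", "password": "p"},
)
config = materialize(store, conn)
```

The point of the indirection is that swapping the backend should only touch the `SecretStore` implementation, not the server or the executor.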
Per @atmorling's comment https://github.com/wildlife-dynamics/ecoscope-workflows/pull/31#issuecomment-2186022263, we should research the best way to store and materialize data connection config (inclusive of secrets) inside Cloud Run invocations.
The Google Secret Manager (GSM) reference implementation in #31 is one possible path, but questions remain about how to ensure per-user materialization of config in Cloud Run, and about whether and how managing hundreds or thousands of user-specific secrets will scale with GSM compared with some other approach (an encrypted database, etc.).
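One way to picture per-user materialization, whatever the backend: derive the secret ID from the requesting user, so a lookup can only ever land on that user's own secrets. This is a rough sketch under assumptions, not a proposal for the actual scheme. `secret_id_for` and `materialize_for_user` are hypothetical names, the plain `dict` stands in for the store, and the character restriction (letters, digits, hyphens, underscores) is what I believe GSM requires for secret IDs, which should be checked against the docs.

```python
import hashlib
import re

# Assumed ID alphabet: letters, digits, hyphens, underscores.
_DISALLOWED = re.compile(r"[^A-Za-z0-9_-]")


def secret_id_for(user_id: str, connection_name: str) -> str:
    """Derive a deterministic, per-user secret ID (hypothetical scheme)."""
    # Readable prefix so secrets can be recognized when listing them.
    slug = _DISALLOWED.sub("-", connection_name)[:64]
    # Hash of (user, connection) keeps IDs unique even if slugs collide.
    digest = hashlib.sha256(
        f"{user_id}\x00{connection_name}".encode("utf-8")
    ).hexdigest()[:16]
    return f"dc-{slug}-{digest}"


def materialize_for_user(
    secrets: dict[str, dict[str, str]], requesting_user: str, connection_name: str
) -> dict[str, str]:
    """Look up credentials using an ID derived from the *requesting* user."""
    secret_id = secret_id_for(requesting_user, connection_name)
    try:
        return secrets[secret_id]
    except KeyError:
        raise PermissionError(
            f"no data connection {connection_name!r} for user {requesting_user!r}"
        ) from None


# Stand-in store holding one secret that belongs to "alice".
secrets = {secret_id_for("alice", "my earthranger"): {"token": "abc123"}}
alice_config = materialize_for_user(secrets, "alice", "my earthranger")
```

With this shape, the number of secrets grows as users times connections, which is exactly the scaling question above; an encrypted database could reuse the same derived key as a primary key.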
So to summarize, the main questions are: