flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0
5.41k stars 580 forks source link

[BUG] Wrong execution identity when "Recovering" stopped executions #5655

Open ddl-rliu opened 1 month ago

ddl-rliu commented 1 month ago

Describe the bug

When user A "recovers" via Flyteconsole UI an execution belonging to user B, it launches with user B's execution identity. This can have negative security implications. See "Additional context to reproduce".

Expected behavior

When user A recovers user B’s execution, it does not successfully launch with user B's execution identity. (as this can have negative security implications)

Open question: What is the expected behavior? Should the recover fail? Or should it succeed, but it is launched with user A's execution identity? (or, is this intentional?)

Additional context to reproduce

Steps to reproduce

Pre-requisite:

Expected behavior:

Actual behavior:

  1. As user A, trigger a recover of user B’s execution (This can be done in Flyteconsole UI)
  2. Check and see that the execution succeeds and is launched with user B's execution identity. (See Execution Identity for details)

Execution Identity

You can use a tool like grpcui to query ListExecutions

grpcurl -plaintext -H "<user A token>" -d '{
  "id": {
    "project": "66ac1f956fd5b26a885904ce",
    "domain": "development"
  },
  "limit": 5,
  "sortBy": {
    "key": "created_at"
  }
}' localhost:9089 flyteidl.service.AdminService.ListExecutions

Then select the execution from the response, and view the execution identity at response.executions[0].spec.security_context.run_as.execution_identity

Relaunch

The above all applies to the RelaunchExecution grpc endpoint as well, however this endpoint does not appear to be exposed via the UI so it is a less immediate concern.

Relaunch via gRPC

You can use a tool like grpcurl to query RelaunchExecution

grpcurl -plaintext -H "<user A token>" -d '{
  "id": {
    "project": "66ac1f956fd5b26a885904ce",
    "domain": "development",
    "name": "easily-ruling-crappie-aeps"
  }
}' localhost:9089 flyteidl.service.AdminService.RelaunchExecution

(note that this endpoint is not exposed via e.g. the Flyte UI)

Screenshots

No response

Are you sure this issue hasn't been raised already?

Have you read the Code of Conduct?

runllm[bot] commented 1 month ago

The issue described involves a security concern where user A can relaunch an execution belonging to user B, and the execution runs with user B's identity. This behavior is problematic as it can lead to unauthorized access and actions.

According to the ExecutionRelaunchRequest documentation, the RelaunchExecution endpoint does not inherently handle user identity changes. The expected behavior should be that the relaunch either fails or succeeds but runs under user A's identity.

Would you like more detail on how to address this issue or the specific steps to reproduce it?

Sources:


This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.

Mention @runllm to ask follow-up questions. See runllm.com for more!

ddl-rliu commented 1 month ago

Some extra context:

Intentional behavior

I am not sure, if this bug could actually be intentional behavior. The execution spec contains, in addition to the spec.security_context.run_as.execution_identity field, a spec.metadata.principal field. Then, for example, the flyteconsole UI has a "Only my executions" toggle, which filters based on metadata.principal, not execution_identity.

More extra context: (more relevant for the "relaunch" version of this bug)

gRPC vs HTTP

The steps to reproduce applies to the corresponding http endpoint as well; that is, /api/v1/executions/relaunch for RelaunchExecution. However it is easier to work with gRPC (e.g. doesn't require manually writing protobuf payloads), in terms of reproducing the behavior manually.

Flyte UI Relaunch

Flyteconsole allows users to relaunch executions, however it uses a different endpoint /api/v1/executions for CreateExecution. In fact the /relaunch endpoint appears unused in flyteconsole. Therefore the bug must be reproduced manually (i.e. not via natural UI usage)

ddl-rliu commented 3 weeks ago

I've updated the issue with a focus on the 'Recover' operation (which is possible via UI).

In addition to "Recover", there a similar bug with the Relaunch grpc endpoint. When user A relaunches user B’s execution, it will launch with user B's execution identity. (this too can have negative security implications)

Open question here: What is the expected behavior? Should the relaunch fail? Or should it succeed, but it is launched with user A's execution identity? (or, is this intentional?)