flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0
5.78k stars, 659 forks

[BUG] remote sync map_tasks #3747

Open lauralindy opened 1 year ago

lauralindy commented 1 year ago

Describe the bug

I can't use FlyteRemote to sync workflows or tasks that contain a map_task. The call fails with:

_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.NOT_FOUND
    details = "missing entity of type TASK with identifier project:"flytetester" domain:"development" name:"MAP_TASK_NAME" version:"VERSION" "
    debug_error_string = "UNKNOWN:Error received from peer ipv4:10.34.2.243:443 {grpc_message:"missing entity of type TASK with identifier project:\"flytetester\" domain:\"development\" name:\"MAP_TASK_NAME\" version:\"VERSION\" ", grpc_status:5, created_time:"2023-06-02T15:32:48.383186-07:00"}"
>

Expected behavior

I expect to be able to see the info for the map_task's tasks.

Additional context to reproduce

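from flytekit.models.core.identifier import TaskExecutionIdentifier
from flytekit.remote.executions import FlyteTaskExecution

# `client` and `remote` are assumed to be an already-configured Flyte client and FlyteRemote.
# Fetch the task execution model for the map_task, then wrap it so FlyteRemote can sync it.
task_execution = client.get_task_execution(TaskExecutionIdentifier(
    task_id=task_identifier,
    node_execution_id=node_exe_identifier,
    retry_attempt=0,
))
flyte_task_exe = FlyteTaskExecution.promote_from_model(task_execution)
remote.sync_task_execution(
    execution=flyte_task_exe,
)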

where task_identifier and node_exe_identifier are for the map_task

or call

synced = remote.sync(
    execution=execution,
    sync_nodes=True
)

on the workflow level.

ggydush commented 1 year ago

Also seeing this, with and without sync_nodes being used. @wild-endeavor any thoughts on root cause? Mine seems to sync for a minute or two until the actual map task starts executing, then fails.

ggydush commented 1 year ago

Actually just kidding, it only happens with sync_nodes=True

eapolinario commented 1 year ago

We're in the process of revamping map tasks. This will involve backend changes that make map tasks more usable and more extensible. The revamp is scheduled to go out in the Flyte 1.8 release as an experimental feature.

Opting in will initially just require importing map_task from a different package; eventually we're going to deprecate the old version of map_task. All that to say, it doesn't make much sense to fix the FlyteRemote behavior in the current incarnation of map_task. Instead, we're going to wait for ArrayNode-based map tasks to land before we invest time in fixing the bug described in this issue.

lauralindy commented 1 year ago

So will ArrayNode-based map tasks be syncable then?

@ggydush it's kind of ugly, but I was able to get it to work using the Swagger API. Making a call like f"https://{self.endpoint}/api/v1/task_executions/{self.project}/{self.domain}/{run_id}/{node_id}-0-{map_node['id']}?limit=10000" returns the information for the individual map tasks.

I was digging into how the UI gets the map task status, and I think this is how it does it.
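
For anyone who wants to try the same workaround, here is a minimal sketch of that REST call using only the Python standard library. The function names (map_subtask_executions_url, fetch_json) and the example values are hypothetical; the URL layout, the "-0-" separator, and limit=10000 are copied from the call above. Authentication is not covered: whether a token or cookie is needed depends on how your Flyte deployment is set up, so any headers are an assumption you have to fill in.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen


def map_subtask_executions_url(endpoint, project, domain, run_id, node_id,
                               map_node_id, limit=10000):
    """Build the task_executions URL from the workaround for one map node.

    All parameter names are illustrative; they mirror the pieces of the
    f-string in the comment above.
    """
    # "-0-" is copied from the original call; its meaning is not explained in this thread.
    child_node_id = f"{node_id}-0-{map_node_id}"
    # Percent-encode each path segment so unusual characters cannot break the path.
    path = "/".join(
        quote(str(part), safe="")
        for part in (project, domain, run_id, child_node_id)
    )
    return f"https://{endpoint}/api/v1/task_executions/{path}?{urlencode({'limit': limit})}"


def fetch_json(url, headers=None):
    """GET the URL and decode the JSON body.

    `headers` is where deployment-specific auth would go (an assumption,
    not something confirmed in this thread).
    """
    request = Request(url, headers=headers or {})
    with urlopen(request) as response:
        return json.load(response)
```

Calling map_subtask_executions_url("flyte.example.com", "flytetester", "development", "f1234abcd", "n0", "n1") builds the URL for a made-up execution; passing the result to fetch_json should then return the response body as a Python object, assuming the endpoint is reachable and accepts the request.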