Closed by dkarlo2 1 week ago
Opened a PR with a possible fix for this if you want to give it a spin. I was unable to get my test flows to outright fail, but did notice that the runs sometimes register with incorrect project info as reported.
Because this is so timing-dependent, it is hard to test reliably, but from what I observed, adding the missing environment variables did seem to fix the issue.
Looping back on this: the complete fix was introduced in https://github.com/Netflix/metaflow/releases/tag/2.12.22, where the Argo Workflows daemon now correctly registers all project-related system tags for a flow.
When the heartbeat daemon is enabled, running a flow deployed in the production namespace sometimes results in a run executed under the user namespace. This often causes
Object not in the current namespace
errors, and the runs often fail. This seems to happen when the heartbeat daemon starts before the start step. I identified a probable cause of this issue: some Metaflow environment variables (e.g. METAFLOW_PRODUCTION_TOKEN) are missing when the daemon is deployed here. What I assume happens is that the daemon here registers a run in the user namespace instead of the production namespace.
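To illustrate the suspected cause, here is a minimal sketch of how one might compare the environment of a step container with that of the daemon container and list the Metaflow variables the daemon lacks. It does not use any Metaflow API. The function name `missing_metaflow_env`, both environment maps, and the variable `METAFLOW_EXAMPLE_VAR` are hypothetical; only `METAFLOW_PRODUCTION_TOKEN` comes from the report above.

```python
def missing_metaflow_env(step_env, daemon_env, prefix="METAFLOW_"):
    """Return sorted names of prefixed variables set for the step but absent for the daemon."""
    return sorted(
        name
        for name in step_env
        if name.startswith(prefix) and name not in daemon_env
    )


# Hypothetical environments, for illustration only. In practice these would be
# read from the container specs of the deployed workflow template.
step_env = {
    "METAFLOW_PRODUCTION_TOKEN": "<token>",
    "METAFLOW_EXAMPLE_VAR": "<value>",  # made-up variable name
    "PATH": "/usr/bin",
}
daemon_env = {
    "METAFLOW_EXAMPLE_VAR": "<value>",
    "PATH": "/usr/bin",
}

missing = missing_metaflow_env(step_env, daemon_env)
print(missing)
```

If the list is non-empty, the daemon is starting with less Metaflow configuration than the steps, which would be consistent with it registering the run under a different namespace.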