PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0

Ability to Log a URL when a Flow is Started #12007

Open jhamet93 opened 4 months ago

jhamet93 commented 4 months ago

Prefect Version

2.x

Describe the current behavior

When a Flow starts running, it can be challenging to discover the underlying infrastructure the Flow is running on. As a concrete example, I may have a Flow running on Vertex AI that is failing because of memory issues, but I would need to manually navigate to GCP and find the job that corresponds to the Flow run.

Describe the proposed behavior

At the beginning of a Flow run, log the URL of the underlying infrastructure, where the Deployment's infrastructure type implements it. For example, a Vertex AI implementation like the one above could query the metadata server to gather the information needed to form a fully qualified URL.
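
As a rough sketch of what this could look like if done by hand inside a flow today (not a built-in Prefect feature), the GCE metadata server can provide the project ID, and the job identifier would have to come from the environment. The `VERTEX_JOB_ID` variable and the console URL format below are illustrative assumptions only.

```python
# Hedged sketch: query the GCE metadata server for the project ID and log a
# console URL at flow start. The VERTEX_JOB_ID environment variable and the
# console URL shape are assumptions for illustration, not Vertex AI guarantees.
import os
from typing import Optional

import requests
from prefect import flow, get_run_logger

METADATA_URL = "http://metadata.google.internal/computeMetadata/v1/project/project-id"


def infrastructure_url() -> Optional[str]:
    try:
        project = requests.get(
            METADATA_URL, headers={"Metadata-Flavor": "Google"}, timeout=2
        ).text
    except requests.RequestException:
        return None  # not running on GCP infrastructure
    job_id = os.environ.get("VERTEX_JOB_ID")  # hypothetical; set when the job is submitted
    if not job_id:
        return None
    return (
        "https://console.cloud.google.com/vertex-ai/training/custom-jobs"
        f"?project={project}#job={job_id}"  # assumed URL shape
    )


@flow
def my_flow():
    logger = get_run_logger()
    url = infrastructure_url()
    if url:
        logger.info("Running on infrastructure: %s", url)
    # ... rest of the flow ...
```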

Example Use

No response

Additional context

No response

serinamarie commented 4 months ago

Hi @jhamet93, thanks for the issue!

How are you (or how would you be) doing this in pure Python without Prefect? You could form the URL of your current job by retrieving your current GCP project and job, as a task for example, or even set up a state change hook if you only care about failing flows.
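
A minimal sketch of the state-change-hook idea, assuming Prefect 2.x flow hooks (`on_failure`) and a hypothetical `build_job_url()` helper that performs the GCP project/job lookup described above:

```python
# Sketch only: log the infrastructure URL when the flow run fails.
# build_job_url() is a hypothetical helper standing in for whatever
# project/job lookup fits your infrastructure.
from prefect import flow


def build_job_url():
    # e.g. query the metadata server / environment and format a console URL
    return None


def log_infra_on_failure(flow, flow_run, state):
    url = build_job_url()
    if url:
        print(f"Flow run {flow_run.name} failed; infrastructure: {url}")


@flow(on_failure=[log_infra_on_failure])
def my_flow():
    ...
```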

jhamet93 commented 4 months ago

@serinamarie

This is certainly possible within the context of a specific job. For that scenario, you would use your favorite HTTP-compatible library to gather the data and compute a valid URL.

I was thinking more along the lines of whether it would be feasible outside of the specific job context, moving the responsibility to the business logic that submits the job for execution. This would help shorten the time to debug infrastructure-related problems (e.g. why has my job not started yet, or my job failed but Prefect hasn't reflected it).
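
To illustrate the submitter-side idea (a sketch, not Prefect's actual infrastructure code): whatever code creates the Vertex AI job already holds the job's resource name, so it could log a console link immediately, before the flow ever starts. This assumes the `google-cloud-aiplatform` client's `CustomJob.submit()` and `resource_name` behave as shown, and the console URL format is an assumption as well.

```python
# Illustrative sketch: the code that submits the Vertex AI job logs a console
# link as soon as the job exists, so infrastructure problems can be debugged
# even if the flow never starts. The console URL format is assumed.
from google.cloud import aiplatform


def console_url_from_resource_name(resource_name: str) -> str:
    # resource_name looks like: projects/{project}/locations/{region}/customJobs/{job_id}
    _, project, _, region, _, job_id = resource_name.split("/")
    return (
        f"https://console.cloud.google.com/vertex-ai/locations/{region}"
        f"/training/{job_id}?project={project}"
    )


def submit_and_log(job: aiplatform.CustomJob) -> None:
    job.submit()  # non-blocking submission
    print(f"Submitted Vertex AI job: {console_url_from_resource_name(job.resource_name)}")
```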