Closed Joseda8 closed 2 weeks ago
I believe a scalable solution could be to modify the file services/metadata_service/server.py as follows:
import os

# Define parameters configurable by the user
db_config_params = {
    # Environment values are strings, so convert to int before use
    "timeout": int(os.environ.get("MF_METADATA_DB_TIMEOUT", 60)),
}
the_app = app(loop, DBConfiguration(**db_config_params), path_prefix=PATH_PREFIX)
I'd like to open a pull request with this change.
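One caveat with reading the timeout from the environment: `os.environ.get` returns a string whenever the variable is set, so the value needs converting and validating before it is used as a number. A minimal sketch of that parsing follows. The helper name `read_timeout_seconds` and its `environ` parameter are hypothetical and not part of metaflow-service; only the variable name and the default of 60 come from the snippet above.

```python
import os


def read_timeout_seconds(name="MF_METADATA_DB_TIMEOUT", default=60, environ=None):
    """Hypothetical helper: read a positive integer timeout from the environment.

    Falls back to `default` when the variable is unset, empty,
    not an integer, or not positive.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw)
    except ValueError:
        # A non-numeric value such as "abc" falls back to the default.
        return default
    return value if value > 0 else default
```

Falling back silently is a design choice made here for brevity; raising on a malformed value would surface misconfiguration earlier.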
Based on https://github.com/Netflix/metaflow-service/blob/master/services/utils/__init__.py#L264, the environment variable should already be supported. Is this not working?
As I noted on Slack as well, there are some scaling issues with certain API routes that a simple timeout will not resolve, but a thorough fix is coming for these.
In the meantime, you should be able to access a specific run directly with:
from metaflow import Run
run = Run("FlowName/run_id")
This skips the problematic endpoint.
@saikonen, thanks a lot for the explanation. I confirm the existence of the variable MF_METADATA_DB_TIMEOUT. Thanks for pointing it out! Also, thanks for the workaround using Run("FlowName/run_id").
I'm using Metaflow in Python to get data from a specific run. The way to do this is very explicit and clear in the Metaflow documentation:
This accesses the endpoint /flows/{flow_id}. Very often this operation fails with a 504 Gateway Time-out. This is a general issue with any request sent to the specified METAFLOW_SERVICE_URL. Is there a way to extend the timeout or implement a retry mechanism?
Update: It seems there's a TODO comment to make this happen in: services/data/postgres_async_db.py
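On the retry question, one client-side option is to wrap the failing call in a small retry loop with exponential backoff. The sketch below is generic Python and not a Metaflow API: the function name `call_with_retries` and its parameters are hypothetical. Which exception the client raises on a 504 is also an assumption, so the default catches `Exception` broadly and should be narrowed once the actual exception type is known.

```python
import time


def call_with_retries(fn, attempts=4, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Hypothetical helper: call fn(), retrying on failure with exponential backoff.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... seconds between tries
    and re-raises the last error once `attempts` calls have failed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                # Out of attempts: surface the original error to the caller.
                raise
            sleep(base_delay * 2 ** attempt)


# Example usage (requires metaflow and a reachable metadata service):
# from metaflow import Run
# run = call_with_retries(lambda: Run("FlowName/run_id"))
```

Retries only help with transient timeouts. If a route is slow on every request, as with the scaling issues mentioned in this thread, the retries will eventually fail as well.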