Open jarandaf opened 3 years ago
I generally find querying AML runs through the Python SDK fairly slow. I will assign the HyperDrive team to check whether anything specific is going on here, but I suspect not.
Is there any other more performant way to query AML runs?
We are also experiencing the same issue. Are there any updates or suggestions on this?
The latest conversations I had with the engineering team confirmed the issue, but there is no fix yet AFAIK. Long running times usually appear for array-like metrics (e.g. training loss over epochs). For single-value metrics, the following is a possible workaround; it runs much faster even when the runs have also logged array-like metrics:
metrics = {run.id:run.get_metrics('<metric_name>') for run in hdrun.get_children()}
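For reuse, the same per-child loop can be wrapped in a small helper. This is only a sketch: `collect_scalar_metric` is a hypothetical name, and it assumes that the parent run exposes `get_children()`, that each child has an `id`, and that `get_metrics(name)` returns a dict keyed by metric name, as `azureml.core.Run` does.

```python
from typing import Any, Dict


def collect_scalar_metric(parent_run: Any, metric_name: str) -> Dict[str, Any]:
    """Fetch one named metric from each child run of parent_run.

    parent_run only needs get_children(); each child needs an `id`
    attribute and get_metrics(name). No other SDK behaviour is assumed.
    """
    results: Dict[str, Any] = {}
    for child in parent_run.get_children():
        # Ask only for the metric of interest, so other (array-like)
        # metrics logged by the child are not requested.
        child_metrics = child.get_metrics(metric_name)
        # Assumption: the result is a dict keyed by metric name, and the
        # key is absent when the child never logged this metric.
        if metric_name in child_metrics:
            results[child.id] = child_metrics[metric_name]
    return results
```

With the SDK this would be called as `collect_scalar_metric(hdrun, '<metric_name>')`. Unlike the one-liner above, it stores the metric value itself rather than the one-entry dict, and it skips children that did not log the metric.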
@jarandaf thanks a lot for the insights 👍
However, with the suggested workaround I got only slightly better performance (5 min vs. 6 min) compared to
hdrun.get_metrics(name="<metric_name>", recursive=True)
this is with a scalar-value metric over 1000 child runs.
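To reproduce this kind of comparison, something like the following could be used. It is a rough sketch: `time_call` and `compare_metric_queries` are hypothetical helpers, and the parent run is assumed to accept `get_metrics(name=..., recursive=True)` as in the call above (which holds for an object built with the `Run` constructor).

```python
import time
from typing import Any, Callable, Dict, Tuple


def time_call(fn: Callable[[], Any]) -> Tuple[Any, float]:
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def compare_metric_queries(parent_run: Any, metric_name: str) -> Dict[str, float]:
    """Time the recursive query and the per-child loop on the same parent run."""
    # Single recursive call on the parent run.
    _, recursive_s = time_call(
        lambda: parent_run.get_metrics(name=metric_name, recursive=True)
    )
    # One get_metrics call per child run.
    _, per_child_s = time_call(
        lambda: {c.id: c.get_metrics(metric_name) for c in parent_run.get_children()}
    )
    return {"recursive": recursive_s, "per_child": per_child_s}
```

A single measurement of each variant is noisy, since both depend on service latency, so repeating the comparison a few times gives a fairer picture.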
EDIT: Apparently I applied a trick here by mistake: I created the hdrun object with the constructor of the Run class. The child class HyperDriveRun for some reason does not accept these arguments, unlike its parent.
Hi @mx-iao, I was advised to loop you in here. Do you have any solution or know anyone who might?
We notice long execution times (on the order of 5 minutes) when retrieving HyperDrive results (hyperparameter results). We log simple values and some lists during the HyperDrive step execution, which has no more than 300 child runs. Is this the expected behaviour?