Hi everyone,
I am trying to build an AML pipeline for object detection/instance segmentation, where the last component is used for training and model evaluation.
The pipeline is defined via the YAML format/schema (see below) and is run with az ml job create --file pipeline.yaml:
I want to visualize many metrics in the Metrics tab of the component: time-series metrics (loss, f1 etc.), X/Y graphs, a confusion matrix etc. As the MLflow API only supports time-series-like metric logging (a single metric value per iteration/epoch), I try to use the azureml.core.Run.log* interface for the more advanced metrics. The problem is that these logs only end up in the Outputs + logs tab as JSON files, not as metrics/graphs in the Metrics tab, if they are logged at all. Here are the problematic metric logs:
azureml.core.Run.log_table(): This is not logged at all, neither in the Outputs + logs tab nor in the Metrics tab.
azureml.core.Run.log_accuracy_table(): This is logged only in the Outputs + logs tab, as a JSON file:
{"schema_type": "accuracy_table", "schema_version": "1.0.1", "data": {"probability_tables": [[[82, 118, 0, 0], [75, 31, 87, 7], [66, 9, 109, 16], [46, 2, 116, 36], [0, 0, 118, 82]], [[60, 140, 0, 0], [56, 20, 120, 4], [47, 4, 136, 13], [28, 0, 140, 32], [0, 0, 140, 60]], [[58, 142, 0, 0], [53, 29, 113, 5], [40, 10, 132, 18], [24, 1, 141, 34], [0, 0, 142, 58]]], "percentile_tables": [[[82, 118, 0, 0], [82, 67, 51, 0], [75, 26, 92, 7], [48, 3, 115, 34], [3, 0, 118, 79]], [[60, 140, 0, 0], [60, 89, 51, 0], [60, 41, 99, 0], [46, 5, 135, 14], [3, 0, 140, 57]], [[58, 142, 0, 0], [56, 93, 49, 2], [54, 47, 95, 4], [41, 10, 132, 17], [3, 0, 142, 55]]], "probability_thresholds": [0.0, 0.25, 0.5, 0.75, 1.0], "percentile_thresholds": [0.0, 0.01, 0.24, 0.98, 1.0], "class_labels": ["class1", "class2", "class3"]}}
azureml.core.Run.log_confusion_matrix(): This is logged only in the Outputs + logs tab, as a JSON file:
{"schema_type": "confusion_matrix", "schema_version": "1.0.0", "data": {"class_labels": ["class1", "class2", "class3", "class4"], "matrix": [[4, 0, 1, 9], [0, 0, 0, 1], [6, 0, 5, 0], [0, 0, 0, 1]]}}
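For contrast with the table-style logs listed above, the time-series pattern that MLflow does cover is one scalar per metric per step. A minimal sketch of that pattern follows; the helper name log_epoch_metrics and the training history are hypothetical, and the logger is passed in as a plain callable so the sketch runs without MLflow installed.

```python
from typing import Callable, Dict, List

# Shape mirrors mlflow.log_metric(key, value, step); any callable with this
# shape works, which keeps the sketch runnable without MLflow installed.
MetricLogger = Callable[[str, float, int], None]


def log_epoch_metrics(history: Dict[str, List[float]], log_metric: MetricLogger) -> int:
    """Log one scalar per metric per epoch and return the number of calls made."""
    calls = 0
    for name, values in history.items():
        for epoch, value in enumerate(values):
            log_metric(name, float(value), epoch)
            calls += 1
    return calls


# Hypothetical training history (made-up numbers, two metrics over three epochs).
history = {"loss": [0.9, 0.6, 0.4], "f1": [0.50, 0.62, 0.71]}

# Record the calls in a list instead of sending them to a tracking server.
recorded = []
n_calls = log_epoch_metrics(history, lambda k, v, s: recorded.append((k, v, s)))
print(n_calls, recorded[0], recorded[-1])
```

Inside a job, passing mlflow.log_metric as the callable should be the analogous call, since it accepts a key, a value and a step.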
The code used for these logs is as follows:
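The snippet itself is not included in this post. As an illustrative stand-in, and not the code from the job, the sketch below builds a payload in the same confusion_matrix schema as the JSON artifact shown earlier. The helper name build_confusion_matrix_payload, the labels and the predictions are all hypothetical, and the row/column orientation (rows as true classes) is an assumption borrowed from the scikit-learn convention.

```python
import json
from typing import Dict, List, Sequence


def build_confusion_matrix_payload(
    y_true: Sequence[str], y_pred: Sequence[str], class_labels: List[str]
) -> Dict:
    """Count (true, predicted) pairs; rows are true classes, columns predicted."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    index = {label: i for i, label in enumerate(class_labels)}
    matrix = [[0] * len(class_labels) for _ in class_labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    # Same top-level keys as the JSON artifact shown in the post.
    return {
        "schema_type": "confusion_matrix",
        "schema_version": "1.0.0",
        "data": {"class_labels": class_labels, "matrix": matrix},
    }


# Made-up labels and predictions, purely for illustration.
labels = ["class1", "class2"]
payload = build_confusion_matrix_payload(
    ["class1", "class1", "class2", "class2"],
    ["class1", "class2", "class2", "class2"],
    labels,
)
print(json.dumps(payload))

# Assumption: inside a job with azureml-core installed, this dict would be
# passed along the lines of
#   Run.get_context().log_confusion_matrix("confusion_matrix", payload)
```

Building the payload separately from the logging call makes it possible to inspect the exact JSON locally before checking where the job surfaces it.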
Here are some screenshots of the Azure ML dashboard.
The first picture shows that run.log_accuracy_table() and run.log_confusion_matrix() are logged as JSON file artifacts, but run.log_table() is not:
The second picture shows that none of the run.log_*() metrics are visualized in the Metrics tab:
IMPORTANT
If I run a simple Python script as a job (so no pipeline definition etc.), the run.log_accuracy_table(), run.log_confusion_matrix() and run.log_table() metrics are logged properly.
Is this behaviour just a bug related to child jobs?