xjlwi closed this issue 2 years ago
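Hi @xjlwi, sorry to see that you are facing issues with the plugins. There are two problems here:
- kedro-mlflow logs an incorrect metric. We will solve the problem together.
- mlflow does not complain when the incorrect metric is logged, but it breaks the database and hence the UI => we should open an issue in the mlflow repo once we know what is going on.
Would you mind giving me some extra information:
- your mlflow version
- the catalog entry for avg_pnl (I guess it is a kedro_mlflow.io.metrics.MlflowMetricsDataSet?). If yes, check the documentation: it should return something like {'trader1': {'step': 0, 'value': df.pnl.mean()}}
- the type of avg_metric and total_metric: are they float instead of string?
- can you check if df.pnl.mean() and df.pnl.sum() return a float and not a single-row pandas.Series?
If I can reproduce the bug, I will be able to give you a workaround.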
> Hi @xjlwi, sorry to see that you are facing issues with the plugins. There are two problems here:
> - kedro-mlflow logs an incorrect metric. We will solve the problem together.
> - mlflow does not complain when the incorrect metric is logged, but it breaks the database and hence the UI => we should open an issue in the mlflow repo once we know what is going on.
> Would you mind giving me some extra information:
> - your mlflow version
mlflow version: 1.26.1
> - the catalog entry for avg_pnl (I guess it is a kedro_mlflow.io.metrics.MlflowMetricsDataSet?). If yes, check the documentation: it should return something like {'trader1': {'step': 0, 'value': df.pnl.mean()}}
Yes, it's a kedro_mlflow.io.metrics.MlflowMetricsDataSet.
    type: kedro_mlflow.io.artifacts.MlflowArtifactDataSet
    data_set:
      type: pandas.CSVDataSet
      filepath: "${ml_model_output}PnL_summarymetrics${currentdate}${model}.csv"
      save_args:
        index: True
Must the keywords for the output be specifically 'step'? This is my current node to return the output.
    def pnl_metrics(df: pd.DataFrame):
        avg_pnl = {}
        avg_pnl[f'{avg_metric}'] = {'trader1': df.pnl.mean()}
        avg_pnl[f'{total_metric}'] = {'trader1': df.pnl.sum(), 'trader2': df.pnl.sum()}
        return avg_pnl
> - the type of avg_metric and total_metric: are they float instead of string?
Definitely float, because in my local mlruns folder, I am able to see them in the mlruns > metrics folder.
    1660874133345 [{'ml_model_13_logit_pnl_total': 0.0}, {'ml_model_13_logit_pnl_avg': nan}] ml_model_13_logit
    1660874133347 [{'ml_model_14_rf_pnl_total': 0.0}, {'ml_model_14_rf_pnl_avg': nan}] ml_model_14_rf
    1660874133349 [{'ml_model_15_naive_clf_pnl_total': 0.0}, {'ml_model_15_naive_clf_pnl_avg': nan}] ml_model_15_naive_clf
    1660874133352 [{'ml_model_16_svc_pnl_total': 0.0}, {'ml_model_16_svc_pnl_avg': nan}] ml_model_16_svc
    1660874133354 [{'ml_model_17_decisison_tree_pnl_total': 0.0}, {'ml_model_17_decisison_tree_pnl_avg': nan}] ml_model_17_decisison_tree
    1660874133356 [{'ml_model_18_grad_boost_pnl_total': 0.0}, {'ml_model_18_grad_boost_pnl_avg': nan}] ml_model_18_grad_boost
> - can you check if df.pnl.mean() and df.pnl.sum() return a float and not a single-row pandas.Series?
> If I can reproduce the bug, I will be able to give you a workaround.
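The checks asked for above (a plain number rather than a pandas.Series, and no nan like the values in the log) can be run on the node output before it is returned. The following is only a sketch using the standard library: find_bad_metric_values is a hypothetical helper, not part of kedro-mlflow or mlflow. For context, pandas returns nan from mean() and 0.0 from sum() on an empty or all-NaN column, which matches the pattern in the log, though that is a possible explanation rather than a confirmed one.

```python
import math
from numbers import Real


def find_bad_metric_values(metrics, path=""):
    """Return (path, reason) pairs for leaves that are not finite real numbers.

    Hypothetical helper for debugging a metrics dict; it walks nested dicts
    and reports every leaf value that is not a finite number.
    """
    problems = []
    for key, value in metrics.items():
        where = f"{path}.{key}" if path else str(key)
        if isinstance(value, dict):
            # Recurse into nested levels such as {'trader1': {...}}.
            problems.extend(find_bad_metric_values(value, where))
        elif isinstance(value, bool) or not isinstance(value, Real):
            # Strings, lists, pandas.Series, etc. end up here.
            problems.append((where, f"not a number: {type(value).__name__}"))
        elif math.isnan(value) or math.isinf(value):
            problems.append((where, f"not finite: {value}"))
    return problems


# Example input shaped like the node output posted in this thread.
report = find_bad_metric_values(
    {"pnl_total": {"trader1": 0.0}, "pnl_avg": {"trader1": float("nan")}}
)
```

An empty list means every leaf is a finite number; anything else points at the key that needs attention before the value is logged.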
> Must the keywords for the output be specifically 'step'? This is my current node to return the output.
Yes exactly. That's for consistency between loading and saving metrics.
Replace each entry df.pnl.mean() with a dict {'step': 0, 'value': df.pnl.mean()} and you will be fine. This adds an extra nested dict level and is not ideal. I am leaving the issue open so that we can improve the API in the future.
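Applied literally to the node posted earlier in the thread, the replacement could look like the sketch below. The helper as_metric and the default key names pnl_avg and pnl_total are assumptions made for illustration (the original post does not show where avg_metric and total_metric are defined), and the exact nesting each catalog entry expects should be checked against the kedro-mlflow documentation for the installed version.

```python
def as_metric(value, step=0):
    """Wrap one scalar in the {'step': ..., 'value': ...} shape."""
    # float() converts numpy scalars (what df.pnl.mean() returns) to a plain float.
    return {"step": step, "value": float(value)}


def pnl_metrics(df, avg_metric="pnl_avg", total_metric="pnl_total"):
    """Same layout as the node posted above, with each value wrapped.

    df: expected to be a pandas.DataFrame with a numeric `pnl` column.
    avg_metric / total_metric: hypothetical key names, passed in explicitly
    because the original post does not show where they come from.
    """
    metrics = {}
    metrics[avg_metric] = {"trader1": as_metric(df.pnl.mean())}
    metrics[total_metric] = {
        "trader1": as_metric(df.pnl.sum()),
        "trader2": as_metric(df.pnl.sum()),
    }
    return metrics
```

Keeping the wrapping in one small helper means the step value can later be changed in a single place, for example if the metric is logged once per training iteration.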
Hi, I am closing the issue, but feel free to reopen it if needed.
Description
This happens when I try to configure my own metric functions.
Context
I am trying to create a custom metric indicator, to be logged after each experiment. When I run kedro mlflow ui, this is what I'm getting on the UI.
Steps to Reproduce
This is my nodes.py
Expected Result
How do I get the metric to be displayed when I use the MLflow UI? Are there specific keywords that mlflow looks for in order to log a value as a metric?
Your Environment
Include as many relevant details about the environment in which you experienced the bug:
- kedro and kedro-mlflow version used (pip show kedro and pip show kedro-mlflow): 0.10.0
- Python version used (python -V): 3.9.0