tdspora / syngen

Open-source version of the TDspora synthetic data generation algorithm.
https://tdspora.ai/
GNU General Public License v3.0

Improve the handling of experiment name when sending results to mlflow server #357

Closed: tdspora closed 5 months ago

tdspora commented 5 months ago

The syngen code should cover the following cases when sending results to the mlflow server:

  1. If 'MLFLOW_EXPERIMENT_NAME' is provided and no such experiment exists on the mlflow server: => we create the experiment => we log at INFO level that an experiment with this name will be created.
  2. If 'MLFLOW_EXPERIMENT_NAME' is provided and such an experiment already exists on the mlflow server: => we should NOT create the experiment but put the runs into the existing one => we log at WARNING level that an experiment with the provided name already exists and new runs will be sent there.
  3. If 'MLFLOW_EXPERIMENT_NAME' is not provided and no experiment with a name similar to the 'table_name' or 'metadata_path' value exists on the mlflow server: => we create a new experiment with a name based on the 'table_name' or 'metadata_path' value => we log at WARNING level that an experiment with a name similar to the 'table_name' or 'metadata_path' value will be created.
  4. If 'MLFLOW_EXPERIMENT_NAME' is not provided and an experiment with a name similar to the 'table_name' or 'metadata_path' value already exists on the mlflow server: => we will NOT create a new experiment with a name based on the 'table_name' or 'metadata_path' value but save the runs in the existing experiment => we log at WARNING level that an experiment with a name similar to the 'table_name' or 'metadata_path' value already exists and new runs will be sent there.
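
The four cases above could be sketched roughly as follows. This is only an illustration, not syngen's actual implementation: the function `resolve_experiment`, its parameters, and the logger name are hypothetical, and `fallback_name` stands in for whatever name syngen derives from 'table_name' or 'metadata_path' (the issue does not specify how). The server lookup and creation are passed in as callables so the sketch does not depend on a running mlflow server.

```python
import logging
import os
from typing import Callable, Mapping, Optional

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("syngen_mlflow_sketch")


def resolve_experiment(
    fallback_name: str,
    experiment_exists: Callable[[str], bool],
    create_experiment: Callable[[str], object],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the experiment name, create it if missing, and log per the four cases.

    fallback_name is an assumption: a name already derived from
    'table_name' or 'metadata_path' by the caller.
    """
    env = os.environ if env is None else env
    explicit_name = env.get("MLFLOW_EXPERIMENT_NAME")
    name = explicit_name or fallback_name

    if experiment_exists(name):
        # Cases 2 and 4: reuse the existing experiment, log at WARNING level.
        logger.warning(
            "Experiment '%s' already exists; new runs will be sent there", name
        )
        return name

    # Case 1 (name provided) logs at INFO; case 3 (name derived) logs at WARNING.
    level = logging.INFO if explicit_name else logging.WARNING
    logger.log(level, "Experiment '%s' does not exist and will be created", name)
    create_experiment(name)
    return name
```

With the real mlflow client, `experiment_exists` could wrap `mlflow.get_experiment_by_name(name) is not None` and `create_experiment` could be `mlflow.create_experiment`, followed by `mlflow.set_experiment(name)` so that new runs land in the chosen experiment.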