Open pnimeesha opened 1 month ago
Hi,
The labels are part of the training and sampling process, as shown in the Google Colab. Below is the sampling step where the label is fed into the model.
These labels are associated with the generated samples, so you only need to concatenate the generated samples with these labels if you want them in a single dataframe. Hope that helps.
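The concatenation described above could look roughly like the sketch below. It is only an illustration: `sample_labels` and `generated` are hypothetical stand-ins for the label vector passed to the sampler and the array it returns (filled with random placeholder values here), and the feature names are an illustrative subset, not the Colab's actual variables.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins: in practice these would be the labels fed to the
# sampler and the generated rows it returns, in the same row order.
rng = np.random.default_rng(0)
n_samples = 5
feature_names = ["LIMIT_BAL", "AGE", "BILL_AMT1"]  # illustrative subset of columns
label_name = "default payment next month"

sample_labels = rng.integers(0, 2, size=n_samples)             # labels used for sampling
generated = rng.normal(size=(n_samples, len(feature_names)))   # placeholder for model output

# Attach the labels column-wise: row i keeps the label it was conditioned on.
synthetic_df = pd.DataFrame(generated, columns=feature_names)
synthetic_df[label_name] = sample_labels

# Features and target, ready for a supervised model.
X_syn = synthetic_df.drop(columns=[label_name])
y_syn = synthetic_df[label_name]
```

This relies on the generated rows and the labels staying in the same order, so shuffle or filter only after they are in one dataframe.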
Best regards, Timur
Hi,
Thank you so much for the clarification! I have a couple more questions:
Thank you!
Hi,
I ran the sample code from the Google Colab. The samples generated by the diffusion model do not have labels. For the credit card data used in the Colab code, the label column is 'default payment next month'. How can I run the machine learning efficacy evaluation (referred to as utility in the paper, for which code is not available in the Colab) when the evaluation models you mention are all supervised (Random Forest, Decision Tree, Logistic Regression, AdaBoost, and Naive Bayes)? I wrote code for the utility metric and tried to test it, and realised that the label column is missing from the synthetic data. Could you please let me know how this can be done without labels in the synthetic data?
Thanks in advance!