Closed sirutBuasai closed 1 year ago
We're not using variables
num_divs
andshow_heatmap
here, safe to remove them?
Yes
Same here, we're not using
features
,plot
,num_divs
, andshow_heatmap
. Do you think it's safe to remove those variables? I intuitively think we can removeplot
,num_divs
, andshow_heatmap
but thefeatures
variable is a bit iffy to me. What did we usefeatures
for and why do we not use them anymore?
Yes all four are safe. Originally "features" was going to be used because the parser that comes with the ExtraSensory dataset returns the data and a list of features as strings, and I had planned on using that list.
Same here, safe to remove
plot
,num_divs
, andshow_heatmap
?
Yes
These variables are unused across multiple functions in this file. Any reason we declare them or are they safe to remove?
Nope, not really any reason, that was part of refactoring the original code. Safe to remove
This utils function is used in both the
reverse_diffusion
function and thereverse_tabular_diffusion
function. Is thereverse_diffusion
function still being used and should it be kept? If so, I will separateutils.noise_estimation_loss
intoutils.diffusion_noise_estimation
andutils.tabular_diffusion_noise_estimation
. If not, can we remove thereverse_diffusion
function altogether?
reverse_categorical_diffusion
seems abandoned because we are now using tabular diffusion with both discrete and continuous features. Should we try to fix this function back to its functional state or should we just remove it? Might be a good idea to fix it back to a functional state just in case but if you think we can remove it I'll remove it in the next commit. Also, if we plan to remove all the purely categorical stuff, we should probably go through theutils.py
file to remove the utility function for that too which is why I think restoringreverse_categorical_diffusion
back to a functional state might be easier.Errors in
reverse_categorical_diffusion
: Theinput_size
argument is missing for theConditionalMultinomialModel
class.The
x_0_continuous
argument is missing for theutils.multinomial_diffusion_noise_estimation
function.The
continuous
argument is missing for theutils.get_discrete_model_output
function.
These last two I think can be answered the same way. This may require some testing and is up to some of your choice, but originally I thought that transitioning to tabular meant modifying the code and model, and that would require the diffusion to always be tabular. Given what Walter said, however, that we can simply comment out the loss of continuous and use only discrete data for a full categorical diffusion, and comment out discrete and use only continuous loss for continuous diffusion, I believe we can keep just the reverse_tabular_diffusion
.
My thinking was to keep the prior code to have the working and have the option of using just this, but noting this change it might be fine just to keep the tabular. This would require some minor commenting out, just changing
to
# multinomial_loss = utils.multinomial_diffusion_noise_estimation(model, batch_x_discrete, batch_x_continuous, diffusion, k, feature_indices)
continuous_loss = utils.noise_estimation_loss(model, batch_x_continuous, batch_x_discrete, feature_indices, k, alphas_bar_sqrt, one_minus_alphas_bar_sqrt, num_steps)
loss = continuous_loss
So I think the options are:
reverse_diffusion
which is just continuous diffusion and reverse_categorical_diffusion
and then keep just tabular diffusion and comment out/add parameter to choose just one or the other. You may also have to change the output in some way and how you use the model as I have done already in here with a flag for continuous data or not
https://github.com/waltergerych/advanced_gen_modeling_mqp_2022_2023/blob/7741e94ecfee1ba51958d092d396c642f0f2dcd1/diffusion_models/diffusion.py#L385reverse_diffusion
and reverse_categorical_diffusion
(and probably rename reverse_diffusion
to reverse_continuous_diffusion
) and fix the errors in the categorical one and also have one for reverse_tabular_diffusion
. Then you would need to separate the utils.noise_estimation_loss
into two separate ones for tabular and continuous and potentially discrete if needed.Replies noted. Changes will be pushed in the upcoming commit.
We're not using variables
num_divs
andshow_heatmap
here, safe to remove them? https://github.com/waltergerych/advanced_gen_modeling_mqp_2022_2023/blob/a6eafa7087f389c52ff73232954e987a6f248801/diffusion_models/diffusion.py#L154Same here, we're not using
features
,plot
,num_divs
, andshow_heatmap
. Do you think it's safe to remove those variables? I intuitively think we can removeplot
,num_divs
, andshow_heatmap
but thefeatures
variable is a bit iffy to me. What did we usefeatures
for and why do we not use them anymore? https://github.com/waltergerych/advanced_gen_modeling_mqp_2022_2023/blob/a6eafa7087f389c52ff73232954e987a6f248801/diffusion_models/diffusion.py#L242Same here, safe to remove
plot
,num_divs
, andshow_heatmap
? https://github.com/waltergerych/advanced_gen_modeling_mqp_2022_2023/blob/a6eafa7087f389c52ff73232954e987a6f248801/diffusion_models/diffusion.py#L315These variables are unused across multiple functions in this file. Any reason we declare them or are they safe to remove? https://github.com/waltergerych/advanced_gen_modeling_mqp_2022_2023/blob/a6eafa7087f389c52ff73232954e987a6f248801/diffusion_models/diffusion.py#L172-L174 https://github.com/waltergerych/advanced_gen_modeling_mqp_2022_2023/blob/a6eafa7087f389c52ff73232954e987a6f248801/diffusion_models/diffusion.py#L263-L267 https://github.com/waltergerych/advanced_gen_modeling_mqp_2022_2023/blob/a6eafa7087f389c52ff73232954e987a6f248801/diffusion_models/diffusion.py#L336-L338
This utils function is used in both the
reverse_diffusion
function and thereverse_tabular_diffusion
function. Is thereverse_diffusion
function still being used and should it be kept? If so, I will separateutils.noise_estimation_loss
intoutils.diffusion_noise_estimation
andutils.tabular_diffusion_noise_estimation
. If not, can we remove thereverse_diffusion
function altogether? https://github.com/waltergerych/advanced_gen_modeling_mqp_2022_2023/blob/a6eafa7087f389c52ff73232954e987a6f248801/diffusion_models/diffusion.py#L202reverse_categorical_diffusion
seems abandoned because we are now using tabular diffusion with both discrete and continuous features. Should we try to fix this function back to its functional state or should we just remove it? Might be a good idea to fix it back to a functional state just in case but if you think we can remove it I'll remove it in the next commit. Also, if we plan to remove all the purely categorical stuff, we should probably go through theutils.py
file to remove the utility function for that too which is why I think restoringreverse_categorical_diffusion
back to a functional state might be easier.Errors in
reverse_categorical_diffusion
: Theinput_size
argument is missing for theConditionalMultinomialModel
class. https://github.com/waltergerych/advanced_gen_modeling_mqp_2022_2023/blob/a6eafa7087f389c52ff73232954e987a6f248801/diffusion_models/diffusion.py#L274The
x_0_continuous
argument is missing for theutils.multinomial_diffusion_noise_estimation
function. https://github.com/waltergerych/advanced_gen_modeling_mqp_2022_2023/blob/a6eafa7087f389c52ff73232954e987a6f248801/diffusion_models/diffusion.py#L294The
continuous
argument is missing for theutils.get_discrete_model_output
function. https://github.com/waltergerych/advanced_gen_modeling_mqp_2022_2023/blob/a6eafa7087f389c52ff73232954e987a6f248801/diffusion_models/diffusion.py#L306