Closed: jcbnose closed this issue 6 years ago
Thanks for your interest in our research,
1) It's good that you managed to reproduce the results. Do try to replicate them with the same number of random seeds as we did (see the preprint of our paper for the details) and report the variance as well.
2) We followed the same methodology as the original DAGMM paper (Zong et al., 2018) and the DSEBM paper (Zhai et al., 2016). As their code is not open source, we contacted them several times by email to ensure our setup was exactly the same. This also covers the model fine-tuning and the hyperparameter-search methodology. You may also email them for their advice.
3) There is no such straightforward link between the training losses and the performance of the model. The training losses are adversarial (see the original BiGAN paper), whereas our scoring loss is derived from the AnoGAN paper (our main baseline) and is a reconstruction-based loss (a rough sketch of this scoring is included below).
4) For the architecture search, one empirical observation we made (especially since there are not many papers on GANs with non-image data, so reference architectures are scarce) was that it is always better to find a GAN architecture, fine-tune it, and then build on it to find a BiGAN architecture. You may have a look at the architecture details in the appendix of the preprint.
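For illustration, here is a minimal sketch of how such a combined reconstruction/discriminator score could be computed at test time. The function and variable names are placeholders, not the repository's actual API, and the discriminator term depends on the `m` option (feature matching is shown; 'cross-e' would use a cross-entropy term instead):

```python
import numpy as np

def combined_anomaly_score(x, x_rec, feat_x, feat_rec, weight=0.1):
    """Hypothetical AnoGAN-style scoring for a BiGAN anomaly detector.

    x        : test samples
    x_rec    : their reconstructions G(E(x))
    feat_x   : discriminator features (or logits) for x
    feat_rec : discriminator features (or logits) for x_rec
    """
    # Reconstruction-based score in input space
    gen_score = np.sum(np.abs(x - x_rec), axis=1)
    # Discriminator-based score (feature matching shown here)
    dis_score = np.sum(np.abs(feat_x - feat_rec), axis=1)
    # Convex combination: higher score means more anomalous
    return (1 - weight) * gen_score + weight * dis_score
```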
Thanks
Thanks for your reply, I really appreciate it.

1a. I put the arguments (including the random seed) on the second line of my original post, so I believe I used the same parameters. What I don't understand is why the results were good after the first 100 epochs, but when the run picked up the saved model and trained for more epochs they became much worse.
1b. When you mentioned the preprint, did you mean this one [ https://arxiv.org/abs/1802.06222 ]? If not, how can I access it?
My pleasure,
1a. There are many possible reasons for this, and I don't remember the exact properties of the dataset. But some categories of "attacks" (which are treated as "normal" in this task; please see the original papers and ours for the explanation of this choice) may be more represented in the data than other "attacks", so the model may overfit one category or another when you train it for too long. In general I would really recommend looking at the properties of this particular dataset, and at BiGAN/ALI training trends on other datasets (e.g. image data), to find possible explanations (a quick check of the label distribution is sketched below).
1b. Yes.
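A quick, hypothetical way to inspect the KDDCup99 label distribution; the file name and label column are assumptions, not taken from the repository's data loader:

```python
import pandas as pd

# Load a local copy of the KDDCup99 data (path is a placeholder).
df = pd.read_csv("kddcup.data_10_percent_corrected", header=None)
labels = df.iloc[:, -1]  # last column holds the connection label
print(labels.value_counts(normalize=True))
# If one or two attack categories dominate the class treated as "normal",
# longer training can push the model to overfit those categories.
```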
Trying to replicate the KDD result in the paper (i.e. cross-e: P/R/F1 = 0.9200/0.9582/0.9372). I ran the model as-is using: [bigan, kdd, w=0.1, m='cross-e', d=1, rd=42]

The first 100 epochs gave a great result:

Epoch 97 | time = 15s | loss gen = 36.8243 | loss enc = 391.7180 | loss dis = 0.2705
Epoch 98 | time = 17s | loss gen = 39.3836 | loss enc = 418.4268 | loss dis = 0.2501
Epoch 99 | time = 15s | loss gen = 34.9579 | loss enc = 437.5724 | loss dis = 0.1583
Testing evaluation...
Testing : Prec = 0.9434 | Rec = 0.9584 | F1 = 0.9509
Then I ran another 100 epochs and got a much worse result; the losses didn't seem to converge (assuming the code picked up the model where it left off):

Epoch 97 | time = 15s | loss gen = 1.0412 | loss enc = 91.7826 | loss dis = 1.0371
Epoch 98 | time = 15s | loss gen = 1.0412 | loss enc = 94.2479 | loss dis = 1.0378
Epoch 99 | time = 17s | loss gen = 1.0432 | loss enc = 97.2879 | loss dis = 1.0332
Testing evaluation...
Testing : Prec = 0.3215 | Rec = 0.3266 | F1 = 0.3240
Questions:

1) May I know what may have gone wrong? I downloaded the code and ran it as is. I also ran for 1000 epochs but the result was not good at all. I once even ran it for 5000 epochs and got this result:

Epoch 4999 | time = 15s | loss gen = 3.2600 | loss enc = 329.4751 | loss dis = 0.3632
Testing : Prec = 0.1924 | Rec = 0.5634 | F1 = 0.2869

2) How come in the 2nd run the loss gen and loss enc were much smaller but F1 got worse? Shouldn't lower losses mean a better F1? (Given that: list_scores = (1 - weight) * gen_score + weight * dis_score; a rough sketch of how I understand the test-time evaluation is included below.)

3) Why does loss enc always seem so big? Is there any way to improve it? Note that I tried to add a layer_3 to the encoder and/or to add dropout in the encoder and generator, but it didn't help.
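For reference, this is how I understand the scores are turned into P/R/F1 at test time; it is only a rough sketch, and the function name and the 20% anomaly fraction are my assumptions, not the repository's code:

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

def evaluate_from_scores(scores, y_true, anomaly_fraction=0.2):
    """Flag the highest-scoring fraction of test points as anomalies
    and compute precision/recall/F1 against the true labels.
    The 0.2 fraction is assumed; the paper/repo may use a different
    test-time threshold."""
    thresh = np.percentile(scores, 100 * (1 - anomaly_fraction))
    y_pred = (scores >= thresh).astype(int)
    prec, rec, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="binary")
    return prec, rec, f1
```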
Thanks