EoinKenny / AAAI-2021

Code for our paper

About .pickle and .npy files #3

Open lynoreading opened 4 months ago

lynoreading commented 4 months ago

Hello, I am planning to reproduce your code and use it for my final exam. About the pickle file: can you upload the code and the process for generating it? I guess the code is available at https://github.com/after_anon_review, but that is an invalid address.

EoinKenny commented 4 months ago

Hi, no problem, I'm happy to help. Which pickle file are you talking about?

Do you mean the "latent_g_saved_input" folder and the .pt files?

lynoreading commented 4 months ago

Thank you very much for your answer. As you mentioned, I also have questions about the "latent_g_saved_input" folder and the .pt files. They're used to produce z, as in "z = torch.load("data/latent_g_input_saved/incorrectlatent/misclassify" + str(rand_num) + ".pt")". I thought this should be a random input to the generator, so why do we use a fixed input?

You also provided an address for training all the models, but that address is no longer valid. Can you provide more information on this, including how to generate "data/distribution_data/X_train_act.npy"? I chose your article as the topic for my final exam; the focus is to reproduce the code and test it on a different dataset. Your advice is crucial to me, and again, thank you.

lynoreading commented 4 months ago

As for the .pickle file, I've managed to fix it. Thank you. LOL

EoinKenny commented 4 months ago

Oh that's great you got the pickle file sorted! Let's see about the rest.

So, that "after_anon_review" directory would now just be this repository.

If you're doing a new dataset though, you will unfortunately have to train it all from scratch. There's no great secret to training the models, but computing the IM1 and IM2 metrics is definitely a pain, so I recommend avoiding those if you're tight on time. The NN-Dist metric is easy though, and honestly it works best in my experience across a wide variety of problems.
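
For reference, here's a minimal sketch of what NN-Dist could look like, assuming it's simply the L2 distance from an explanation to its nearest neighbour in the training set (check the paper for the exact formulation):

import numpy as np

def nn_dist(x, X_train):
    # Hypothetical sketch: L2 distance from explanation x to its nearest
    # training instance, with both flattened to vectors.
    diffs = X_train.reshape(len(X_train), -1) - x.reshape(1, -1)
    return np.linalg.norm(diffs, axis=1).min()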

I also recommend avoiding the CEM paper comparison if possible, as using Keras is also a giant pain in the ass.

"data/distribution_data/X_train_act.npy" would just be generated like so I think:

import numpy as np

X_train_act = list()
for i in range(len(training_data)):
    pred, activations = cnn(training_data[i])  # cnn returns (logits, last-layer ReLU activations)
    X_train_act.append(activations.detach().cpu().numpy())
np.save("data/distribution_data/X_train_act.npy", np.array(X_train_act))

That's literally it I think.

The activations are the ReLU activations in the last layer (just before the softmax), and the X_train prefix means they're computed over the training data.
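
In other words, the model's forward pass would return a pair like this (a hypothetical sketch; the layer names and sizes in the repo will differ):

import torch.nn as nn
import torch.nn.functional as F

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(1, 32, 3), nn.ReLU(), nn.Flatten())
        self.fc = nn.Linear(32 * 26 * 26, 128)
        self.out = nn.Linear(128, 10)

    def forward(self, x):
        acts = F.relu(self.fc(self.conv(x)))  # last-layer ReLU activations
        return self.out(acts), acts           # logits (pre-softmax) and activations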


If you need the code for all of this, I can search my hard drive when I get home, but it was years ago and I'm afraid I may have lost the training code for all the different models (there was so much of it).

I think if you need to reproduce this with a different dataset, your best bet is FashionMNIST; it's so similar to MNIST that it won't require a lot of work.
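
For what it's worth, FashionMNIST is a drop-in replacement in torchvision:

from torchvision import datasets, transforms

train_set = datasets.FashionMNIST("data/", train=True, download=True,
                                  transform=transforms.ToTensor())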

You can probably find pre-trained GANs for FashionMNIST online, or at least the code to train one, which shouldn't take long. The hardest part will be getting that GAN.

Another annoying thing I left out of the code was generating all the "latent_g_input_saved" files. To do this, you have to optimise a loss like Eq. 1 from the paper here.
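
If it helps, a rough sketch of that latent-recovery step might look like this, assuming Eq. 1 boils down to a reconstruction loss between G(z) and the query image (check the paper for the exact terms; G and x here stand for your trained generator and query image):

import torch

# G: trained generator, x: query image tensor (both assumed already loaded)
z = torch.randn(1, 100, requires_grad=True)      # 100 is a typical latent size; match yours
opt = torch.optim.Adam([z], lr=0.01)
for step in range(1000):
    opt.zero_grad()
    loss = ((G(z) - x) ** 2).mean()              # stand-in for the paper's Eq. 1 loss
    loss.backward()
    opt.step()
torch.save(z.detach(), "data/latent_g_input_saved/incorrectlatent/misclassify0.pt")  # path pattern from above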

If you struggle, let me know.