marzekan / EC-GAN_NIDS

Application of novel EC-GAN method on Network Intrusion Detection

A Predictions file is missing #1

Open fly2tortoise opened 2 years ago

fly2tortoise commented 2 years ago

Dear Doctor, good afternoon. The Data_sampling file in your package indicates that I am missing a "predictions" file. Could you please tell me how to deal with this bug? Thanks a million. [image]

marzekan commented 2 years ago

Hi! Thank you for reporting this issue!

There was a problem with importing the predictions module. It is fixed now and should work after you pull the latest changes. Let me know if it's still not working.

P.S. I'm not a Doc :D

fly2tortoise commented 2 years ago

Dear researcher, my previous guess was to import predictionS2 from the Exps package into the dataset processing section. Fortunately, the results are consistent with your updated version. However, I now have a new bug that I would like to ask you about. When I was training EGan, an error occurred at the compile step. My TensorFlow version is 2.3 and my NumPy version is 1.18. Thanks a million. [image] Your research is very innovative and has greatly supported my current research. [image]

marzekan commented 2 years ago

Hello, thanks for the kind words!

> Dear researcher, my previous guess was to import predictionS2 from the Exps package

You were right to do so; that was the issue!


Try installing TensorFlow 2.5.0 and NumPy 1.19.5. I was able to reproduce your bug by downgrading TensorFlow to 2.3, so upgrading your TensorFlow version is probably the solution.


Also, I added a Pipfile and a requirements.txt listing all dependencies with their correct versions, so you can now install everything with pipenv or with pip install -r requirements.txt.
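After installing, a quick way to confirm that the environment matches the versions suggested above is to compare the installed package versions against them. The snippet below is only a minimal sketch: the `RECOMMENDED` mapping and the `check_versions` helper are hypothetical names introduced here, not part of the repository, and the version numbers are simply the ones mentioned in this thread.

```python
from importlib import metadata

# Versions suggested in this thread; adjust if requirements.txt says otherwise.
# Note: TensorFlow installed under another distribution name (for example
# "tensorflow-gpu") will not be found under "tensorflow".
RECOMMENDED = {"tensorflow": "2.5.0", "numpy": "1.19.5"}


def check_versions(recommended):
    """Return {package: (installed_or_None, recommended, matches)}."""
    report = {}
    for name, wanted in recommended.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, wanted, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, wanted, ok) in check_versions(RECOMMENDED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: installed={installed} recommended={wanted} [{status}]")
```

A MISMATCH line for tensorflow would point to the same situation that reproduced the compile error with version 2.3.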

Let me know if any more bugs show up! :)

fly2tortoise commented 2 years ago

Dear researcher, Merry Christmas to you, and thank you for the detailed reply. I would like to take the liberty of asking you about a big problem that I have been unable to solve. I don't know whether you have tried using DCGAN to generate data. My collaborator applied DCGAN to UNSW-NB15, and the classification accuracy reached 99.5%. However, there is a problem when it is used in extremely imbalanced situations, such as Heartbleed in IDS2017, which has only 11 samples among 2 million records. Data generated by an oversampling method, or by a GAN producing 10 times as many samples, may not classify as accurately as simply copying the original samples 10 times. But copying them 10 or 100 times, although it improves the classification results, is not helpful for actual research. There seems to be a performance barrier here, and there should be a mathematical explanation for it. If you have some insight into this, could you share and discuss it a bit? Thank you so much. [image]

marzekan commented 2 years ago

Hi, sorry for the long wait.

I haven't tried DCGAN to generate data. I know that deep convolutional networks are mostly used for processing images, but because my work dealt with tabular data, I opted for other approaches.

> My collaborator has treated UNSW-NB15 with DCGAN, and the classification accuracy can reach 99.5%.

That is a great result. It would be interesting to see the other metrics in this case, such as F1 score and false positive rate.
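To illustrate why accuracy alone can be misleading on heavily imbalanced data, here is a small sketch that computes several metrics from confusion-matrix counts. The `binary_metrics` helper is a hypothetical name, and the example counts are an assumption built from the "11 in 2 million" figure mentioned earlier in the thread: a made-up classifier that never predicts the rare class.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common metrics from confusion-matrix counts.

    Ratios with a zero denominator are reported as 0.0 by convention.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "fpr": fpr}


# Hypothetical classifier that never predicts the rare class:
# 11 positives among 2,000,000 records, all of them missed.
m = binary_metrics(tp=0, fp=0, tn=1_999_989, fn=11)
print(m)
```

In this made-up case the accuracy is above 99.999% while recall and F1 are both 0, which is why the additional metrics matter.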

As far as I understand, we use GANs and oversampling techniques to generate data that is similar to, but not identical to, the original data. This way we try to make our classifiers more robust by not feeding them large numbers of exact copies of the same samples over and over.
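The difference between copying samples and generating similar ones can be sketched in a few lines. This is only an illustration under stated assumptions: the feature vectors are made up, `duplicate` and `interpolate` are hypothetical helpers, and the interpolation is a simplified SMOTE-style scheme that pairs random minority samples (real SMOTE interpolates towards one of the k nearest neighbours). It is not the method used in this repository.

```python
import random


def duplicate(samples, factor):
    """Random-oversampling baseline: repeat every minority sample `factor` times."""
    return [s for s in samples for _ in range(factor)]


def interpolate(samples, n_new, rng):
    """SMOTE-style sketch: new points on the segment between two random
    minority samples (real SMOTE picks a k-nearest neighbour instead)."""
    new_points = []
    for _ in range(n_new):
        a, b = rng.sample(samples, 2)
        t = rng.random()  # position along the segment, in [0, 1)
        new_points.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return new_points


rng = random.Random(0)
# Made-up two-feature minority samples, for illustration only.
minority = [(0.1, 1.0), (0.2, 1.1), (0.15, 0.9), (0.3, 1.2)]

copies = duplicate(minority, 10)
synthetic = interpolate(minority, 40, rng)

print(len(copies), len(set(copies)))        # 40 rows, but only 4 distinct points
print(len(synthetic), len(set(synthetic)))  # 40 rows; distinct count depends on the seed
```

Duplication grows the class without adding any new points, whereas interpolation fills in the region between existing samples. Whether those extra points help a classifier depends on how well they reflect the true minority distribution.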

> There is actually a performance barrier here, and there should be a mathematically relevant content that can be explained.

Could you please elaborate on this part some more? I'm not sure I understand.