Fitting error keeps on repeating

AryanSaeedi commented 10 months ago

Hi, Thank you for sharing the repo. I am writing my master's thesis and I am trying to use your model to generate synthetic network traffic. However, I am facing a problem with the TensorFlow placeholder while trying to fit the model. The error message keeps repeating itself over and over again. Do you have any suggestions on how to maybe fix it? While it does seem like a warning, it keeps on repeating. Thank you, Aryan

AryanSaeedi commented 10 months ago

Never mind, the process starts after a minute of showing those warnings. however, I can't seem to sample I get an error. Could you please let me know what could be the problem?

AryanSaeedi commented 10 months ago

If I set the timeout to False, it starts generating new samples but it takes a lot of time. I want to generate around 41000 instances and it shows that it will take more than two days.

glederrey commented 10 months ago

Hi,

I haven't worked on this for a couple of years. But I will try to help you.

Firstly, were you able to make to run the examples? If no, then it already means that there's an issue while setting up the library. If you could make it work, then it might come from the data and/or the definition of the DAG for the variables.

Did you correctly define the continuous variables as shown in the example?
How did you build the DAG? Did you use the advise function?

In the end, I think that the model is discarding too many samples during the sampling phase. You can set the parameter randomize to False to avoid discarding. But it will most likely give you bad results.

Last piece of advice, I saw that you're training the model on your CPU for 100 epochs. I would strongly advise to use a GPU to speed up the training process and train it for more than 100 epochs.

AryanSaeedi commented 9 months ago

Hi, thank you for the response. I didn't run the examples at first, but after I got your response I tried running the examples but got an error with the version of Numpy while trying to sample. I have tried downgrading Numpy and changing the bit of the code inv(yin) of pynverse package, but I couldn't solve the problem and rather created more. The first error I get is with the OneHotEncolder at the synthesizer.py line 113: _self.onehot = OneHotEncoder(categories=[np.array(self.varorder)], sparse=False) saying that sparse=False has been changed in the newer version of scikitlearn to _sparseoutput. After fixing this, the error that I get is when trying to run the sample command of the example file. Below is the issue that I get. I am not really sure which version of Numpy you were using at the time. I did some deductions based on when you created the repo and downgraded the Numpy version and other related libraries for compatibility issues, my problems were getting worse.

To answer your questions: I didn't define the DAG since I saw I don't really have. For the continuous variables, yes I did define them correctly and compared it with the example. Now, I am working with a GPU and the example file only takes 16 mins to fit.

glederrey commented 9 months ago

I haven't worked on this project for more than 2 years now. So, it's a bit out of date. Sorry about that. Anyway, I managed to run the example using the following procedure:

Install a clean environment with Python 3.7.
Install the datgan and the jupyter modules via pip
Downgrade the protobuf package via pip install protobuf==3.20.0
Delete the file encoded_data.pkl in the folder example/data/encoded_data

After all these steps, I was able to run the full notebook training.ipynb. I haven't checked the quality of the results since I ran it on my laptop and I don't have a GPU (just trained 2 epochs for each model)

If it helps, here's the list of all python libraries and their version used to make it work

Package                      Version
---------------------------- -------------------
absl-py                      2.1.0
anyio                        3.7.1
argon2-cffi                  23.1.0
argon2-cffi-bindings         21.2.0
astunparse                   1.6.3
attrs                        23.2.0
backcall                     0.2.0
beautifulsoup4               4.12.3
bleach                       6.0.0
cachetools                   5.3.2
certifi                      2023.11.17
cffi                         1.15.1
charset-normalizer           3.3.2
comm                         0.1.4
cycler                       0.11.0
datgan                       2.1.10
debugpy                      1.7.0
decorator                    5.1.1
defusedxml                   0.7.1
dill                         0.3.7
entrypoints                  0.4
exceptiongroup               1.2.0
fastjsonschema               2.19.1
flatbuffers                  23.5.26
fonttools                    4.38.0
gast                         0.5.4
google-auth                  2.27.0
google-auth-oauthlib         0.4.6
google-pasta                 0.2.0
grpcio                       1.60.0
h5py                         3.8.0
idna                         3.6
importlib-metadata           6.7.0
importlib-resources          5.12.0
ipykernel                    6.16.2
ipython                      7.33.0
ipython-genutils             0.2.0
ipywidgets                   8.1.1
jedi                         0.19.1
Jinja2                       3.1.3
joblib                       1.3.2
jsonschema                   4.17.3
jupyter                      1.0.0
jupyter_client               7.4.9
jupyter-console              6.6.3
jupyter_core                 4.12.0
jupyter-server               1.24.0
jupyterlab-pygments          0.2.2
jupyterlab-widgets           3.0.9
keras                        2.8.0
Keras-Preprocessing          1.1.2
kiwisolver                   1.4.5
libclang                     16.0.6
lightgbm                     4.3.0
Markdown                     3.4.4
MarkupSafe                   2.1.4
matplotlib                   3.5.3
matplotlib-inline            0.1.6
mistune                      3.0.2
nbclassic                    1.0.0
nbclient                     0.7.4
nbconvert                    7.6.0
nbformat                     5.8.0
nest-asyncio                 1.6.0
networkx                     2.6.3
notebook                     6.5.6
notebook_shim                0.2.3
numpy                        1.21.6
oauthlib                     3.2.2
opt-einsum                   3.3.0
packaging                    23.2
pandas                       1.3.5
pandocfilters                1.5.1
parso                        0.8.3
pexpect                      4.9.0
pickleshare                  0.7.5
Pillow                       9.5.0
pip                          22.3.1
pkgutil_resolve_name         1.3.10
prometheus-client            0.17.1
prompt-toolkit               3.0.42
protobuf                     3.20.0
psutil                       5.9.8
ptyprocess                   0.7.0
pyasn1                       0.5.1
pyasn1-modules               0.3.0
pycparser                    2.21
Pygments                     2.17.2
pynverse                     0.1.4.6
pyparsing                    3.1.1
pyrsistent                   0.19.3
python-dateutil              2.8.2
pytz                         2023.4
pyzmq                        24.0.1
qtconsole                    5.4.4
QtPy                         2.4.1
requests                     2.31.0
requests-oauthlib            1.3.1
rsa                          4.9
scikit-learn                 1.0.2
scipy                        1.7.3
Send2Trash                   1.8.2
setuptools                   65.6.3
six                          1.16.0
sniffio                      1.3.0
soupsieve                    2.4.1
tensorboard                  2.8.0
tensorboard-data-server      0.6.1
tensorboard-plugin-wit       1.8.1
tensorflow                   2.8.0
tensorflow-io-gcs-filesystem 0.34.0
termcolor                    2.3.0
terminado                    0.17.1
tf-estimator-nightly         2.8.0.dev2021122109
threadpoolctl                3.1.0
tinycss2                     1.2.1
tornado                      6.2
tqdm                         4.66.1
traitlets                    5.9.0
typing_extensions            4.7.1
urllib3                      2.0.7
wcwidth                      0.1.9
webencodings                 0.5.1
websocket-client             1.6.1
Werkzeug                     2.2.3
wheel                        0.38.4
widgetsnbextension           4.0.9
wrapt                        1.16.0
zipp                         3.15.0

AryanSaeedi commented 9 months ago

Thank you very much know the example is working. However, when I try to run my dataset I get an error. Do you think it has anything to do with not creating the DAG? I am not initializing a DAG.

Thank you for the previous responses. :)

AryanSaeedi commented 9 months ago

Update: I got it working, I think not initializing the DAG runs into problems. The model now kind of starts training but it doesn't show how much time it would take. I am using a 20GB Nvidia GPU, and it consumes the whole thing. As for the DAG, I have created it so that all of the other features depend on one, sort of like a hedgehog where the hedgehog itself is one feature and the spikes are the rest of the features.

glederrey commented 9 months ago

I don't know how many variables you have in your dataset. But it seems a bit weird that it takes so much time for so few epochs. I think you should compare with the example to analyze why you have such discrepancies. It's a bit difficult to know what's going on with your data since I don't know anything about it.

AryanSaeedi commented 9 months ago

Thank you very much for the response, I have been busy with my thesis. I somehow managed to get to run, but I still stumbled upon the first issue I had. Putting the randomize parameter to False leads to a KeyError 'index', I am not sure if the sample method generates an index or if the original data should have one already.

AryanSaeedi commented 9 months ago

I totally forgot to mention, I am using the CSE-CIC-IDS2018 dataset. It is a network traffic flow dataset and has 80 features with 79 being numeric and only one discrete. The number of instances differs depending on what you want to do, but in the one above it is around 41000 instances.

glederrey / DATGAN

Fitting error keeps on repeating #3