KichangKim / DeepDanbooru

AI based multi-label girl image classification system, implemented by using TensorFlow.
MIT License

Training finishes instantly with used_minibatch=0 and used_sample=0 #47

Closed DHG-Dav closed 2 years ago

DHG-Dav commented 2 years ago

Hello! I've been trying to get this running for a while. I had some outside help solving most of the problems, but I can't find a solution to this one:

Using Adam optimizer ...
Loading tags ...
Creating model (resnet_custom_v3) ...
Model : (None, 299, 299, 3) -> (None, 181)
Loading database ...
No checkpoint. Starting new training ... (2021-12-03 20:13:30.093662)
Shuffling samples (epoch 0) ...
Trying to change learning rate to 0.001 ...
Learning rate is changed to <tf.Variable 'learning_rate:0' shape=() dtype=float32, numpy=0.001> ...
Shuffling samples (epoch 1) ...
Trying to change learning rate to 0.001 ...
Learning rate is changed to <tf.Variable 'learning_rate:0' shape=() dtype=float32, numpy=0.001> ...
Shuffling samples (epoch 2) ...
Trying to change learning rate to 0.001 ...
Learning rate is changed to <tf.Variable 'learning_rate:0' shape=() dtype=float32, numpy=0.001> ...
Shuffling samples (epoch 3) ...
Trying to change learning rate to 0.001 ...
Learning rate is changed to <tf.Variable 'learning_rate:0' shape=() dtype=float32, numpy=0.001> ...
Shuffling samples (epoch 4) ...
Trying to change learning rate to 0.001 ...
Learning rate is changed to <tf.Variable 'learning_rate:0' shape=() dtype=float32, numpy=0.001> ...
Shuffling samples (epoch 5) ...
Trying to change learning rate to 0.001 ...
Learning rate is changed to <tf.Variable 'learning_rate:0' shape=() dtype=float32, numpy=0.001> ...
Shuffling samples (epoch 6) ...
Trying to change learning rate to 0.001 ...
Learning rate is changed to <tf.Variable 'learning_rate:0' shape=() dtype=float32, numpy=0.001> ...
Shuffling samples (epoch 7) ...
Trying to change learning rate to 0.001 ...
Learning rate is changed to <tf.Variable 'learning_rate:0' shape=() dtype=float32, numpy=0.001> ...
Shuffling samples (epoch 8) ...
Trying to change learning rate to 0.001 ...
Learning rate is changed to <tf.Variable 'learning_rate:0' shape=() dtype=float32, numpy=0.001> ...
Shuffling samples (epoch 9) ...
Trying to change learning rate to 0.001 ...
Learning rate is changed to <tf.Variable 'learning_rate:0' shape=() dtype=float32, numpy=0.001> ...
Saving model ... (per epoch {export_model_per_epoch})
A:\ANACONDA\envs\deeptag\lib\site-packages\keras\utils\generic_utils.py:494: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument.
  warnings.warn('Custom mask layers require a config and must override '
Saving model ...
Training is complete.
used_epoch=10, used_minibatch=0, used_sample=0

Everything seems to go well, except that it doesn't train. Is there a verbose mode, or some other way to find out what the problem is? The directories have been checked:

├─dataset
│  │  test21.sqlite
│  │
│  └─images
│      └─00
│              00000747f1d8598b622bd0d7ad364075.png
│              [...]
└─test17
        project.json
        tags.txt

And my JSON:

{
    "image_width": 299,
    "image_height": 299,
    "database_path": "A:\\NN\\DeepTag\\dataset\\test21.sqlite",
    "minimum_tag_count": 1,
    "model": "resnet_custom_v3",
    "minibatch_size": 2,
    "epoch_count": 10,
    "export_model_per_epoch": 10,
    "checkpoint_frequency_mb": 200,
    "console_logging_frequency_mb": 10,
    "optimizer": "adam",
    "learning_rate": 0.001,
    "rotation_range": [
        0.0,
        360.0
    ],
    "scale_range": [
        0.9,
        1.1
    ],
    "shift_range": [
        -0.1,
        0.1
    ],
    "mixed_precision": false
}

The tags.txt contains approximately 180 tags, all of which appear in the images (there are no "orphan" tags). Every image has 10 to 30 tags, and there are 7.7k images (I'm testing on a reduced dataset first).

Thank you for any help.

PS: here are all the packages and versions I have installed in this environment:

_tflow_select             2.3.0                       mkl
abseil-cpp                20210324.2           hd77b12b_0
absl-py                   0.13.0           py39haa95532_0
aiohttp                   3.8.1            py39h2bbff1b_0
aiosignal                 1.2.0              pyhd3eb1b0_0
astor                     0.8.1            py39haa95532_0
astunparse                1.6.3                      py_0
async-timeout             4.0.1              pyhd3eb1b0_0
attrs                     21.2.0             pyhd3eb1b0_0
blas                      1.0                         mkl
blinker                   1.4              py39haa95532_0
brotlipy                  0.7.0           py39h2bbff1b_1003
ca-certificates           2021.10.26           haa95532_2
cachetools                4.2.2              pyhd3eb1b0_0
certifi                   2021.10.8        py39haa95532_0
cffi                      1.15.0           py39h2bbff1b_0
charset-normalizer        2.0.4              pyhd3eb1b0_0
clang                     5.0                      pypi_0    pypi
click                     8.0.3              pyhd3eb1b0_0
colorama                  0.4.4                    pypi_0    pypi
cryptography              3.4.8            py39h71e12ea_0
cycler                    0.11.0                   pypi_0    pypi
dataclasses               0.8                pyh6d0b6a4_7
deepdanbooru              1.0.0                    pypi_0    pypi
flatbuffers               1.12                     pypi_0    pypi
fonttools                 4.28.2                   pypi_0    pypi
frozenlist                1.2.0            py39h2bbff1b_0
gast                      0.4.0              pyhd3eb1b0_0
giflib                    5.2.1                h62dcd97_0
google-auth               1.33.0             pyhd3eb1b0_0
google-auth-oauthlib      0.4.1                      py_2
google-pasta              0.2.0              pyhd3eb1b0_0
grpcio                    1.42.0           py39hc60d5dd_0
h5py                      3.6.0            py39h3de5c98_0
hdf5                      1.10.6               h7ebc959_0
icc_rt                    2019.0.0             h0cc432a_1
icu                       68.1                 h6c2663c_0
idna                      3.2                pyhd3eb1b0_0
imageio                   2.13.1                   pypi_0    pypi
importlib-metadata        4.8.1            py39haa95532_0
intel-openmp              2021.4.0          haa95532_3556
jpeg                      9d                   h2bbff1b_0
keras                     2.6.0                    pypi_0    pypi
keras-preprocessing       1.1.2              pyhd3eb1b0_0
kiwisolver                1.3.2                    pypi_0    pypi
libcurl                   7.78.0               h86230a5_0
libpng                    1.6.37               h2a8f88b_0
libprotobuf               3.14.0               h23ce68f_0
libssh2                   1.9.0                h7a1dbc1_1
markdown                  3.3.4            py39haa95532_0
matplotlib                3.5.0                    pypi_0    pypi
mkl                       2021.4.0           haa95532_640
mkl-service               2.4.0            py39h2bbff1b_0
mkl_fft                   1.3.1            py39h277e83a_0
mkl_random                1.2.2            py39hf11a4ad_0
multidict                 5.1.0            py39h2bbff1b_2
networkx                  2.6.3                    pypi_0    pypi
numpy                     1.21.2           py39hfca59bb_0
numpy-base                1.21.2           py39h0829f74_0
oauthlib                  3.1.1              pyhd3eb1b0_0
openssl                   1.1.1l               h2bbff1b_0
opt_einsum                3.3.0              pyhd3eb1b0_1
packaging                 21.3                     pypi_0    pypi
pillow                    8.4.0                    pypi_0    pypi
pip                       21.2.4           py39haa95532_0
protobuf                  3.14.0           py39hd77b12b_1
pyasn1                    0.4.8              pyhd3eb1b0_0
pyasn1-modules            0.2.8                      py_0
pycparser                 2.21               pyhd3eb1b0_0
pyjwt                     2.1.0            py39haa95532_0
pyopenssl                 21.0.0             pyhd3eb1b0_1
pyparsing                 3.0.6                    pypi_0    pypi
pyreadline                2.1              py39haa95532_1
pysocks                   1.7.1            py39haa95532_0
python                    3.9.7                h6244533_1
python-dateutil           2.8.2                    pypi_0    pypi
pywavelets                1.2.0                    pypi_0    pypi
requests                  2.26.0             pyhd3eb1b0_0
requests-oauthlib         1.3.0                      py_0
rsa                       4.7.2              pyhd3eb1b0_1
scikit-image              0.18.3                   pypi_0    pypi
scipy                     1.7.1            py39hbe87c03_2
setuptools                58.0.4           py39haa95532_0
setuptools-scm            6.3.2                    pypi_0    pypi
six                       1.16.0             pyhd3eb1b0_0
snappy                    1.1.8                h33f27b4_0
sqlite                    3.36.0               h2bbff1b_0
tensorboard               2.6.0                      py_1
tensorboard-data-server   0.6.0            py39haa95532_0
tensorboard-plugin-wit    1.6.0                      py_0
tensorflow                2.6.0           mkl_py39h31650da_0
tensorflow-base           2.6.0           mkl_py39h9201259_0
tensorflow-estimator      2.6.0              pyh7b7c402_0
termcolor                 1.1.0            py39haa95532_1
tifffile                  2021.11.2                pypi_0    pypi
tomli                     1.2.2                    pypi_0    pypi
typing-extensions         3.10.0.2             hd3eb1b0_0
typing_extensions         3.10.0.2           pyh06a4308_0
tzdata                    2021e                hda174b7_0
urllib3                   1.26.7             pyhd3eb1b0_0
vc                        14.2                 h21ff451_1
vs2015_runtime            14.27.29016          h5e58377_2
werkzeug                  2.0.2              pyhd3eb1b0_0
wheel                     0.35.1             pyhd3eb1b0_0
win_inet_pton             1.1.0            py39haa95532_0
wincertstore              0.2              py39haa95532_2
wrapt                     1.12.1           py39h196d8e1_1
yarl                      1.6.3            py39h2bbff1b_0
zipp                      3.6.0              pyhd3eb1b0_0
zlib                      1.2.11               h62dcd97_4

KichangKim commented 2 years ago

You should double the backslashes in the database path (or use forward slashes instead).

Change "database_path": "A:\NN\DeepTag\dataset\test21.sqlite", to "database_path": "A:/NN/DeepTag/dataset/test21.sqlite",
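For reference, a minimal sketch of how Python's standard json module treats the three spellings of this path. The helper name try_load is made up for illustration; the strings are raw literals, so each one is exactly the text that would sit in project.json.

```python
import json

# Raw strings so Python itself leaves the backslashes alone.
single = r'{"database_path": "A:\NN\DeepTag\dataset\test21.sqlite"}'
double = r'{"database_path": "A:\\NN\\DeepTag\\dataset\\test21.sqlite"}'
forward = r'{"database_path": "A:/NN/DeepTag/dataset/test21.sqlite"}'


def try_load(text):
    """Return the parsed path, or None if the text is not valid JSON."""
    try:
        return json.loads(text)["database_path"]
    except json.JSONDecodeError:
        return None


single_path = try_load(single)    # rejected: "\N" is not a valid JSON escape
double_path = try_load(double)    # each "\\" decodes to one backslash
forward_path = try_load(forward)  # forward slashes need no escaping
print(single_path, double_path, forward_path)
```

Only the single-backslash form is rejected, so a config that loads without a JSONDecodeError already has its backslashes escaped correctly.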

DHG-Dav commented 2 years ago

Yes, I already do that, otherwise I get an error message. However, when I copy-paste on GitHub, the double backslash is replaced with a single one. Maybe because I didn't use the "code" formatting?

{
    "image_width": 299,
    "image_height": 299,
    "database_path": "A:\\NN\\DeepTag\\test1\\test21.sqlite",
    "minimum_tag_count": 15,
    "model": "resnet_custom_v3",
    "minibatch_size": 2,
    "epoch_count": 10,
    "export_model_per_epoch": 10,
    "checkpoint_frequency_mb": 200,
    "console_logging_frequency_mb": 10,
    "optimizer": "adam",
    "learning_rate": 0.001,
    "rotation_range": [
        0.0,
        360.0
    ],
    "scale_range": [
        0.9,
        1.1
    ],
    "shift_range": [
        -0.1,
        0.1
    ],
    "mixed_precision": false
}

In other words, the problem doesn't come from this. As an example, here is what happens when I remove the double backslashes:

Traceback (most recent call last):
  File "A:\ANACONDA\envs\deeptag\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "A:\ANACONDA\envs\deeptag\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "A:\ANACONDA\envs\deeptag\Scripts\deepdanbooru.exe\__main__.py", line 7, in <module>
  File "A:\ANACONDA\envs\deeptag\lib\site-packages\click\core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "A:\ANACONDA\envs\deeptag\lib\site-packages\click\core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "A:\ANACONDA\envs\deeptag\lib\site-packages\click\core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "A:\ANACONDA\envs\deeptag\lib\site-packages\click\core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "A:\ANACONDA\envs\deeptag\lib\site-packages\click\core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "A:\ANACONDA\envs\deeptag\lib\site-packages\deepdanbooru\__main__.py", line 52, in train_project
    dd.commands.train_project(project_path, source_model)
  File "A:\ANACONDA\envs\deeptag\lib\site-packages\deepdanbooru\commands\train_project.py", line 30, in train_project
    project_context = dd.io.deserialize_from_json(project_context_path)
  File "A:\ANACONDA\envs\deeptag\lib\site-packages\deepdanbooru\io\__init__.py", line 13, in deserialize_from_json
    return json.loads(stream.read())
  File "A:\ANACONDA\envs\deeptag\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "A:\ANACONDA\envs\deeptag\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "A:\ANACONDA\envs\deeptag\lib\json\decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid \escape: line 4 column 25 (char 75)

So in that case I do get an error message.

But my actual problem doesn't raise any error: it just doesn't train and doesn't use the images. I just edited my first message with the "code" tags (using Ctrl+E) so you don't get confused further.

KichangKim commented 2 years ago

Can you share test21.sqlite?

DHG-Dav commented 2 years ago

Shader? You mean share? Here is a picture: https://i.postimg.cc/mgW7hZwB/004331.png And here is the file itself: https://we.tl/t-RO5ALQmiHd

KichangKim commented 2 years ago

I got it. DeepDanbooru currently filters images by file extension and uses only png, jpeg, or jpg (lower case). But your database contains PNG or JPG.

Change the extensions to lower case, or modify the DeepDanbooru source code directly in your local copy: https://github.com/KichangKim/DeepDanbooru/blob/master/deepdanbooru/data/dataset.py
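A rough sketch of the first option, run against an in-memory stand-in rather than the real test21.sqlite. The table and column names (posts, md5, file_ext) and the exact filter are assumptions made for illustration; check them against your own database and dataset.py before running anything similar on real data.

```python
import sqlite3

# Hypothetical stand-in for the real database; the schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, md5 TEXT, file_ext TEXT)")
conn.executemany(
    "INSERT INTO posts (md5, file_ext) VALUES (?, ?)",
    [("aaa", "PNG"), ("bbb", "JPG"), ("ccc", "png")],
)


def count_usable(connection):
    """Count rows whose extension passes a case-sensitive lower-case filter."""
    row = connection.execute(
        "SELECT COUNT(*) FROM posts WHERE file_ext IN ('png', 'jpg', 'jpeg')"
    ).fetchone()
    return row[0]


before = count_usable(conn)  # upper-case extensions are silently skipped

# Fix the data rather than the code: normalise every extension to lower case.
conn.execute("UPDATE posts SET file_ext = lower(file_ext)")
conn.commit()

after = count_usable(conn)
print(before, after)
```

SQLite compares text case-sensitively by default, which is why rows stored as PNG or JPG never match and training ends with used_sample=0 without raising an error. If the image paths are built from the stored extension, the files on a case-sensitive filesystem may also need renaming to match.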

DHG-Dav commented 2 years ago

OMG, that was so trivial and I lost my mind over it for days. Thank you so much, I'll try it right away! Edit: it works! After so much frustration and effort it finally works. Thank you so much, I can't tell you how happy I am right now!