deepfakes / faceswap

Deepfakes Software For All
https://www.faceswap.dev
GNU General Public License v3.0
52.03k stars, 13.18k forks

Training crash #1300

Closed by RedmondLee 1 year ago

RedmondLee commented 1 year ago

Describe the bug

Hello everyone. I am training on a GTX 1070 with 8 GB of memory. I used the "extract" option to generate two folders of training material, A1 and B2, in which every image is 128 × 128. When I switch to the training tab and enter the parameters, training ends a few dozen seconds after it starts, no matter which trainer I select.

Crash Report

01/28/2023 04:18:15 MainProcess     _training                      _base           set_timelapse_feed             DEBUG    Setting preview feed: (side: 'a', images: 498)
01/28/2023 04:18:15 MainProcess     _training                      _base           _load_generator                DEBUG    Loading generator, side: a, is_display: True,  batch_size: 14
01/28/2023 04:18:15 MainProcess     _training                      generator       __init__                       DEBUG    Initializing PreviewDataGenerator: (model: villain, side: a, images: 498 , batch_size: 14, config: {'centering': 'face', 'coverage': 87.5, 'icnr_init': False, 'conv_aware_init': False, 'optimizer': 'adam', 'learning_rate': 5e-05, 'epsilon_exponent': -7, 'autoclip': False, 'reflect_padding': False, 'allow_growth': False, 'mixed_precision': False, 'nan_protection': True, 'convert_batchsize': 16, 'loss_function': 'ssim', 'loss_function_2': 'mse', 'loss_weight_2': 100, 'loss_function_3': None, 'loss_weight_3': 0, 'loss_function_4': None, 'loss_weight_4': 0, 'mask_loss_function': 'mse', 'eye_multiplier': 3, 'mouth_multiplier': 2, 'penalized_mask_loss': True, 'mask_type': 'extended', 'mask_blur_kernel': 3, 'mask_threshold': 4, 'learn_mask': False, 'preview_images': 14, 'zoom_amount': 5, 'rotation_range': 10, 'shift_range': 5, 'flip_chance': 50, 'color_lightness': 30, 'color_ab': 8, 'color_clahe_chance': 50, 'color_clahe_max_size': 4})
01/28/2023 04:18:15 MainProcess     _training                      generator       _get_output_sizes              DEBUG    side: a, model output shapes: [(None, 128, 128, 3), (None, 128, 128, 3)], output sizes: [128]
01/28/2023 04:18:15 MainProcess     _training                      cache           __init__                       DEBUG    Initializing: RingBuffer (batch_size: 14, image_shape: (128, 128, 6), buffer_size: 2, dtype: uint8
01/28/2023 04:18:15 MainProcess     _training                      cache           __init__                       DEBUG    Initialized: RingBuffer
01/28/2023 04:18:15 MainProcess     _training                      generator       __init__                       DEBUG    Initialized PreviewDataGenerator
01/28/2023 04:18:15 MainProcess     _training                      generator       minibatch_ab                   DEBUG    do_shuffle: False
01/28/2023 04:18:15 MainProcess     _training                      multithreading  __init__                       DEBUG    Initializing BackgroundGenerator: (target: '_run_3', thread_count: 1)
01/28/2023 04:18:15 MainProcess     _training                      multithreading  __init__                       DEBUG    Initialized BackgroundGenerator: '_run_3'
01/28/2023 04:18:15 MainProcess     _training                      multithreading  start                          DEBUG    Starting thread(s): '_run_3'
01/28/2023 04:18:15 MainProcess     _training                      multithreading  start                          DEBUG    Starting thread 1 of 1: '_run_3'
01/28/2023 04:18:15 MainProcess     _run_3                         generator       _minibatch                     DEBUG    Loading minibatch generator: (image_count: 498, do_shuffle: False)
01/28/2023 04:18:15 MainProcess     _training                      multithreading  start                          DEBUG    Started all threads '_run_3': 1
01/28/2023 04:18:15 MainProcess     _training                      _base           set_timelapse_feed             DEBUG    Setting preview feed: (side: 'b', images: 167)
01/28/2023 04:18:15 MainProcess     _training                      _base           _load_generator                DEBUG    Loading generator, side: b, is_display: True,  batch_size: 14
01/28/2023 04:18:15 MainProcess     _training                      generator       __init__                       DEBUG    Initializing PreviewDataGenerator: (model: villain, side: b, images: 167 , batch_size: 14, config: {'centering': 'face', 'coverage': 87.5, 'icnr_init': False, 'conv_aware_init': False, 'optimizer': 'adam', 'learning_rate': 5e-05, 'epsilon_exponent': -7, 'autoclip': False, 'reflect_padding': False, 'allow_growth': False, 'mixed_precision': False, 'nan_protection': True, 'convert_batchsize': 16, 'loss_function': 'ssim', 'loss_function_2': 'mse', 'loss_weight_2': 100, 'loss_function_3': None, 'loss_weight_3': 0, 'loss_function_4': None, 'loss_weight_4': 0, 'mask_loss_function': 'mse', 'eye_multiplier': 3, 'mouth_multiplier': 2, 'penalized_mask_loss': True, 'mask_type': 'extended', 'mask_blur_kernel': 3, 'mask_threshold': 4, 'learn_mask': False, 'preview_images': 14, 'zoom_amount': 5, 'rotation_range': 10, 'shift_range': 5, 'flip_chance': 50, 'color_lightness': 30, 'color_ab': 8, 'color_clahe_chance': 50, 'color_clahe_max_size': 4})
01/28/2023 04:18:15 MainProcess     _training                      generator       _get_output_sizes              DEBUG    side: b, model output shapes: [(None, 128, 128, 3), (None, 128, 128, 3)], output sizes: [128]
01/28/2023 04:18:15 MainProcess     _training                      cache           __init__                       DEBUG    Initializing: RingBuffer (batch_size: 14, image_shape: (128, 128, 6), buffer_size: 2, dtype: uint8
01/28/2023 04:18:15 MainProcess     _training                      cache           __init__                       DEBUG    Initialized: RingBuffer
01/28/2023 04:18:15 MainProcess     _training                      generator       __init__                       DEBUG    Initialized PreviewDataGenerator
01/28/2023 04:18:15 MainProcess     _training                      generator       minibatch_ab                   DEBUG    do_shuffle: False
01/28/2023 04:18:15 MainProcess     _training                      multithreading  __init__                       DEBUG    Initializing BackgroundGenerator: (target: '_run_4', thread_count: 1)
01/28/2023 04:18:15 MainProcess     _training                      multithreading  __init__                       DEBUG    Initialized BackgroundGenerator: '_run_4'
01/28/2023 04:18:15 MainProcess     _training                      multithreading  start                          DEBUG    Starting thread(s): '_run_4'
01/28/2023 04:18:15 MainProcess     _training                      multithreading  start                          DEBUG    Starting thread 1 of 1: '_run_4'
01/28/2023 04:18:15 MainProcess     _run_4                         generator       _minibatch                     DEBUG    Loading minibatch generator: (image_count: 167, do_shuffle: False)
01/28/2023 04:18:15 MainProcess     _training                      multithreading  start                          DEBUG    Started all threads '_run_4': 1
01/28/2023 04:18:15 MainProcess     _training                      _base           set_timelapse_feed             DEBUG    Set time-lapse feed: {'a': <generator object BackgroundGenerator.iterator at 0x000001CC7156C2E0>, 'b': <generator object BackgroundGenerator.iterator at 0x000001CC7156C740>}
01/28/2023 04:18:15 MainProcess     _training                      _base           _setup                         DEBUG    Set up time-lapse
01/28/2023 04:18:15 MainProcess     _training                      _base           output_timelapse               DEBUG    Getting time-lapse samples
01/28/2023 04:18:15 MainProcess     _training                      _base           generate_preview               DEBUG    Generating preview (is_timelapse: True)
01/28/2023 04:18:15 MainProcess     _run_3                         multithreading  run                            DEBUG    Error in thread (_run_3): 'NoneType' object is not subscriptable
01/28/2023 04:18:15 MainProcess     _run_4                         multithreading  run                            DEBUG    Error in thread (_run_4): 'NoneType' object is not subscriptable
01/28/2023 04:18:15 MainProcess     _training                      multithreading  check_and_raise_error          DEBUG    Thread error caught: [(<class 'TypeError'>, TypeError("'NoneType' object is not subscriptable"), <traceback object at 0x000001CAD465C2C0>)]
01/28/2023 04:18:15 MainProcess     _training                      multithreading  run                            DEBUG    Error in thread (_training): 'NoneType' object is not subscriptable
01/28/2023 04:18:16 MainProcess     MainThread                     train           _monitor                       DEBUG    Thread error detected
01/28/2023 04:18:16 MainProcess     MainThread                     train           _monitor                       DEBUG    Closed Monitor
01/28/2023 04:18:16 MainProcess     MainThread                     train           _end_thread                    DEBUG    Ending Training thread
01/28/2023 04:18:16 MainProcess     MainThread                     train           _end_thread                    CRITICAL Error caught! Exiting...
01/28/2023 04:18:16 MainProcess     MainThread                     multithreading  join                           DEBUG    Joining Threads: '_training'
01/28/2023 04:18:16 MainProcess     MainThread                     multithreading  join                           DEBUG    Joining Thread: '_training'
01/28/2023 04:18:16 MainProcess     MainThread                     multithreading  join                           ERROR    Caught exception in thread: '_training'
Traceback (most recent call last):
  File "C:\Users\redmond\faceswap\lib\cli\launcher.py", line 230, in execute_script
    process.process()
  File "C:\Users\redmond\faceswap\scripts\train.py", line 213, in process
    self._end_thread(thread, err)
  File "C:\Users\redmond\faceswap\scripts\train.py", line 253, in _end_thread
    thread.join()
  File "C:\Users\redmond\faceswap\lib\multithreading.py", line 217, in join
    raise thread.err[1].with_traceback(thread.err[2])
  File "C:\Users\redmond\faceswap\lib\multithreading.py", line 96, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\redmond\faceswap\scripts\train.py", line 275, in _training
    raise err
  File "C:\Users\redmond\faceswap\scripts\train.py", line 265, in _training
    self._run_training_cycle(model, trainer)
  File "C:\Users\redmond\faceswap\scripts\train.py", line 353, in _run_training_cycle
    trainer.train_one_step(viewer, timelapse)
  File "C:\Users\redmond\faceswap\plugins\train\trainer\_base.py", line 246, in train_one_step
    self._update_viewers(viewer, timelapse_kwargs)
  File "C:\Users\redmond\faceswap\plugins\train\trainer\_base.py", line 354, in _update_viewers
    self._timelapse.output_timelapse(timelapse_kwargs)
  File "C:\Users\redmond\faceswap\plugins\train\trainer\_base.py", line 1070, in output_timelapse
    self._samples.images = self._feeder.generate_preview(is_timelapse=True)
  File "C:\Users\redmond\faceswap\plugins\train\trainer\_base.py", line 510, in generate_preview
    side_feed, side_samples = next(iterator[side])
  File "C:\Users\redmond\faceswap\lib\multithreading.py", line 287, in iterator
    self.check_and_raise_error()
  File "C:\Users\redmond\faceswap\lib\multithreading.py", line 169, in check_and_raise_error
    raise error[1].with_traceback(error[2])
  File "C:\Users\redmond\faceswap\lib\multithreading.py", line 96, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\redmond\faceswap\lib\multithreading.py", line 270, in _run
    for item in self.generator(*self._gen_args, **self._gen_kwargs):
  File "C:\Users\redmond\faceswap\lib\training\generator.py", line 221, in _minibatch
    retval = self._process_batch(img_paths)
  File "C:\Users\redmond\faceswap\lib\training\generator.py", line 334, in _process_batch
    raw_faces, detected_faces = self._get_images_with_meta(filenames)
  File "C:\Users\redmond\faceswap\lib\training\generator.py", line 245, in _get_images_with_meta
    raw_faces = self._face_cache.cache_metadata(filenames)
  File "C:\Users\redmond\faceswap\lib\training\cache.py", line 252, in cache_metadata
    self._validate_version(meta, filename)
  File "C:\Users\redmond\faceswap\lib\training\cache.py", line 312, in _validate_version
    alignment_version = png_meta["source"]["alignments_version"]
TypeError: 'NoneType' object is not subscriptable

============ System Information ============
backend:             nvidia
encoding:            cp936
git_branch:          master
git_commits:         a1ef5ed tests:   - unit test: tools.alignments.media   - Add mypy test   - Typing fixes
gpu_cuda:            11.3
gpu_cudnn:           No global version found. Check Conda packages for Conda cuDNN
gpu_devices:         GPU_0: NVIDIA GeForce GTX 1070
gpu_devices_active:  GPU_0
gpu_driver:          527.56
gpu_vram:            GPU_0: 8192MB (371MB free)
os_machine:          AMD64
os_platform:         Windows-10-10.0.19044-SP0
os_release:          10
py_command:          C:\Users\redmond\faceswap\faceswap.py train -A D:/NetdiskDownload/ds/A1 -B D:/NetdiskDownload/ds/B2 -m D:/NetdiskDownload/ds/C -t villain -bs 1 -it 1000000 -D default -s 250 -ss 25000 -tia D:/NetdiskDownload/ds/A1 -tib D:/NetdiskDownload/ds/B2 -to D:/NetdiskDownload/ds/D -L INFO -gui
py_conda_version:    conda 23.1.0
py_implementation:   CPython
py_version:          3.9.16
py_virtual_env:      True
sys_cores:           16
sys_processor:       AMD64 Family 23 Model 1 Stepping 1, AuthenticAMD
sys_ram:             Total: 16316MB, Available: 5597MB, Used: 10718MB, Free: 5597MB

torzdf commented 1 year ago

You are not training on faceswap extracted faces. Use faceswap extracted faces.

Guides here: https://forum.faceswap.dev/app.php/tag/Guide
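
The traceback ends at `png_meta["source"]["alignments_version"]` with `png_meta` being `None`, which suggests the training images carry no embedded metadata. A minimal sketch for checking a folder before training follows. It assumes the metadata lives in a PNG `iTXt` chunk whose keyword is `faceswap`; that keyword is an assumption not confirmed in this thread, and the helper names (`iter_png_chunks`, `has_itxt_keyword`, `faces_missing_metadata`) are hypothetical and not part of faceswap.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_png_chunks(data: bytes):
    """Yield (chunk_type, chunk_body) pairs from raw PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        yield chunk_type, data[pos + 8:pos + 8 + length]
        pos += 12 + length
        if chunk_type == b"IEND":
            break


def has_itxt_keyword(data: bytes, keyword: bytes = b"faceswap") -> bool:
    """Return True if an iTXt chunk with the given keyword is present.

    The default keyword is an assumption about how the metadata is stored.
    """
    for chunk_type, body in iter_png_chunks(data):
        # An iTXt body starts with the keyword followed by a NUL byte.
        if chunk_type == b"iTXt" and body.split(b"\0", 1)[0] == keyword:
            return True
    return False


def faces_missing_metadata(folder: str, keyword: bytes = b"faceswap") -> list:
    """List the PNG file names in `folder` that lack the metadata chunk."""
    return sorted(
        path.name
        for path in Path(folder).glob("*.png")
        if not has_itxt_keyword(path.read_bytes(), keyword)
    )
```

Calling `faces_missing_metadata("D:/NetdiskDownload/ds/A1")` would list the files that lack such a chunk. If every file is listed, the folder most likely needs to be re-extracted with faceswap, as the reply above says. The sketch does not verify chunk CRCs or look inside the metadata.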