Actual behavior

python faceswap.py train -A data/Maciek_broda -B data/Harrison_Ford -m models/ -p --trainer villain --batch-size 8 --save-interval 20
03/26/2019 11:21:22 INFO Log level set to: INFO
Using TensorFlow backend.
03/26/2019 11:21:23 INFO Model A Directory: /home/szwank/Desktop/faceswap/data/Maciek_broda
03/26/2019 11:21:23 INFO Model B Directory: /home/szwank/Desktop/faceswap/data/Harrison_Ford
03/26/2019 11:21:23 INFO Training data directory: /home/szwank/Desktop/faceswap/models
03/26/2019 11:21:23 INFO =====================================================================
03/26/2019 11:21:23 INFO - Using live preview -
03/26/2019 11:21:23 INFO - Press 'ENTER' on the preview window to save and quit -
03/26/2019 11:21:23 INFO - Press 'S' on the preview window to save model weights immediately -
03/26/2019 11:21:23 INFO =====================================================================
03/26/2019 11:21:25 INFO Loading data, this may take a while...
03/26/2019 11:21:25 INFO Loading Model from Villain plugin...
03/26/2019 11:23:24 INFO Loading config: '/home/szwank/Desktop/faceswap/config/train.ini'
03/26/2019 11:23:25 WARNING No existing state file found. Generating.
03/26/2019 11:23:52 INFO Creating new 'villain' model in folder: '/home/szwank/Desktop/faceswap/models'
03/26/2019 11:24:18 INFO Loading Trainer from Original plugin...
03/26/2019 11:24:32 INFO Enabled TensorBoard Logging
03/26/2019 11:33:32 CRITICAL Error caught! Exiting...
03/26/2019 11:33:32 ERROR Caught exception in thread: 'training_0'
03/26/2019 11:33:35 ERROR Got Exception on main handler:
Traceback (most recent call last):
  File "/home/szwank/Desktop/faceswap/lib/cli.py", line 107, in execute_script
    process.process()
  File "/home/szwank/Desktop/faceswap/scripts/train.py", line 101, in process
    self.end_thread(thread, err)
  File "/home/szwank/Desktop/faceswap/scripts/train.py", line 126, in end_thread
    thread.join()
  File "/home/szwank/Desktop/faceswap/lib/multithreading.py", line 443, in join
    raise thread.err[1].with_traceback(thread.err[2])
  File "/home/szwank/Desktop/faceswap/lib/multithreading.py", line 381, in run
    self._target(*self._args, **self._kwargs)
  File "/home/szwank/Desktop/faceswap/scripts/train.py", line 152, in training
    raise err
  File "/home/szwank/Desktop/faceswap/scripts/train.py", line 142, in training
    self.run_training_cycle(model, trainer)
  File "/home/szwank/Desktop/faceswap/scripts/train.py", line 214, in run_training_cycle
    trainer.train_one_step(viewer, timelapse)
  File "/home/szwank/Desktop/faceswap/plugins/train/trainer/_base.py", line 139, in train_one_step
    loss[side] = batcher.train_one_batch(do_preview)
  File "/home/szwank/Desktop/faceswap/plugins/train/trainer/_base.py", line 214, in train_one_batch
    loss = self.model.predictors[self.side].train_on_batch(batch)
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/site-packages/keras/engine/training.py", line 1217, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[3,3,128,128] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
 [[{{node training_1/Adam/mul_311}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Adam/beta_1/read, training_1/Adam/Variable_62/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[{{node loss_1/mul/_1601}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_7580_loss_1/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

03/26/2019 11:33:35 CRITICAL An unexpected crash has occurred. Crash report written to /home/szwank/Desktop/faceswap/crash_report.2019.03.26.113332875078.log. Please verify you are running the latest version of faceswap before reporting
^CTraceback (most recent call last):
  File "faceswap.py", line 36, in <module>
    ARGUMENTS.func(ARGUMENTS)
  File "/home/szwank/Desktop/faceswap/lib/cli.py", line 120, in execute_script
    safe_shutdown()
  File "/home/szwank/Desktop/faceswap/lib/utils.py", line 209, in safe_shutdown
    terminate_processes()
  File "/home/szwank/Desktop/faceswap/lib/multithreading.py", line 488, in terminate_processes
    process.join()
  File "/home/szwank/Desktop/faceswap/lib/multithreading.py", line 221, in join
    if self._result_tokens.get() is None:
  File "<string>", line 2, in get
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/multiprocessing/managers.py", line 757, in _callmethod
    kind, result = conn.recv()
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/logging/handlers.py", line 1476, in _monitor
    record = self.dequeue(True)
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/logging/handlers.py", line 1425, in dequeue
    return self.queue.get(block)
  File "<string>", line 2, in get
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/multiprocessing/managers.py", line 757, in _callmethod
    kind, result = conn.recv()
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError

Exception ignored in: <generator object TrainingDataGenerator.minibatch at 0x7f7810b276d0>
Traceback (most recent call last):
  File "/home/szwank/Desktop/faceswap/lib/training_data.py", line 135, in minibatch
  File "/home/szwank/Desktop/faceswap/lib/multithreading.py", line 43, in __exit__
  File "/home/szwank/Desktop/faceswap/lib/multithreading.py", line 35, in free
  File "/home/szwank/Desktop/faceswap/lib/multithreading.py", line 173, in free
  File "<string>", line 2, in put
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/multiprocessing/managers.py", line 753, in _callmethod
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/multiprocessing/managers.py", line 740, in _connect
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/multiprocessing/connection.py", line 487, in Client
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
FileNotFoundError: [Errno 2] No such file or directory
Exception ignored in: <bound method BaseSession.__del__ of <tensorflow.python.client.session.Session object at 0x7f7810b394e0>>
Traceback (most recent call last):
  File "/home/szwank/.conda/envs/deepfake/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 738, in __del__
TypeError: 'NoneType' object is not callable

Other relevant information

Operating system = Ubuntu 18.04.02
Graphics card = GTX 1060 6GB
RAM = 8GB

------PIP packages------
Package Version

absl-py 0.7.0
astor 0.7.1
certifi 2019.3.9
Click 7.0
cloudpickle 0.8.0
cycler 0.10.0
cytoolz 0.9.0.1
dask 1.1.4
decorator 4.3.2
dlib 19.17.0
face-recognition 1.2.3
face-recognition-models 0.3.0
ffmpy 0.2.2
gast 0.2.2
google-images-download 2.5.0
grpcio 1.16.1
h5py 2.9.0
imageio 2.5.0
Keras 2.2.4
Keras-Applications 1.0.7
Keras-Preprocessing 1.0.9
kiwisolver 1.0.1
Markdown 3.0.1
matplotlib 2.2.2
mkl-fft 1.0.10
mkl-random 1.0.2
mock 2.0.0
networkx 2.2
numpy 1.15.4
nvidia-ml-py3 7.352.0
olefile 0.46
opencv-python 4.0.0.21
pathlib 1.0.1
pbr 5.1.3
Pillow 5.4.1
pip 19.0.3
protobuf 3.6.1
psutil 5.6.1
pyparsing 2.3.1
python-dateutil 2.8.0
pytz 2018.9
PyWavelets 1.0.2
PyYAML 3.13
scikit-image 0.14.2
scikit-learn 0.20.3
scipy 1.2.1
selenium 3.141.0
setuptools 40.8.0
six 1.12.0
tensorboard 1.12.2
tensorflow-estimator 1.13.0
tensorflow-gpu 1.12.0
termcolor 1.1.0
toolz 0.9.0
tornado 6.0.1
tqdm 4.31.1
urllib3 1.24.1
Werkzeug 0.14.1
wheel 0.33.1
-------------Conda packages-----------

# packages in environment at /home/szwank/.conda/envs/deepfake:
#
# Name Version Build Channel

Other information

I have checked the photo sizes; they are all equal. A similar issue happens when I try a different model.

torzdf commented 5 years ago

OOM = Out of memory. Reduce the batch size or use a different model.
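
To put the error in perspective, the tensor named in the traceback (shape[3,3,128,128], type float) is small. Here is a minimal sketch of the arithmetic, assuming "float" means 4-byte float32; the helper name tensor_bytes is made up for this example and is not part of faceswap or TensorFlow.

```python
from functools import reduce
from operator import mul


def tensor_bytes(shape, bytes_per_element=4):
    """Return the memory needed for a dense tensor of the given shape."""
    # Number of elements is the product of all dimensions.
    return reduce(mul, shape, 1) * bytes_per_element


# Shape taken from the OOM message; 4 bytes per element is an assumption (float32).
failed = tensor_bytes([3, 3, 128, 128])
print(f"{failed} bytes = {failed / 1024:.0f} KiB")  # 589824 bytes = 576 KiB
```

The allocation that failed was well under 1 MiB, which suggests the GPU memory was already almost entirely taken by the model and earlier allocations by the time it was requested.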

Kirin-kun commented 5 years ago

The villain model won't work with a GTX 1060 6GB. There's not enough memory for it, even with a batch size of two. Trust me, I tried.

With the "memory saving gradients" option, it will train. But I guess it will take a lot longer.
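
If you want to check which batch size fits on a given card, one approach is to halve it until a training step succeeds. The sketch below is not faceswap code: train_step is a hypothetical callable standing in for one training iteration, and MemoryError stands in for whatever out-of-memory exception the backend raises (TensorFlow raises ResourceExhaustedError, as in the traceback above).

```python
def find_fitting_batch_size(train_step, start=8, minimum=1):
    """Halve the batch size until train_step succeeds; return it, or None."""
    batch_size = start
    while batch_size >= minimum:
        try:
            train_step(batch_size)
            return batch_size
        except MemoryError:
            # Assumption: the step signals out-of-memory with MemoryError.
            batch_size //= 2
    return None


def fake_step(batch_size, limit=3):
    # Hypothetical stand-in: pretend anything above `limit` runs out of memory.
    if batch_size > limit:
        raise MemoryError(f"batch of {batch_size} does not fit")


print(find_fitting_batch_size(fake_step))  # tries 8, 4, then 2
```

If the function returns None, no batch size down to the minimum fits, and the remaining options are a lighter model or the memory saving gradients option mentioned above.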

deepfakes / faceswap-playground

Unable to train(critical error) #270
