gilikn / FakeOut

Leveraging Out-of-domain Self-supervision for Multi-modal Video Deepfake Detection
https://gilikn.github.io/
Apache License 2.0

Tensorflow/Tensorflow Dataset #3

Open Wertiz opened 1 year ago

Wertiz commented 1 year ago

[screenshot of the error]

I receive this error when I try to run inference. The directory mentioned in the error is actually empty, so I have no idea what it's complaining about. I suspect something is off with the data preparation, but I did what was written in the README. Can you point me in the right direction?

asadkhanek commented 1 year ago

Hi, I need your help. Which Python version and operating system are you using, and how do you install jaxlib==0.1.68+cuda101 from requirements.txt? I've been trying for two days.

Wertiz commented 1 year ago

Yeah, the requirements file is not built properly. What I did was install the requirements as they popped up. I think you have to use Python 3.7, and then I installed everything that came up as missing. For jaxlib you have to follow the instructions on the package's page to install it with CUDA support.
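For reference, the old CUDA builds of jaxlib were published on Google's wheel index, so something along these lines should work (assuming CUDA 10.1; adjust the tag to your CUDA version):

```
conda create -n fakeout python=3.7
conda activate fakeout
# the wheel tag must match your installed CUDA (cuda101 = CUDA 10.1)
pip install jaxlib==0.1.68+cuda101 -f https://storage.googleapis.com/jax-releases/jax_releases.html
pip install -r requirements.txt
```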

asadkhanek commented 1 year ago

Thanks for the reply. If I can run it successfully, I can also help you if I'm able to.

gilikn commented 1 year ago

Hey guys, thanks for bringing up this discussion. I used Python 3.6 and CUDA 10.1 on my Ubuntu server. I believe using Python 3.7 and following the jaxlib documentation in their GitHub repo should work too, as @Wertiz mentioned.

@Wertiz Can you please provide me with the following:

  1. The exact commands, with parameters, that you used for the data_preparation.py and inference.py executions?
  2. The tree of directories found under fakeout/data/FaceForensics in your current situation?

That will help me reproduce the problem on my end.
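If helpful, the directory tree can be dumped with standard commands (paths assume the default layout):

```
find fakeout/data/FaceForensics -maxdepth 3 -print
# or, if tree is installed:
tree -L 3 fakeout/data/FaceForensics
```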

Wertiz commented 1 year ago

I started from scratch and launched the data preparation pipeline. I get this error:

```
(fakeout) user@server:/home/alessandro.pianese/FakeOut$ python fakeout/data/data_preparation.py --dataset_name face_forensics --split test --videos_path faceforensics/test/crop/
fakeout/data/FaceForensics
Traceback (most recent call last):
  File "fakeout/data/data_preparation.py", line 100, in <module>
    app.run(main)
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "fakeout/data/data_preparation.py", line 94, in main
    generate_train_test_list_from_extracted(dataset_directory_path, dataset_directory_name, split)
  File "fakeout/data/data_preparation.py", line 49, in generate_train_test_list_from_extracted
    f"{dataset_directory_path}/tmp/downloads/extracted/ZIP.{dataset_directory_name}_{split}.zip"):
FileNotFoundError: [Errno 2] No such file or directory: 'fakeout/data/FaceForensics/tmp/downloads/extracted/ZIP.FaceForensics_test.zip'
```

Where should I obtain this file?

asadkhanek commented 1 year ago

I'm downloading the data now; once it completes I can send you a link. Otherwise you have to download the FaceForensics dataset yourself: they give you a Google Form to fill out, then they send you a download script; follow it to download the data.

Wertiz commented 1 year ago

@gilikn Can you confirm or deny @asadkhanek's comment, or shed some light on the issue?

gilikn commented 1 year ago

Hey, as @asadkhanek mentioned, you have to download the videos of the desired dataset you want to evaluate on before running FacePipe and FakeOut. Datasets can be obtained from their original sources; e.g., for DFDC you should go to https://ai.facebook.com/datasets/dfdc/ and follow the steps described there for downloading the dataset.

Also, for proper evaluation, please use the relevant set (e.g. DFDC test set) as described in the README.

Wertiz commented 1 year ago

As written in the README file:

> Arrange the videos of the desired dataset in train, val, test dedicated directories. Run our face tracking pipeline as documented in the [FacePipe](https://github.com/gilikn/FacePipe) repo.

I arranged the test files in a dedicated directory (sketched below) and ran the tracking pipeline. I already have the downloaded dataset. Do I also have to give the script a zip file of the dataset I have already extracted?
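For concreteness, this is the layout I understand the README to imply (names illustrative; crop/ is what FacePipe produced for me):

```
faceforensics/
├── train/
├── val/
└── test/
    └── crop/   # face-tracked mp4 crops produced by FacePipe
```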

gilikn commented 1 year ago

No zip file is needed; ZIP.FaceForensics_test.zip should be created during the execution of data_preparation.py. I tested the flow once again on my machine and it does work, so let me try to investigate the issue with you.

  1. The --videos_path parameter expects the path to the face-tracked mp4 videos obtained by FacePipe. You used faceforensics/test/crop/ as I see in your comment above; make sure it has the mp4 videos in it.

  2. After the failed execution of data_preparation.py, does a directory named tmp appear under fakeout/data/FaceForensics? If so, which files and directories do you see under fakeout/data/FaceForensics/tmp/downloads/extracted? (The snippet after this list shows quick ways to check.)

  3. Which machine do you use for the execution?
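A couple of standard shell commands should answer points 1 and 2 (paths taken from your command above):

```
ls faceforensics/test/crop/*.mp4 | head            # point 1: are the mp4 crops there?
find fakeout/data/FaceForensics/tmp -maxdepth 3    # point 2: what did the run leave behind?
```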

Wertiz commented 1 year ago

So, in order:

  1. My files are in {base_folder}/face_forensics/test. After I run the tracking pipeline, it creates a folder called crops, with the crops in mp4 format inside. Is it correct to assume I have to give this folder to the data preparation, or should I place my files in the {base_path}/fakeout/data/FaceForensics folder?
  2. Yes, it does appear. It creates three nested folders with the hierarchy tmp/downloads/extracted/, and inside the extracted folder there are two files, ZIP.FaceForensics_test.zip and ZIP.train_test_split.zip. The ZIP.FaceForensics_test.zip file is a relative link to faceforensics/test, but the path is wrong.
  3. It's an Ubuntu 20.04 server.

I'm starting to think I have to use the {base_path}/fakeout/data/FaceForensics folder to store my dataset, even though that was not specified in the README file.

gilikn commented 1 year ago

I think I figured it out: the ln command used as part of data_preparation.py behaves differently when a relative path is passed to the --videos_path parameter. Can you please try providing an absolute path to the videos and let me know if that solves the issue? If so, I recommend this practice for now, and I will provide a more generic solution in an upcoming release.
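For anyone hitting this later, here is a minimal sketch (my own illustration, not the repo's exact code) of why a relative --videos_path produces a dangling link:

```python
import os

videos_path = "faceforensics/test/crop"  # relative path, as passed above
link = "fakeout/data/FaceForensics/tmp/downloads/extracted/ZIP.FaceForensics_test.zip"

os.makedirs(os.path.dirname(link), exist_ok=True)
os.symlink(videos_path, link)
# The target string is stored verbatim and resolved relative to the *link's*
# directory, i.e. .../extracted/faceforensics/test/crop, which does not exist.
print(os.path.exists(link))   # False: dangling symlink

os.remove(link)
os.symlink(os.path.abspath(videos_path), link)  # absolute target resolves correctly
print(os.path.exists(link))   # True, provided faceforensics/test/crop exists
```

Presumably, wrapping the flag value with os.path.abspath in data_preparation.py would make both forms work.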

Wertiz commented 1 year ago

Data preparation worked. I then moved on to launching inference and I got this error.

```
(fakeout) user@server:/home/FakeOut$ python fakeout/inference.py --checkpoint_path fakeout/checkpoint_fakeout_video_audio_tsm_resnet_x2.pkl --dataset_name face_forensics --use_audio False --mlp_first_layer_size 4096 --num_test_windows 10
2023-02-25 00:15:24.072910: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
W0225 00:15:57.767733 139896311566976 dataset_builder.py:873] Using custom data configuration face_forensics
2023-02-25 00:15:57.780966: W tensorflow/core/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "Not found: Could not locate the credentials file.". Retrieving token from GCE failed with "Failed precondition: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Couldn't resolve host 'metadata'".
I0225 00:15:58.207245 139896311566976 dataset_builder.py:400] Generating dataset deepfake (fakeout/data/FaceForensics/tmp/deepfake/face_forensics/2.0.0)
Downloading and preparing dataset Unknown size (download: Unknown size, generated: Unknown size, total: Unknown size) to fakeout/data/FaceForensics/tmp/deepfake/face_forensics/2.0.0...
Generating splits...:   0%|          | 0/1 [00:00<?, ? splits/s]
Traceback (most recent call last):
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/tensorflow_datasets/core/utils/py_utils.py", line 304, in incomplete_dir
    yield tmp_dir
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/tensorflow_datasets/core/dataset_builder.py", line 441, in download_and_prepare
    download_config=download_config,
  File "/home/alessandro.pianese/.conda/envs/fakeout/lib/python3.7/site-packages/tensorflow_datasets/core/dataset_builder.py", line 1165, in _download_and_prepare
    leave=False,
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/tensorflow_datasets/core/dataset_builder.py", line 1161, in <listcomp>
    ) for split_name, generator in utils.tqdm(
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/tensorflow_datasets/core/split_builder.py", line 291, in submit_split_generation
    return self._build_from_generator(**build_kwargs)
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/tensorflow_datasets/core/split_builder.py", line 356, in _build_from_generator
    leave=False,
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/home/FakeOut/fakeout/utils/deepfake_dataset.py", line 194, in _generate_examples
    label = int(VIDEO_LABELS_PD[VIDEO_LABELS_PD['filename'] == file_name + '.mp4']['label'].iloc[0])
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/pandas/core/indexing.py", line 931, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/pandas/core/indexing.py", line 1566, in _getitem_axis
    self._validate_integer(key, axis)
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/pandas/core/indexing.py", line 1500, in _validate_integer
    raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "fakeout/inference.py", line 192, in <module>
    app.run(main)
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "fakeout/inference.py", line 137, in main
    train=False)
  File "/home/FakeOut/fakeout/data/data_utils.py", line 67, in generate_dataset
    builder.download_and_prepare(download_config=dl_config)
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/tensorflow_datasets/core/dataset_builder.py", line 470, in download_and_prepare
    self.info.write_to_directory(self._data_dir)
  File "/home/.conda/envs/fakeout/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/tensorflow_datasets/core/utils/py_utils.py", line 308, in incomplete_dir
    tf.io.gfile.rmtree(tmp_dir)
  File "/home/.conda/envs/fakeout/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 599, in delete_recursively_v2
    _pywrap_file_io.DeleteRecursively(compat.as_bytes(path))
tensorflow.python.framework.errors_impl.FailedPreconditionError: fakeout/data/FaceForensics/tmp/deepfake/face_forensics/2.0.0.incomplete8B6IEG; Directory not empty
```
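For what it's worth, the first IndexError comes from deepfake_dataset.py looking up a filename in the labels table; if the filter matches nothing, .iloc[0] fails. A minimal reproduction of just the pandas behavior (my own sketch, illustrative names):

```python
import pandas as pd

# toy labels table standing in for VIDEO_LABELS_PD
labels = pd.DataFrame({"filename": ["000.mp4", "001.mp4"], "label": [0, 1]})

# filtering for a filename that is absent yields an empty selection...
selection = labels[labels["filename"] == "999.mp4"]["label"]

# ...and .iloc[0] on an empty selection raises the same error as above
selection.iloc[0]  # IndexError: single positional indexer is out-of-bounds
```

So it looks like one of my crop filenames has no matching row in the labels file.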

Any idea?