yoctta / multiple-attention

Code for multi-attentional deepfake detection

Dataset used for training #7

Open SamHSlva opened 3 years ago

SamHSlva commented 3 years ago

Hi, thank you for open-sourcing your code. I've read your paper, and I really appreciate the attention-based data augmentation implemented for deepfake detection. I do have one question, though: in the paper you mention that FF++ HQ was used for training, and in the FF++ download the c23 folders correspond to the HQ compression mode. While editing the json file to point to my own storage, I noticed that all your references are to the c0 folders, which correspond to the raw files. For training, did you use the c0 or the c23 folders?

yoctta commented 3 years ago

Hello, we run the face detector on the c0 videos to get the face landmarks, then reuse the same landmarks to preprocess the c23 and c40 videos for convenience.
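A minimal sketch of what reusing the c0 landmarks can look like, assuming FF++ videos are frame-aligned across c0/c23/c40; this is illustrative code, not the repo's actual implementation, and all names and paths are placeholders:

```python
# Illustrative sketch, not the repo's code: landmarks are detected once on
# the c0 video and reused to crop the frame-aligned c23/c40 videos.
import cv2

def crop_face(frame, landmarks, margin=0.2):
    """Crop a face region around the landmark bounding box, with a margin."""
    xs = [int(x) for x, _ in landmarks]
    ys = [int(y) for _, y in landmarks]
    m = int(margin * max(max(xs) - min(xs), max(ys) - min(ys)))
    h, w = frame.shape[:2]
    return frame[max(0, min(ys) - m):min(h, max(ys) + m),
                 max(0, min(xs) - m):min(w, max(xs) + m)]

def crop_video(video_path, landmarks_per_frame, out_dir):
    """Apply per-frame landmarks (computed on c0) to any compression level."""
    cap = cv2.VideoCapture(video_path)
    for i, landmarks in enumerate(landmarks_per_frame):
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(f"{out_dir}/{i:04d}.png", crop_face(frame, landmarks))
    cap.release()

# landmarks_per_frame is computed once on the c0 video, then reused, e.g.:
# crop_video(".../c23/videos/000_003.mp4", landmarks_per_frame, ".../c23/faces/000_003")
```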


SamHSlva commented 3 years ago

Thank you for the reply. I've downloaded the files for pre-processing from the G-Drive link in your description. I believe I must be making a wrong step while following the instructions in the readme. (I've pasted the whole content of the preprocessing folder into the folder with your multi-attentional code.)

  1. check datasets.json, replace the paths with your dirs.
  • I've run my own script to rename the file paths in your json file (a sketch of such a script follows this list), and they now look like this:
  • video_path: "~/Projects/Datasets/Celeb/ff++_train/NeuralTextures/c23/videos/440_364.mp4"
  • imgs_path: "~/Projects/Datasets/Celeb/ff++_train/faces/ffpp-faces/NeuralTextures/c23/larger_images/440_364"
  2. run python master_crop_ff_pairs.py.
  • Originally I had to change line 25, renaming FF-origin-checked.pkl to FF-checked.pkl.
  • With no other change I run it, and I get a message saying it is running on 0.0.0.0:10087.
  • An HTTP access to the server gives me a series of bad characters, but no useful info. (I left it running in the background.)
  3. edit slave_crop.sh, then you can run slave_crop.sh distributed.
  • I've made no change to slave_crop.sh, as I have only one node to run the pre-processing.
  • When I ran it, my first issue was slave_crop.py -> face_utils.py -> `from models.retinaface import RetinaFace`; this model is not provided in your current repository. I grabbed it from your kaggle-dfdc repository.
  • The same for `from data.config import cfg_re50` and all other missing files.
  • When I can finally run it, I get a series of communication logs in the terminal running the server, but nothing happens in the client; it just ends.
  4. when preprocessing is finished, terminate master_crop_ff_pairs.py.

I was wondering if you could give me some hints on how to debug this issue or the possible causes of the problem.
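For reference, a sketch of the kind of path-rewriting script mentioned in step 1; the structure of datasets.json and the original prefix (OLD_PREFIX) are assumptions here, not the release's actual layout:

```python
# Hypothetical helper: rewrite every path in datasets.json that starts
# with the authors' prefix so it points at your own storage instead.
import json

OLD_PREFIX = "/original/authors/prefix"   # assumption: prefix used in the release
NEW_PREFIX = "/Volumes/My1TBDrive/FF++"   # your local storage root

def rewrite(node):
    """Recursively rewrite string values in a JSON tree."""
    if isinstance(node, dict):
        return {k: rewrite(v) for k, v in node.items()}
    if isinstance(node, list):
        return [rewrite(v) for v in node]
    if isinstance(node, str) and node.startswith(OLD_PREFIX):
        return NEW_PREFIX + node[len(OLD_PREFIX):]
    return node

with open("datasets.json") as f:
    data = json.load(f)
with open("datasets.json", "w") as f:
    json.dump(rewrite(data), f, indent=2)
```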

yoctta commented 3 years ago

• RetinaFace is not really needed, because I've released the pre-computed landmarks; you can edit the script to comment it out (see the sketch below).
• The server will dispatch jobs to one or multiple clients and show the progress. Finally, the face images will be stored in imgs_path.
• Regarding "An HTTP access to the server gives me a series of bad characters, but not any useful info. (I left it running in the background)": the error information may help me locate the problem.
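To make the first point concrete, a sketch of loading the released landmarks instead of running RetinaFace; the layout of FF-checked.pkl (a dict keyed by video id) is an assumption here:

```python
# Illustrative only: "comment out RetinaFace and use the released landmarks".
# The real layout of FF-checked.pkl is an assumption (dict keyed by video id).
import pickle

with open("FF-checked.pkl", "rb") as f:
    precomputed = pickle.load(f)

# Instead of detecting on every frame:
#   from models.retinaface import RetinaFace   # the import that is missing
#   detector = RetinaFace(cfg_re50)
#   landmarks = detector(frame)
# look the landmarks up from the pickle:
def get_landmarks(video_id):
    return precomputed[video_id]  # assumption: one entry per video
```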

SamHSlva commented 3 years ago

Hi,

I was able to fix the pre-processing issue I was having; it had to do with the location of my .json file. I was wondering, do you have a tutorial covering the steps I should follow to reproduce your results? Currently I've tried running your code, but I feel I am having to guess a series of steps, and it is quite hard.

If I want to run the training process on a single machine with 2 GPUs, what steps should I take?

Thank you.
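For anyone landing here with the same question: a generic single-machine, 2-GPU launch pattern with PyTorch DistributedDataParallel is sketched below; the repo's actual entry point and model construction may differ, so every name here is a placeholder.

```python
# Generic PyTorch DDP skeleton for one machine with 2 GPUs.
# Placeholder model and loop; not this repo's real entry point.
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP

def main(rank, world_size):
    dist.init_process_group("nccl", init_method="tcp://127.0.0.1:29500",
                            rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)
    model = torch.nn.Linear(10, 2).cuda(rank)   # placeholder for the real model
    model = DDP(model, device_ids=[rank])
    # ... build a DataLoader with DistributedSampler and run the usual train loop
    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(main, args=(2,), nprocs=2)
```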

wolfman-numba1 commented 2 years ago

• RetinaFace is not really needed, because I've released the pre-computed landmarks; you can edit the script to comment it out.
• The server will dispatch jobs to one or multiple clients and show the progress. Finally, the face images will be stored in imgs_path.

Hi, you mentioned that RetinaFace is not needed since you've already stored the pre-computed landmarks. I assume these are in the FF-checked.pkl file? I have also been attempting to use your code base in some of my own work, and the preprocessing has been a real issue. I've altered the paths in my datasets.json file to the following:

video_path: "/Volumes/My1TBDrive/FF++/NeuralTextures/c0/videos/440_364.mp4"
imgs_path: "/Volumes/My1TBDrive/FF++/Images/NeuralTextures/c0/larger_images/440_364"

I am finding myself quite confused by the current implementation of the pre-processing. As per the instructions, I altered the datasets.json file to use my own paths. Some of the questions I have include:

Thanks for any clarification you can provide. Looking forward to hearing from you.
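One way to check the assumption about FF-checked.pkl is to inspect the pickle directly; a minimal, standard-library-only sketch:

```python
# Quick inspection of FF-checked.pkl to see what it actually stores;
# makes no assumptions about the file's layout.
import pickle

with open("FF-checked.pkl", "rb") as f:
    data = pickle.load(f)

print(type(data))
if isinstance(data, dict):
    first_key = next(iter(data))
    print("example key:", first_key)
    print("example value type:", type(data[first_key]))
```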

wolfman-numba1 commented 2 years ago

Hi,

I was able to fix the pre-processing issue I was having; it had to do with the location of my .json file. I was wondering, do you have a tutorial covering the steps I should follow to reproduce your results? Currently I've tried running your code, but I feel I am having to guess a series of steps, and it is quite hard.

If I want to run the training process on a single machine with 2 GPUs, what steps should I take?

Thank you.

Hey, I am also attempting to run this code base. I know this is a long shot, but it is very confusing at the moment... and I thought that maybe I could get in contact with you for some guidance, since you've been working with it a bit longer than I have. Let me know what you think.

daybrak commented 5 months ago

Thank you for the reply. I've downloaded the files for pre-processing from the G-Drive link in your description. I believe I must be making a wrong step while following the instructions in the readme. (I've pasted the whole content of the preprocessing folder into the folder with your multi-attentional code.)

  1. check datasets.json, replace the paths with your dirs.
  • I've run my own script to rename the file paths in your json file, and they now look like this:
  • video_path: "~/Projects/Datasets/Celeb/ff++_train/NeuralTextures/c23/videos/440_364.mp4"
  • imgs_path: "~/Projects/Datasets/Celeb/ff++_train/faces/ffpp-faces/NeuralTextures/c23/larger_images/440_364"
  2. run python master_crop_ff_pairs.py.
  • Originally I had to change line 25, renaming FF-origin-checked.pkl to FF-checked.pkl.
  • With no other change I run it, and I get a message saying it is running on 0.0.0.0:10087.
  • An HTTP access to the server gives me a series of bad characters, but no useful info. (I left it running in the background.)
  3. edit slave_crop.sh, then you can run slave_crop.sh distributed.
  • I've made no change to slave_crop.sh, as I have only one node to run the pre-processing.
  • When I ran it, my first issue was slave_crop.py -> face_utils.py -> `from models.retinaface import RetinaFace`; this model is not provided in your current repository. I grabbed it from your kaggle-dfdc repository.
  • The same for `from data.config import cfg_re50` and all other missing files.
  • When I can finally run it, I get a series of communication logs in the terminal running the server, but nothing happens in the client; it just ends.

(4. when preprocessing is finished, terminate master_crop_ff_pairs.py)

I was wondering if you could give me some hints on how to debug this issue or the possible causes of the problem.

Hello, I am running this code and have encountered the same problem as you. Do you have a solution?