
polimi-ispl / icpr2020dfdc

Video Face Manipulation Detection Through Ensemble of CNNs

GNU General Public License v3.0

258 stars, 100 forks

dataset preprocessing #25

Closed zhuzhen1996 closed 3 years ago

zhuzhen1996 commented 3 years ago

Hello, I have recently been trying to reproduce your results. I ran into a problem while processing the dataset: my computer runs Windows, and running make_dataset.sh through Git reported an error. I then tried running index_celebdf.py on its own, which also failed with this error: pandas.errors.EmptyDataError: No columns to parse from file. Do I need to preprocess the Celeb-DF dataset before running this script? And how was the List_of_testing_videos.txt file generated?

zhuzhen1996 commented 3 years ago

Does the dataset need to be arranged in a particular format beforehand in order to generate the List_of_testing_videos.txt file?

nicobonne commented 3 years ago

Hi @zhuzhen1996, the scripts we provide under scripts are bash files, so they are not meant to run on Windows. You should be able to run the individual Python scripts, though.

You don't need to preprocess celeb_df v2 before running the indexing script; you just need to download it. If you could provide the full stack trace of the error, we could help you better.

zhuzhen1996 commented 3 years ago

Okay, thank you very much for your help. This is a screenshot of the dataset I downloaded earlier.

Then I followed your instructions and ran the index_celebdf.py code as shown in the figure:

I am a little confused: is the List_of_testing_videos.txt file generated automatically by this Python code?

zhuzhen1996 commented 3 years ago

Do I need to move the dataset folder into the project directory? Testing with the pre-trained model already runs smoothly for me.

zhuzhen1996 commented 3 years ago

Also, may I ask whether the entire program can run completely under Windows?

nicobonne commented 3 years ago

The file List_of_testing_videos.txt should be present in the Celeb DF v2 zip file, as stated in the official repository. If that file is missing, try re-downloading or re-extracting the zip. I also noticed that the YouTube-real folder is missing from your copy.
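A quick way to confirm that the extracted dataset is complete before running the indexing script is sketched below. This is only an illustration, not part of the repository: the function name check_dataset_root is hypothetical, and the list of expected entries contains just the two names mentioned in this thread. It also flags zero-byte files, since pandas raises EmptyDataError ("No columns to parse from file") when it is asked to read an empty file.

```python
from pathlib import Path

# Entries named in this thread; extend the tuple with anything else
# your copy of the dataset is supposed to contain.
EXPECTED_ENTRIES = ("List_of_testing_videos.txt", "YouTube-real")


def check_dataset_root(root, expected=EXPECTED_ENTRIES):
    """Return a list of problems found under `root` (empty list if none).

    Hypothetical helper for illustration; it is not part of the repository.
    """
    root = Path(root)
    problems = []
    for name in expected:
        entry = root / name
        if not entry.exists():
            problems.append(f"missing: {entry}")
        elif entry.is_file() and entry.stat().st_size == 0:
            # A zero-byte list file would make a CSV/text parser fail.
            problems.append(f"empty file: {entry}")
    return problems
```

Calling check_dataset_root with the folder where the zip was extracted returns an empty list when both entries are present and the list file is non-empty; anything else in the list points at what to re-download or re-extract.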

Also, may I ask whether the entire program can run completely under Windows?

We haven't tested extensively under Windows, since we don't have Windows machines, but my guess is that it can be run. Every path-related line is handled through Python's pathlib (Path), which should take care of cross-platform differences.
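To illustrate why building paths this way should carry over to Windows, here is a minimal sketch using the standard pathlib module. The directory data/celebdf and the helper list_file are made up for the example; only the file name comes from this thread. The "pure" path classes are used so that both flavours can be shown on any operating system.

```python
from pathlib import PurePosixPath, PureWindowsPath


def list_file(root):
    """Join the list file name onto a dataset root without hard-coding a separator."""
    return root / "List_of_testing_videos.txt"


# "data/celebdf" is a hypothetical location used only for illustration.
posix = list_file(PurePosixPath("data") / "celebdf")
windows = list_file(PureWindowsPath("data") / "celebdf")

print(posix)    # data/celebdf/List_of_testing_videos.txt
print(windows)  # data\celebdf\List_of_testing_videos.txt
```

In real code a plain Path picks the right flavour for the machine it runs on, so the same joining expression works on both systems. Shell scripts are a separate matter, which is why the bash files are the part that does not carry over.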

zhuzhen1996 commented 3 years ago

Okay, then I will download it again and give it a try. Thank you very much!

zhuzhen1996 commented 3 years ago

Do you have a download link for the DFDC dataset? I searched on Google, but I can no longer download it.

nicobonne commented 3 years ago

Read the readme in the official repo I linked before: you have to send them a request, and then they send you the download link.

nicobonne commented 3 years ago

Do you have a download link for the DFDC dataset? I searched on Google, but I can no longer download it.

The DFDC can be downloaded from here

zhuzhen1996 commented 3 years ago

Okay, thank you very much for your patient guidance! If I have more questions in the future, I hope you can help me.

nicobonne commented 3 years ago

Feel free to open other issues if you encounter different problems