A question about preprocessing `ucf-101` 🤗

NUS-HPC-AI-Lab / VideoSys

VideoSys: An easy and efficient system for video generation

Apache License 2.0


I have the UCF-101 dataset, but its format does not seem to match what preprocess.py expects.

My copy of UCF-101 (from https://www.crcv.ucf.edu/data/UCF101.php) has two folders.

The `UCF-101` folder:

$ tree -L 1 UCF-101/
UCF-101/
├── ApplyEyeMakeup
├── ApplyLipstick
├── Archery
...
├── WritingOnBoard
└── YoYo

And the `ucfTrainTestlist` folder:

$ tree -L 1 ucfTrainTestlist/
ucfTrainTestlist/
├── classInd.txt
├── testlist01.txt
├── testlist02.txt
├── testlist03.txt
├── trainlist01.txt
├── trainlist02.txt
└── trainlist03.txt
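
For reference, here is a minimal sketch of how these list files can be parsed. It assumes the usual layout of the official split files: each `classInd.txt` line is `<index> <ClassName>`, each `trainlistNN.txt` line is `<ClassName>/<video>.avi <index>`, and `testlistNN.txt` lines carry only the path. The function names `parse_class_index` and `parse_split` are hypothetical and not part of VideoSys.

```python
def parse_class_index(lines):
    """Map class index (as a string) to class name, e.g. "1" -> "ApplyEyeMakeup"."""
    mapping = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            index, name = parts
            mapping[index] = name
    return mapping


def parse_split(lines):
    """Return (relative_path, class_name) pairs from a train or test list."""
    pairs = []
    for line in lines:
        parts = line.split()
        if not parts:  # skip blank lines
            continue
        rel_path = parts[0]
        # The class name is the folder part of the path, so the trailing
        # index (assumed absent in test lists) is not needed to recover it.
        class_name = rel_path.split("/")[0]
        pairs.append((rel_path, class_name))
    return pairs
```

Because the class name is taken from the path itself, the same function should work for both train and test lists under the assumptions above.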

I can convert them with a script like the one below, but how is this supposed to be handled? 🤗❤

import csv


def split_by_capital(name):
    # BoxingPunchingBag -> Boxing Punching Bag
    new_name = ""
    for i in range(len(name)):
        if name[i].isupper() and i != 0:
            new_name += " "
        new_name += name[i]
    return new_name


# Map class index -> class name, e.g. "1" -> "ApplyEyeMakeup"
class_d = {}
with open("./ucfTrainTestlist/classInd.txt", "r") as f:
    class_l = f.readlines()
    for kv in class_l:
        k, v = kv.strip("\n").split(" ")
        class_d[k] = v

# Collect the lines of all three train lists
data_l = []
for list_name in ("trainlist01.txt", "trainlist02.txt", "trainlist03.txt"):
    with open("./ucfTrainTestlist/" + list_name, "r") as f:
        data_l.extend(f.readlines())

# Turn each "<path> <class index>" line into a (video path, caption) row
for i in range(len(data_l)):
    k, v = data_l[i].strip("\n").split(" ")
    data_l[i] = "./videos/UCF-101/" + k, split_by_capital(class_d[v])

# newline="" avoids blank rows between records on Windows
with open("./ucfTrainTestlist/data_index.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(data_l)

print("Finish!")
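
As an alternative, here is a sketch that builds the same two-column CSV (video path, caption) directly from the `UCF-101/<ClassName>/*.avi` folder layout, without reading the list files. This may matter because the three train lists appear to be alternative splits of the same videos, so concatenating them likely yields duplicate rows. Whether preprocess.py expects this exact CSV layout is an assumption carried over from the script above, and `camel_to_words` and `build_index` are hypothetical helper names.

```python
import csv
import re
from pathlib import Path


def camel_to_words(name):
    """Insert a space before each capital letter except the first character."""
    return re.sub(r"(?<!^)(?=[A-Z])", " ", name)


def build_index(video_root, csv_path):
    """Write one (video_path, caption) row per .avi file under video_root.

    Assumes the layout <video_root>/<ClassName>/<video>.avi and uses the
    class folder name, split into words, as the caption.
    """
    root = Path(video_root)
    rows = [
        (str(video), camel_to_words(video.parent.name))
        for video in sorted(root.glob("*/*.avi"))
    ]
    # newline="" avoids blank rows between records on Windows
    with open(csv_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return rows
```

With the paths used above, it could be called as `build_index("./videos/UCF-101", "./ucfTrainTestlist/data_index.csv")`. Each video is listed exactly once, regardless of which split it belongs to.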


A question about preprocessing `ucf-101` 🤗 #101