FangyunWei / SLRT


process of Dataset generation #34

Open mohiburnabil opened 1 year ago

mohiburnabil commented 1 year ago

dev: data/phoenix-2014t/phoenix-2014t_cleaned.dev
test: data/phoenix-2014t/phoenix-2014t_cleaned.test
train: data/phoenix-2014t/phoenix-2014t_cleaned.train

What is the process for generating the .train, .dev, and .test files referenced above?

2000ZRL commented 11 months ago

Each split file contains a list with one dict per data sample, holding that sample's information: [{'name': xx, 'gloss': 'A B C', 'text': 'a b c', 'num_frames': 100}, {...}, ..., {...}]
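For anyone building their own split, here is a minimal sketch of writing and reading a file with that structure. The serialization format is an assumption on my side: the thread does not state it, and I am guessing the files are gzip-compressed pickles. The helper names (`build_split`, `save_split`, `load_split`), the sample names, and the values are all hypothetical.

```python
import gzip
import pickle
import tempfile
from pathlib import Path


def build_split(samples):
    """Turn (name, gloss, text, num_frames) tuples into the list-of-dicts layout."""
    return [
        {"name": name, "gloss": gloss, "text": text, "num_frames": num_frames}
        for name, gloss, text, num_frames in samples
    ]


def save_split(records, path):
    # ASSUMPTION: gzip-compressed pickle; the thread does not confirm the format.
    with gzip.open(path, "wb") as f:
        pickle.dump(records, f)


def load_split(path):
    with gzip.open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical annotations; in practice these would come from the dataset's
# own annotation files and the frame counts of each video.
raw = [
    ("dev/sample_0001", "A B C", "a b c", 100),
    ("dev/sample_0002", "D E", "d e", 57),
]
records = build_split(raw)

# Round trip through a temporary file to check the layout survives.
with tempfile.TemporaryDirectory() as tmp:
    split_path = Path(tmp) / "example_cleaned.dev"
    save_split(records, split_path)
    reloaded = load_split(split_path)
```

If the repository's loader expects a different container (plain pickle, JSON, and so on), only `save_split` and `load_split` would need to change; the list-of-dicts layout stays the same.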

duyuankai1992 commented 3 months ago

Excuse me, I have a question: what is the function of pami0, pami1, and pami2 in the 'alignments' field? Here is one sample:

{'name': 'train/27January_2013_Sunday_tagesschau-8842', 'signer': 'Signer01', 'gloss': 'BLEIBEN WIND', 'text': 'es bleibt windig .', 'num_frames': 43, 'alignments': {'pami0': '0 406 406 406 406 407 407 408 408 408 2986 2986 2986 2986 2987 2987 2987 2988 2988 2988 2988 2988 2988 2988 2988 2988 2988 2988 2988 2988 2988 2988 2988 2988 3172 3172 3172 3173 3173 3173 3174 3174 3174', 'pami1': '0 0 20 20 20 28 28 3 3 0 0 0 0 0 0 0 0 36 36 12 12 30 0 0 0 0 0 0 0 0 0 0 0 0 14 14 23 23 23 23 23 23 23', 'pami2': '0 0 0 13 13 13 13 13 0 0 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 0 0 0 0 0 0 0 0 0'}, 'sign_features': tensor([])}
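In case it helps with inspecting these fields: each pami string in the sample above appears to hold one integer label per frame (the count seems to match num_frames). The sketch below parses the strings, checks that assumption, and collapses consecutive identical labels into segments. It does not answer what each stream encodes. The shortened sample and the helper names (`parse_alignment`, `run_lengths`) are hypothetical.

```python
from itertools import groupby


def parse_alignment(s):
    """Split a space-separated label string into a list of ints."""
    return [int(tok) for tok in s.split()]


def run_lengths(labels):
    """Collapse consecutive identical labels into (label, length) pairs."""
    return [(label, len(list(group))) for label, group in groupby(labels)]


# Hypothetical, shortened sample in the same layout as the one in this thread.
sample = {
    "num_frames": 8,
    "alignments": {
        "pami0": "0 406 406 407 407 408 408 408",
        "pami1": "0 0 20 20 20 28 28 3",
        "pami2": "0 0 0 13 13 13 13 13",
    },
}

parsed = {key: parse_alignment(val) for key, val in sample["alignments"].items()}

# ASSUMPTION being checked: one label per frame in every stream.
lengths_ok = all(len(labels) == sample["num_frames"] for labels in parsed.values())

# Segments of the first stream as (label, number of consecutive frames).
segments = run_lengths(parsed["pami0"])
```

Running `run_lengths` on each stream of a real sample shows how long each label is held, which may make it easier to compare the three streams side by side.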