Open rem0g opened 1 month ago
Oline has also created a synthetic dataset of 200 NGT glosses, consisting of avatar videos that show the signs from different angles.
If you would like to train your own identification model, go ahead, that is the process, but it should belong in a different repository - https://github.com/sign-language-processing/recognition for example.
I however don't think it is a great use of your time. Instead, if you want to perform classification, right now, get a vector to represent each sign in your dictionary (using the recognition model before softmax, or CLIP), then you can perform kNN on new videos to find a match. This requires no training, allows you to annotate data on the fly, and will result in visually cluster-able data.
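The kNN idea above can be sketched in a few lines. This is a minimal illustration, assuming you already have one embedding vector per dictionary sign (taken from the recognition model before softmax, or from CLIP); the function name `cosine_knn` and the dictionary layout are my own, not from any library.

```python
import numpy as np

def cosine_knn(query: np.ndarray, dictionary: dict, k: int = 3):
    """Return the k gloss labels whose vectors are closest to `query`.

    `dictionary` maps gloss name -> embedding vector (all the same dimension).
    Cosine similarity is used, so vector magnitudes don't matter.
    """
    glosses = list(dictionary)
    matrix = np.stack([dictionary[g] for g in glosses])  # (n_signs, dim)
    # Cosine similarity between the query and every dictionary vector
    sims = matrix @ query / (
        np.linalg.norm(matrix, axis=1) * np.linalg.norm(query) + 1e-9
    )
    top = np.argsort(-sims)[:k]
    return [(glosses[i], float(sims[i])) for i in top]
```

Because there is no training step, adding a new sign to the dictionary is just adding one more key/vector pair, and the same vectors can be projected (e.g. with t-SNE or UMAP) to get the visually cluster-able data mentioned above.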
I was sick for a while, but now I'm better.
Can you expand a bit more on how to get a vector from a sign? Or lay out all the steps in a more practical way? I am not really proficient with training models yet, as I am still learning.
Thank you!
```python
from sign_language_recognition.kaggle_asl_signs import predict
from pose_format import Pose

# Load a pose file and run it through the model to get an embedding vector
with open("file.pose", "rb") as f:
    data_buffer = f.read()
pose = Pose.read(data_buffer)
vector = predict(pose)
```
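To turn this into the dictionary needed for kNN matching, you can embed every `.pose` file in a folder once and keep the results in a gloss-to-vector map. A small sketch, assuming one file per gloss named `<GLOSS>.pose` (the folder layout and the `embed` parameter are my assumptions; plug in whatever model you use):

```python
from pathlib import Path

def build_dictionary(folder: str, embed) -> dict:
    """Map each <GLOSS>.pose file in `folder` to its embedding vector.

    `embed` takes the raw file bytes and returns a vector -- for example,
    using the snippet above: embed=lambda b: predict(Pose.read(b)).
    """
    vectors = {}
    for path in Path(folder).glob("*.pose"):
        # File stem doubles as the gloss label, e.g. HOUSE.pose -> "HOUSE"
        vectors[path.stem] = embed(path.read_bytes())
    return vectors
```

New dictionary videos can then be annotated on the fly: embed them, add the pair, done.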
I have run inference with the ASL signs model on NGT videos, and I have to say the model produced some surprising results.
The model recognized Dutch signs and produced glosses from similar ASL signs.
Now I want to prepare a dataset from Corpus NGT, SignCollect (which consists of 4K 60 FPS videos), and Signbank.
Then I have to run the training script from the ASL competition.
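One way to sketch that preparation step is to collect (video, gloss) pairs from the different sources into a single manifest CSV. The column names below mirror the Kaggle ASL Signs layout (`path`, `participant_id`, `sequence_id`, `sign`), but that is an assumption on my part; check them against the actual training script before relying on this:

```python
import csv

def write_manifest(examples, out_path: str) -> None:
    """Write training examples to a manifest CSV.

    `examples` is an iterable of (path, participant_id, sequence_id, sign)
    tuples -- column names are assumed from the Kaggle ASL Signs layout.
    """
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "participant_id", "sequence_id", "sign"])
        writer.writerows(examples)
```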
@AmitMY is the preparation I have in mind right?