sign-language-processing / segmentation

Sign language pose segmentation model on both the sentence and sign level
MIT License

Generalization to other datasets #1

Closed 74Mazen74 closed 7 months ago

74Mazen74 commented 7 months ago

Regarding the sign language segmentation stage: is the segmentation model general to any sign language? If not, what data structure is needed to train my own segmentation model? Thanks in advance!

AmitMY commented 7 months ago

I believe it works generally; see section 5.3 of the paper: https://aclanthology.org/2023.findings-emnlp.846.pdf. If you want to retrain or fine-tune the model, you need sign-level segmented data alongside videos (or poses); see https://github.com/sign-language-processing/segmentation/blob/main/sign_language_segmentation/src/data.py#L66
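As a rough illustration of what sign-level segmented data can look like: the paper linked above describes a BIO tagging scheme, so one way to picture an annotation is a list of frame spans per video that gets expanded into one tag per pose frame. The sketch below assumes half-open `(start_frame, end_frame)` spans; the function name `segments_to_bio` and this span convention are hypothetical and are not taken from the repository's `data.py`.

```python
from typing import List, Tuple


def segments_to_bio(num_frames: int, segments: List[Tuple[int, int]]) -> List[str]:
    """Expand (start, end) frame spans (end exclusive) into per-frame B/I/O tags.

    Hypothetical helper for illustration only; not the repository's code.
    """
    tags = ["O"] * num_frames  # every frame starts as "outside any sign"
    for start, end in segments:
        if not 0 <= start < end <= num_frames:
            raise ValueError(f"Invalid segment ({start}, {end}) for {num_frames} frames")
        tags[start] = "B"  # first frame of the sign
        for i in range(start + 1, end):
            tags[i] = "I"  # remaining frames inside the sign
    return tags


# Three signs in a 10-frame clip; the second starts right where the first ends.
example_tags = segments_to_bio(10, [(1, 4), (4, 6), (8, 10)])
print(example_tags)
```

The "B" tag is what lets two back-to-back signs (frames 1-3 and 4-5 here) stay distinguishable even though no "O" frame separates them.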

74Mazen74 commented 7 months ago

@AmitMY I really appreciate your response. Could you please provide the steps for using your code to train or fine-tune on my own dataset?

AmitMY commented 7 months ago

For example (taken from here)

python -m sign_language_segmentation.src.train --dataset=mediapi_skel --pose=holistic --fps=0 --hidden_dim=256 --encoder_depth=4 --encoder_bidirectional=true --data_dir=/shares/volk.cl.uzh/zifjia/tensorflow_datasets_2 --wandb_dir=/data/zifjia/sign_language_segmentation_mediapi --seed=42 --run_name=E4s-lsf-42 --optical_flow=true --hand_normalization=true
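If you want to launch this command for several seeds, a small wrapper like the one below may help. It is a sketch, not part of the repository: `build_train_command` is a hypothetical helper, the flags are copied from the command above, the seed values are arbitrary, and the two directory arguments are placeholders for your own paths.

```python
import shlex
from typing import List


def build_train_command(seed: int, data_dir: str, wandb_dir: str,
                        dataset: str = "mediapi_skel") -> List[str]:
    """Assemble the training command from the thread as an argument list.

    Hypothetical helper; only the flags shown in the thread are used.
    """
    return [
        "python", "-m", "sign_language_segmentation.src.train",
        f"--dataset={dataset}",
        "--pose=holistic",
        "--fps=0",
        "--hidden_dim=256",
        "--encoder_depth=4",
        "--encoder_bidirectional=true",
        f"--data_dir={data_dir}",      # placeholder: where your datasets live
        f"--wandb_dir={wandb_dir}",    # placeholder: where run logs are written
        f"--seed={seed}",
        f"--run_name=E4s-lsf-{seed}",  # run name pattern copied from the thread
        "--optical_flow=true",
        "--hand_normalization=true",
    ]


if __name__ == "__main__":
    # Print one command per seed; pass the list to subprocess.run to execute it.
    for seed in (1, 2, 3):
        cmd = build_train_command(seed, "/path/to/tensorflow_datasets", "/path/to/wandb_runs")
        print(shlex.join(cmd))
```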

You also need to edit data.py here: https://github.com/sign-language-processing/segmentation/blob/main/sign_language_segmentation/src/data.py#L231 Implement what happens under your dataset name so that it loads your dataset.
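The kind of change meant here can be pictured as a dispatch on the dataset name. The sketch below is self-contained and hypothetical: `get_dataset`, `load_my_dataset`, the `my_dataset` name, and the example fields are illustrative assumptions, not the actual code or schema in `data.py`, which you should read at the linked line before adapting.

```python
from typing import Any, Dict, List

Example = Dict[str, Any]


def load_my_dataset(data_dir: str) -> List[Example]:
    """Hypothetical loader: return one record per video with its annotations.

    A real loader would read pose files and segment annotations from data_dir;
    this stub returns a single hard-coded record for illustration.
    """
    return [
        {
            "id": "video_0001",
            "fps": 25,
            "num_frames": 10,
            "signs": [(1, 4), (4, 6)],  # sign-level spans, in frames
            "sentences": [(1, 6)],      # sentence-level spans, in frames
        }
    ]


def get_dataset(name: str, data_dir: str) -> List[Example]:
    """Pick a loader based on the dataset name (the value given to --dataset)."""
    if name == "my_dataset":
        return load_my_dataset(data_dir)
    raise ValueError(f"Unknown dataset: {name}")


records = get_dataset("my_dataset", "/path/to/data")
print(records[0]["id"], records[0]["signs"])
```

With a branch like this in place, the training command would be run with `--dataset` set to the new name.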