sign / translate

Effortless Real-Time Sign Language Translation
https://sign.mt

[Feature] <Sign language text to Spoken language text> #107

Closed shahzeen031 closed 11 months ago

shahzeen031 commented 11 months ago

Problem

What model is used for converting the gestures into words and then converting those written words into English sentences? Where can I find the model that takes sign language text as input and generates spoken language text as output?

AmitMY commented 11 months ago

Currently, the application does not include any sign language video to SignWriting automatic transcription. We do have a model that can convert SignWriting to spoken language text. Is this what you are after?

shahzeen031 commented 11 months ago

Yes, I am looking for a model that can convert SignWriting to spoken language text. Where can I find it?

AmitMY commented 11 months ago

Models, instructions, and checkpoints are available at https://github.com/J22Melody/signwriting-translation based on this paper https://arxiv.org/abs/2210.05404

Is that all you were looking for?

shahzeen031 commented 11 months ago

Yes, but that repo contains all the models. I only need the model that converts sign language text to English text. For example, the query string "your name" should give the result "what is your name?"

AmitMY commented 11 months ago

I think what you are looking for is gloss-to-text models, and not SignWriting-to-text models.

Relevant background can be found here: https://research.sign.mt/#gloss-to-text

I do not have an available model that does this.

shahzeen031 commented 11 months ago

From this link, how can I call this API from my project?

curl "https://sign.mt/api/spoken-to-signed?from=en&to=ase&text=test"

When I run it, it returns an error:

{ "message": "Permission 'iam.serviceAccounts.signBlob' denied on resource (or it may not exist).", "executionId": "vrxkqgd9kzx1" }

AmitMY commented 11 months ago

For the moment, that API is not functional. If you want to translate SignWriting (for example, M518x518S10620487x465S2ff00482x483S21700497x464) to spoken language text, you will need to use the models from J22Melody that I linked above.
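
For readers unfamiliar with the example string above, here is a minimal sketch of how it can be split into its parts. It assumes the usual Formal SignWriting (FSW) layout: a box marker with a coordinate pair, followed by symbols, each written as `S` plus five hex digits and an `XXXxYYY` position. The names `Symbol` and `parse_fsw` are made up for this illustration and are not part of sign.mt or the J22Melody repository.

```python
import re
from typing import List, NamedTuple, Tuple


class Symbol(NamedTuple):
    key: str  # e.g. "S10620": "S" plus five hex digits
    x: int    # horizontal position inside the signbox
    y: int    # vertical position inside the signbox


# Assumed FSW layout: a box marker (B, L, M or R) with a coordinate pair,
# then one or more symbols, each a key followed by "XXXxYYY".
BOX_RE = re.compile(r"([BLMR])(\d{3})x(\d{3})")
SYMBOL_RE = re.compile(r"(S[0-9a-f]{5})(\d{3})x(\d{3})")


def parse_fsw(sign: str) -> Tuple[str, Tuple[int, int], List[Symbol]]:
    """Split one FSW sign into (box marker, box coordinates, symbols)."""
    box = BOX_RE.search(sign)
    if box is None:
        raise ValueError(f"no signbox found in {sign!r}")
    # Only look for positioned symbols after the box marker.
    symbols = [
        Symbol(key, int(x), int(y))
        for key, x, y in SYMBOL_RE.findall(sign, box.end())
    ]
    return box.group(1), (int(box.group(2)), int(box.group(3))), symbols


marker, box_xy, symbols = parse_fsw(
    "M518x518S10620487x465S2ff00482x483S21700497x464"
)
for symbol in symbols:
    print(symbol.key, symbol.x, symbol.y)
```

This only inspects the string; turning it into spoken language text still requires a trained translation model.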

priyanshmathur commented 11 months ago

@AmitMY I have a similar question. Do you know of a way to convert words mapped from the glosses of sign gestures into a complete sentence, just like the example in your readme.md?

Also, I am not referring to SignWriting, as it is beyond what I need.

AmitMY commented 11 months ago

The example in the README uses SignWriting. The idea there is to segment a sentence into signs, then transcribe each sign into SignWriting, and then have the translation model take that SignWriting as its input.
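
The three stages described above can be sketched as a simple composition. This is only an outline of the data flow: the stage types and the function `translate_signing` are hypothetical placeholders, and each stage would have to be backed by a real model.

```python
from typing import Callable, List

# Hypothetical stage signatures, not APIs of sign.mt or any linked repository.
Segmenter = Callable[[str], List[str]]   # signed sentence -> one item per sign
Transcriber = Callable[[str], str]       # one sign -> SignWriting string
Translator = Callable[[List[str]], str]  # SignWriting strings -> spoken text


def translate_signing(
    signed_sentence: str,
    segment: Segmenter,
    transcribe: Transcriber,
    translate: Translator,
) -> str:
    """Run segmentation, transcription, and translation in order."""
    signs = segment(signed_sentence)
    signwriting = [transcribe(sign) for sign in signs]
    return translate(signwriting)
```

Keeping the stages as separate callables means any one of them can be swapped or tested without touching the others.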

priyanshmathur commented 11 months ago

Certainly! I understand. I'm curious about the dataset you utilized to extract the sentences that are then divided into signs. I'm searching for a straightforward NLP model capable of constructing full sentences when provided with a group of words. I'm interested in discovering whether it's feasible to input a set of words, possibly in a jumbled order, and have the model output the most likely sentences these words might be referencing. @AmitMY

AmitMY commented 11 months ago

Training such a model is straightforward (code in the appendix): https://arxiv.org/pdf/2105.07476.pdf

However, if that were the problem, I would just use a large language model, condition it on the relevant jumbled words, and ask it to generate a sentence.
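
As a rough sketch of the large language model suggestion, the function below only builds the prompt text. The function name and the prompt wording are assumptions made for illustration, and sending the prompt to a model is left out because no particular provider is named in this thread.

```python
from typing import Sequence


def build_sentence_prompt(words: Sequence[str], language: str = "English") -> str:
    """Build a prompt asking a language model to form a sentence from words."""
    cleaned = [word.strip() for word in words if word.strip()]
    if not cleaned:
        raise ValueError("at least one word is required")
    word_list = ", ".join(cleaned)
    return (
        f"The following words may be in a jumbled order: {word_list}.\n"
        f"Write the single most likely {language} sentence that uses them. "
        "Reply with the sentence only."
    )


prompt = build_sentence_prompt(["name", "your"])
print(prompt)
```

The returned string can then be passed to whichever model is available.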

Note, though, that gloss translation is not sign language translation and cannot scale to real-life signing.

For a long list of datasets, see here: https://research.sign.mt/#resources

(closing this issue since it is losing relevance to this repository)