lingua-libre / SignIt

🌻 Lingua Libre SignIt web-browser extension translates selected word in French Sign Language via an elegant pop up so you learn sign language while reading online.
https://addons.mozilla.org/en-US/firefox/addon/lingua-libre-signit/
MIT License
11 stars 13 forks source link

Refine regex in `function wordToFiles( word )` #51

Open hugolpz opened 5 months ago

hugolpz commented 5 months ago

Need to define issue better. It seems to work as intended within the top right popup. Search of accueil returns video of accueil (lieu). Maybe was just caching issue with the API.

Current intersection from (1) clicked-word via window.select() and (2) list of videos recordings is handled by function wordToFiles( word ).

This function is too restrictive and must be more inclusive.

May need edit 3 background-script.js sections :

Additional cases to handle

Selected word Existing word string Existing filename string
coucou coucou (accrocher le regard) File:LL-Q33302_(fsl)-Hugo_enrésidence-coucou(accrocher_le_regard).webm
accueil accueil (lieu)
accueil
File:LL-Q33302_(fsl)-Hugo_enrésidence-accueil(lieu).webm
File:LL-Q33302_(fsl)-Taliba31-accueil.webm

Screenshot

As of Monday noon , it failed. (Recording on Saturday noon...)

Recorded video file exists, but SignIt don't find it !

In the 4th image, an other video (File:LL-Q33302_(fsl)-Taliba31-accueil.webm) is shown but only it. A second one (File:LL-Q33302_(fsl)-Hugo_en_résidence-accueil_(lieu).webm) should be available.

[EDIT]

As of Monday evening , it works. (Recording on Saturday noon...)

Participative listing

Other cases are observed to not work. You can add them in the table above to ease the work of the developer.

shashankiitbhu commented 3 months ago

@hugolpz Hi, is anyone working on this? If not , Can I work on this issue?

pushkar707 commented 3 months ago

I have solved this to some extent.

I have put a check to find if there's a single word from the complete selection whose record is available.

If there is, we return a record for the first word whose record is there.

if this is okay, I will make a PR.

hugolpz commented 3 months ago

@pushkar707 , this is not okay, this is great Q___Q

Your PR is welcome yes ! 🌻 Let's test it.

pushkar707 commented 3 months ago

Thanks a lot :), I have already made PR #57.

I was not able to link it to the issue. Please check it.

pushkar707 commented 3 months ago

Do you know how we can proceed with this issue? Should we display a carousel of all the available words, if multiple words are selected?

hugolpz commented 3 months ago

@pushkar707 I never though this far to be honest. I took over this extension in a lower point it took be some time to bring it up to current status.

Convention

What I can confirm is we have videos names with the convention of word (clarification), such as :

Note: our videos are French Sign Language for now due to lack of contributors from other countries but the principle is true.

What we want

Given text such as:

hi how are you ?

When double clicking hi, the 3 videos cited above should be available on the video gallery.

I'm not sure the split by space is the best, but it implements a split and collection of words from the split, so it's interesting. 👍🏼 We can build upon your code, split by (, and keep parts[0] if we decide to.

kabir-afk commented 1 month ago

@pushkar707 I never though this far to be honest. I took over this extension in a lower point it took be some time to bring it up to current status.

Convention

What I can confirm is we have videos names with the convention of word (clarification), such as :

  • hi (hello).webm
  • hi (surprise).webm
  • hi (scream).webm

Note: our videos are French Sign Language for now due to lack of contributors from other countries but the principle is true.

What we want

Given text such as:

hi how are you ?

When double clicking hi, the 3 videos cited above should be available on the video gallery.

I'm not sure the split by space is the best, but it implements a split and collection of words from the split, so it's interesting. 👍🏼 We can build upon your code, split by (, and keep parts[0] if we decide to.

You are right, while approach taken by @pushkar707 is ideal , it needs a lil bit of refactoring as it dies down when punctuation marks come in between. For example it'll work for croissant dépassant. But if something like croissant, dépassant, i.e., a croissant with a comma , it doesn't work as there is no croissant, inside records . I'll try to refactor it. Also this will only work in FF for now since records aren't populated properly in sw.js for chrome

hugolpz commented 3 weeks ago

@kabir-afk , regex improvement discussions come here.

hugolpz commented 3 weeks ago

Hello @kabir-afk , I have difficulties to understand the current regex. But simply put, if it split by punctuation except -, then we can close this issue for now.

See also :

kabir-afk commented 3 weeks ago

Naah , I don't think so. The issue states that sometimes even if the files exist ( as indicated by colorThoseWords function ), SignIt is unable to find it. This shows that function is working as expected and like you mentioned before , sparqlSignVideosQuery needs to be worked upon in order to address the issue better.

As far as the above regex is concerned, you are right that it checks for punctuations/non-word characters and then later applies global and insensitive case flags inside the constructor function. We could have also used /w or something like that , but that would have ignored the accented letters as well, which shouldn't be the case , hence the big and complex regex.