Open L2on1 opened 9 months ago
Hi @aradzie,
I'm trying to add support for the French letters âàêïîôù to keybr.com. Here are the steps I've taken:
1. Added letters to languages.ts: I added âàêïîôù to the French alphabet in keybr.com/packages/keybr-phonetic-model/lib/generate/languages.ts.
2. Incorporated high-frequency French words: I added words from frequence.csv to keybr.com/packages/keybr-content-words/lib/data/words-fr.json (using an intersection) and to keybr.com/packages/keybr-phonetic-model/lib/generate/corpus/words-fr.csv.gz (without an intersection).
3. Issue with character inclusion: the âàêïîôù characters are not appearing in the lessons.
Recompilation and testing steps: I followed the Getting Started guide and ran:
npm run compile
npm run build-dev
env DATABASE_CLIENT=sqlite npm test
./packages/devenv/lib/initdb.ts
npm start
Fork: I've created a fork here, if you want to check: https://github.com/L2on1/keybr.com
Request for assistance:
Thank you for your time and support
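For anyone debugging the same symptom, here is a minimal TypeScript sketch of the "intersection" idea from step 2: keep only the words whose characters all belong to the alphabet, then list the alphabet letters that no remaining word uses. This is not code from keybr.com. The function names `filterByAlphabet` and `unusedLetters`, the alphabet string, and the sample words are all assumptions made for illustration. Words are normalized to NFC first, because a decomposed "à" (an "a" followed by a combining accent) would not match a precomposed "à" in the alphabet.

```typescript
// Hypothetical helpers for illustration only; not part of keybr.com.

/** Keeps only the words whose every character belongs to the alphabet. */
function filterByAlphabet(words: readonly string[], alphabet: string): string[] {
  const allowed = new Set<string>(Array.from(alphabet.normalize("NFC")));
  return words
    .map((word) => word.normalize("NFC").toLowerCase())
    .filter((word) => Array.from(word).every((ch) => allowed.has(ch)));
}

/** Lists the alphabet letters that occur in none of the given words. */
function unusedLetters(words: readonly string[], alphabet: string): string[] {
  const seen = new Set<string>();
  for (const word of words) {
    for (const ch of Array.from(word.normalize("NFC"))) {
      seen.add(ch);
    }
  }
  return Array.from(alphabet.normalize("NFC")).filter((ch) => !seen.has(ch));
}

// Assumed alphabet (no "k" or "w") and a tiny sample list.
const frenchAlphabet = "abcdefghijlmnopqrstuvxyzàâçèéêîïôù";
const sampleWords = ["déjà", "où", "être", "kiwi", "voilà"];

const kept = filterByAlphabet(sampleWords, frenchAlphabet);
const unused = unusedLetters(kept, frenchAlphabet);

console.log(kept); // "kiwi" is dropped because "k" and "w" are not in the alphabet
console.log(unused); // letters that would have no word to practice with
```

A letter that shows up in `unused` has no supporting words in the list, which is one possible reason for a letter never appearing in generated text. Whether that is the cause here is not established in this thread.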
As @Icarwiz noticed in issue #134, there is no 'k' and 'w' in French. I added k and w to the French alphabet in keybr.com/packages/keybr-phonetic-model/lib/generate/languages.ts (in my fork: https://github.com/L2on1/keybr.com).
It still doesn't work. I still need to figure out how to add it in the app. Maybe something needs to be built?
Let me take a look at your fork.
I am myself trying to add these two letters, 'à' and 'ù', from a fresh new list of the 10000 most frequent French words. Here are my observations.
The words containing "à" are: "à", "là", "déjà", "voilà", "çà", "revoilà", "delà".
The only word containing "ù" is "où", just a single word!
When I tried to add the letter 'ù', it broke the tests. For some reason its frequency is zero. I guess it's a rounding error. I'll keep looking into it.
However, I successfully added the letter "à" in 54bdc976. I updated the website, so you can test it yourself.
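To illustrate how a letter found in a single word out of 10000 could end up with a frequency of zero, here is a small TypeScript sketch. It is only a guess at the kind of rounding involved, not the actual keybr.com computation: the helpers `wordShare` and `roundTo`, the three-decimal precision, and the synthetic word list are assumptions, and the share is counted per word rather than weighted by how often each word is used.

```typescript
// Hypothetical illustration of rounding a small share down to zero.

/** Fraction of words that contain the given letter. */
function wordShare(words: readonly string[], letter: string): number {
  if (words.length === 0) {
    return 0;
  }
  const hits = words.filter((word) => word.indexOf(letter) !== -1).length;
  return hits / words.length;
}

/** Rounds a value to a fixed number of decimal places. */
function roundTo(value: number, places: number): number {
  const scale = Math.pow(10, places);
  return Math.round(value * scale) / scale;
}

// Synthetic list: 9999 filler words plus the single word "où".
const synthetic: string[] = [];
for (let i = 0; i < 9999; i++) {
  synthetic.push("le");
}
synthetic.push("où");

const share = wordShare(synthetic, "ù"); // 1 word out of 10000
const coarse = roundTo(share, 3); // three decimal places
const fine = roundTo(share, 4); // four decimal places

console.log(share, coarse, fine);
```

With three decimal places the share of "ù" collapses to zero, while four decimal places preserve it. If something similar happens in the model, a test that expects every letter to have a non-zero frequency would fail.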
PS I am very sorry for the late reply. I was taking a break from working on this project.
By the way, may I ask you to take a look at the new French word list? Just to make sure that it does not contain any vulgar or obscene words. I am not a French speaker, so it's difficult for me to censor this list. It is enough to check only the first two to three thousand words.
The list was built by scanning movie subtitles from the Open Subtitles database.
I keep the development of different corpora in a different repository.
OK, I've had a quick look. It seems fine for the first 1000 words. I also searched with Ctrl+F and found these words:
Testicules -> Testicles
Vagin -> Vagina
Cul -> Ass
nazi
violeur -> rapist
viole -> rape
viol
violé
violée
violer
pédophile
pédales -> literally "pedals", but also a derogatory slang term for gay men
suicid*
--Trigger Warning words --
massacre
assaut
attentat
fusillade -> shooting
Drogue -> drug
Droguer
sanglant -> bloody
violent is OK because it means violent, as in English
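The flagged words above could be removed with a simple blocklist filter. The TypeScript sketch below is only an illustration, not code from the keybr.com repositories: the names `blocklist`, `isBlocked` and `censor` are hypothetical, and the blocklist holds just a few of the words listed above. Entries match whole words, so "cul" does not remove "calcul" and "viol" does not remove "violent". A trailing `*` marks a prefix match, as in `suicid*`.

```typescript
// Hypothetical blocklist filter for a word list; illustration only.

/** A few of the flagged words; a trailing "*" means "any word with this prefix". */
const blocklist: readonly string[] = [
  "testicules", "vagin", "cul",
  "viol", "viole", "violé", "violée", "violer", "violeur",
  "suicid*",
];

/** Returns true if the word matches a blocklist entry (case-insensitive). */
function isBlocked(word: string, patterns: readonly string[]): boolean {
  const w = word.normalize("NFC").toLowerCase();
  return patterns.some((pattern) => {
    const p = pattern.normalize("NFC").toLowerCase();
    return p.endsWith("*") ? w.startsWith(p.slice(0, -1)) : w === p;
  });
}

/** Returns the word list without the blocked words, preserving order. */
function censor(words: readonly string[], patterns: readonly string[]): string[] {
  return words.filter((word) => !isBlocked(word, patterns));
}

const candidates = ["Cul", "calcul", "violent", "violée", "suicidaire", "bonjour"];
const cleaned = censor(candidates, blocklist);

console.log(cleaned);
```

Whole-word matching keeps harmless words that merely contain a flagged substring, at the cost of having to list each inflected form explicitly.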
Thanks for that, buddy!
You're right to take breaks ! Hope that others can contribute as well ! I've tried to understand how it works, but I lack understanding of Node.js/TypeScript and other things. I've actually created a Docker for that.
The à is the most important, I think, because it is used in almost every sentence.
Can't wait to see and contribute to this repository :)
Hi, I'm currently learning to write with French - France - Standard 102.
There are already lessons for 'é', 'è' and 'ç', which is super cool. It would be very nice if you can add 'ù' and especially 'à' which is very common.
Like others said, it would be nice to also train other characters on the keyboard. Be careful about altgr+( = [ for french-azerty. Thanks for keybr.com this is an absolute banger ! <3