pdrhlik / sweary

R package that collects swear words from different languages.
MIT License
19 stars, 8 forks

Added French swear words used in Québec, Canada. #29

desautm closed 6 years ago

desautm commented 6 years ago

Hi,

I just added some French swear words that are used in Québec, Canada. If someone from France adds swear words, they will be different.

Good day.

Marc-André

pdrhlik commented 6 years ago

Hi Marc-André, thanks for the Canadian French!

Do you think that the two differ that much? From my point of view, it's the same language, just a different dialect. Therefore I would include both Québec French and France French in one file.

Does that mean a swear word in Québec is not necessarily a swear word in France? I don't really know. Could you tell us? If that's the case, it might be related to #1. If not, we might start thinking about creating dialects for different languages, based on the countries the languages are used in.

desautm commented 6 years ago

Hi,

Yes, the swear words from France and Québec differ. Just like you said, I would include both in your package.

Most swear words from Québec are not swear words in France, and vice versa. You could create dialects in the future; it could lead to some interesting analysis.

pdrhlik commented 6 years ago

We've discussed it on Gitter, and a nice, obvious solution is to use the IETF language tag.

To be consistent with existing standards, we should probably adopt the IETF language tag. In the form we need, it is composed of the two-letter ISO 639-1 code for the language and the two-letter ISO 3166-1 code for the country, joined by a hyphen. Canadian French would then become fr-CA.
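A minimal sketch of how such a tag could be split and normalized, written in Python purely for illustration. The helper names `split_language_tag` and `normalize_language_tag` are hypothetical and not part of {sweary}, and the sketch assumes only the simple "language" or "language-COUNTRY" form discussed here, not the full IETF tag grammar.

```python
import re

# Hypothetical helpers, not part of {sweary}. Only the simple
# "language" or "language-COUNTRY" form is handled (e.g. "fr", "fr-CA").
_TAG = re.compile(r"([A-Za-z]{2})(?:-([A-Za-z]{2}))?")


def split_language_tag(tag):
    """Return (language, country) with conventional casing.

    country is None when the tag has no country part.
    """
    match = _TAG.fullmatch(tag.strip())
    if match is None:
        raise ValueError(f"unsupported language tag: {tag!r}")
    language, country = match.groups()
    # Convention: language in lowercase, country in uppercase.
    return language.lower(), (country.upper() if country else None)


def normalize_language_tag(tag):
    """Rebuild the tag in its conventional form, e.g. 'FR-ca' -> 'fr-CA'."""
    language, country = split_language_tag(tag)
    return f"{language}-{country}" if country else language
```

With this, a plain `fr` tag and a dialect tag such as `fr-CA` can be told apart by checking whether the country part is present.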

Dialect codes summary: http://www.lingoes.net/en/translator/langcode.htm

I'll add a new commit to the PR. You seem to have a different version of {roxygen2}, and probably of {rmarkdown} too, and it kind of messed up the README.md file. I'll also delete the last "word" from your list, because it's a multiword phrase and we don't support those for now. I'm thinking about adding that in the future once we are confident in handling single words :-)

desautm commented 6 years ago

It seems like a great solution. Sorry for messing up the README file!

pdrhlik commented 6 years ago

No worries! I haven't thought about different versions of {roxygen2} yet so at least I'll try to figure out how to deal with this properly :-)