get-alex / alex

Catch insensitive, inconsiderate writing
https://alexjs.com
MIT License
4.81k stars 207 forks

Internationalisation -- Help with Spanish #212

Open CRomano31415 opened 6 years ago

CRomano31415 commented 6 years ago

This issue stems from the Internationalisation issue #202.

Currently working on:

Current challenges:

wooorm commented 6 years ago

@CRomano31415 Nice! 🎉

> Gathering list of words in Spanish, to be placed in sp-json file, get one started so we can modify!

Push early so people can help 👍

> how to tag lists of words as there is considerable difference in meaning based on geographical areas, ideas?

Depends, are you working on profanities or equality?

CRomano31415 commented 6 years ago

Profanities.

If anyone else can push before me, go for it - currently at work and won't be able to push for a few hours.

wooorm commented 6 years ago

In case of profanities, maybe there's some lists like this already out there!

baezor commented 6 years ago

What is the status? How can I help? @CRomano31415

CRomano31415 commented 6 years ago

I've found some lists on discussion boards, but no official ones that I have been able to find. You can start putting it together, though.

MrBenJ commented 6 years ago

Que paso? Se habla espanol! Puedo ayudar.

What's up? I know how to speak Spanish. I can help. I know a few inappropriate Spanish words and can contribute. Is it in the sp-json file?

MrBenJ commented 6 years ago

@CRomano31415 - I think the international two-character code for Spanish is es, not sp. Do you think the file should be named es-json instead?

http://www.loc.gov/standards/iso639-2/php/code_list.php

CRomano31415 commented 6 years ago

@MrBenJ Sounds good, do you want to start it? I have a huge list of words that I'm cleaning up. I wasn't sure how to code them, since some are not profane in some countries, but then I realized that if a word is profane to some, it should be flagged as profane, so I'm almost done cleaning up. If you upload a JSON file, I can add to it. I was looking at fr-json and I wasn't sure what the numbers next to the words were, do you know?

MrBenJ commented 6 years ago

@CRomano31415 - Sure thing. I'd be happy to start it off.

Where are you finding es-json or fr-json? I'm new to this project and don't know where these files are. Let me know and I'll get started right away :)

CRomano31415 commented 6 years ago

Well, that is a great question, because I could have sworn I saw fr-json not too long ago in one of the remarks... This issue points to it, but now I can't find it either.

MrBenJ commented 6 years ago

Ahh! It looks like that PR hasn't been merged into words/cuss yet. Looking at what @wooorm has commented on the issue you're referencing, I'll go ahead and follow the same naming conventions they're asking the other contributor to use. We can work on this PR together and get this moving :). Happy Hacktoberfest!

CRomano31415 commented 6 years ago

Ok @MrBenJ here's the list. We need help from people picking out the profane terms (not all of them belong in profanity, some may belong elsewhere) and adding them to a JSON file (in the right location, not sure where that is).

MrBenJ commented 6 years ago

Hey @CRomano31415 - I wrote a little script to JSONify your list and added a couple of words that weren't on there. I made a pull request to your own repo here:

https://github.com/CRomano31415/SpanishProfanity/pull/1

In the comments, I mentioned that I set everything to a value of 2 for the time being.

My coworker @fportela-ns actually helped out a little bit with some insults that were missing as well.
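For illustration only, a conversion like the one described above might look like the following sketch. It is not the script from the pull request. It assumes a plain-text list with one term per line and gives every term the same default value of 2, as mentioned in the comment. The function name `wordlist_to_ratings` and the placeholder terms are hypothetical.

```python
import json


def wordlist_to_ratings(text: str, default_rating: int = 2) -> dict:
    """Map each non-empty line of `text` to `default_rating`.

    Hypothetical helper: terms are trimmed and lower-cased, and
    duplicate lines collapse into a single key.
    """
    ratings = {}
    for line in text.splitlines():
        word = line.strip().lower()
        if word:  # skip blank lines
            ratings[word] = default_rating
    return ratings


if __name__ == "__main__":
    # Placeholder terms, not taken from the actual list.
    sample = "termino-uno\nTermino-Dos\n\ntermino-uno\n"
    # sort_keys keeps the output stable, which makes diffs easier to review;
    # ensure_ascii=False preserves accented characters such as "ñ".
    print(json.dumps(wordlist_to_ratings(sample), indent=2,
                     sort_keys=True, ensure_ascii=False))
```

Starting every term at the same value keeps the conversion mechanical; individual ratings can then be adjusted by hand during review.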

CRomano31415 commented 6 years ago

Awesome!! I have no idea where to put that JSON file here, maybe @wooorm can guide us? Titus, we have a JSON file with Spanish profanity, which repo do we add it to?

MrBenJ commented 6 years ago

Hey @CRomano31415 - My coworker @fportela-ns wants to help out and he's a native Spanish speaker. We found a few words in there that don't quite belong: chancla is similar to sandal, and aguacates is @fportela-ns's favorite food. He's gonna fork that repo and start working on it.

MrBenJ commented 6 years ago

Also, as a fun side note, how many times in one's life can you say, "I have a file with Spanish profanity! Where do I put it!?" I'm giggling over here.

CRomano31415 commented 6 years ago

I feel pretty good that my contribution to society today has been a list of profane words. Yes, please @fportela-ns pull and fix. To be transparent, I just kind of scraped the web and cleaned it up LOL I didn't quite invest the time in modifying it. And also some words have different meaning in different countries so...idk what to do about those...

fportela-ns commented 6 years ago

hey @CRomano31415 have the PR ready 😄 honestly there are a LOT of words that I've no idea what they mean haha. But here's my contribution https://github.com/CRomano31415/SpanishProfanity/pull/2.

I just removed a few words that I consider can't be profane in any way, and marked with a 1 the words that are profane only depending on the context.
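A review pass of this kind can be sketched as follows. This is an illustration, not the code in the pull request. It assumes the list is a mapping from term to number, where 2 is the default set earlier in the thread and 1 marks terms that are profane only in some contexts. `apply_review` and the `termino-*` placeholders are hypothetical; chancla and aguacates are the non-profane examples mentioned earlier in the thread.

```python
def apply_review(ratings: dict, not_profane: set, context_dependent: set) -> dict:
    """Return a reviewed copy of `ratings`.

    Terms in `not_profane` are dropped; terms in `context_dependent`
    are set to 1; every other term keeps its current value.
    """
    reviewed = {}
    for word, rating in ratings.items():
        if word in not_profane:
            continue  # removed from the list entirely
        reviewed[word] = 1 if word in context_dependent else rating
    return reviewed


if __name__ == "__main__":
    ratings = {"chancla": 2, "aguacates": 2, "termino-x": 2, "termino-y": 2}
    reviewed = apply_review(
        ratings,
        not_profane={"chancla", "aguacates"},
        context_dependent={"termino-x"},
    )
    print(reviewed)
```

Returning a new mapping rather than editing in place leaves the scraped list intact, so the reviewed version can be compared against it.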

wooorm commented 6 years ago

@fportela-ns You mean @CRomano31415 I think 😉

This looks like a really good start! Awesome work all! How do you all want to proceed? How can I help?

MrBenJ commented 6 years ago

@fportela-ns @wooorm @CRomano31415

I think the next best course of action is to grab @CRomano31415's file and make a proper pull request to this repo: https://github.com/words/cuss

I've already submitted a PR to get-alex/alex and I'd like @CRomano31415 and @fportela-ns to get some credit for contributing to this project, so I'll let them go ahead and make the appropriate PRs to get credit/contribution/hacktoberfest points :).

AhmedRedaAmin commented 6 years ago

@MrBenJ Hey! Can you pass over the jsonify script you used on the list? I will very much need it actually :'D I am currently using Claudia's template, and manually adding them to the JSON file seems incredibly ludicrous.

MrBenJ commented 6 years ago

Hey @AhmedRedaAmin

Sure thing. I whipped up an NPM package for you as a CLI tool: https://www.npmjs.com/package/wordlist-to-json

Here's the git repo if you feel like contributing :) https://github.com/MrBenJ/wordlist-to-json

You can install it globally with:

npm install -g wordlist-to-json

And use it like this:

wordlist-to-json --file my_wordlist.txt --space 2

Hope you like it :)

AhmedRedaAmin commented 6 years ago

Thanks @MrBenJ, that is AWESOME! I'll definitely look into that repo as soon as I finish what's on my hands currently. That thing is a big help mate, thanks again.

wooorm commented 6 years ago

Soo, we landed Spanish in cuss! Thanks @CRomano31415, @baezor, @MrBenJ, and @fportela-ns for working on this! 🎉

The next step is to add something for Spanish to retext-equality. Anyone able and interested to work on that?

MrBenJ commented 6 years ago

Hey @wooorm - I have some time. I should be able to help out on retext-equality. Might not get to it today, but probably by the end of the week I can lend a hand.

wooorm commented 6 years ago

@MrBenJ Awesome! And that's fine, take your time! :)

Leodau commented 6 years ago

Hey all, just added some ES Venezuelan dialect profanities into cuss 👮

feat(Profanity_ES): Adding Venezuelan dialect profanities.

There is indeed a problem with all the ES profanity lists I find on the net. Since ES is so geo-sensitive, the only real way to solve this is to avoid adding any word that is commonly used by public speakers, teachers, TV and so on, and that is correctly defined in the dictionary, even though some dialects might use it as a profanity.