thechangelog / nightly

Changelog Nightly unearths the hottest repos on GitHub before they blow up. Subscribe for free. Keep up.
https://changelog.com/nightly
MIT License

non-english repos #31

Closed boyeln closed 5 years ago

boyeln commented 5 years ago

Lately, there have been a lot of non-English repos (like this 17-times champion), mostly Chinese from what I can tell. Since I don't speak Chinese, these repos are only noise and make the newsletter less valuable to me. I usually read Changelog Nightly on my phone (on my way to work), so using Google Translate is not a viable option.

I don't want to advocate removing repos with non-English READMEs, since I believe a lot of cool stuff is happening outside of English-speaking circles. However, I can immediately think of two ways of making the newsletter more readable to me:

  1. Allow users to subscribe to an English-only newsletter or an interlingual version.
  2. Create a separate section within Changelog Nightly for non-English repos.

I think it's important to think thoroughly before implementing a solution. In the GitHub environment, non-English repositories can be seen as a minority, and just filtering them out might be a bad decision. Also, this is just a suggestion that would make the newsletter better for ME, and might not apply to everyone else. My suggestions are only meant as a conversation starter, as I think this is a conversation that's needed.

jerodsanto commented 5 years ago

@boyeborg this is something I've considered for a long time. I agree with you that the frequency of non-English repos is on the rise lately.

We already have rudimentary language detection support in the code, so that part is relatively easy on the technical side. (Granted, it's only used to avoid tweeting certain repos, which has a low-risk failure state, so I'm not sure how accurate it is. I'm sure there are jargon-filled English-language repos that it also determines are not English.)
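As a rough illustration of why such detection can be unreliable, here is a minimal sketch of one crude heuristic: flag a description when most of its letters are not Latin-script. This is not how the existing detection in Nightly works; the function names and the 0.5 threshold are hypothetical, and the check cannot tell English apart from other Latin-script languages such as French or Norwegian.

```python
import unicodedata


def latin_ratio(text: str) -> float:
    """Return the fraction of alphabetic characters that are Latin-script."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        # No letters to judge (empty or emoji-only); treat as Latin.
        return 1.0
    latin = sum(
        1 for ch in letters
        if unicodedata.name(ch, "").startswith("LATIN")
    )
    return latin / len(letters)


def probably_needs_translation(description: str, threshold: float = 0.5) -> bool:
    """Hypothetical check: True when fewer than `threshold` of letters are Latin."""
    return latin_ratio(description) < threshold
```

A script check like this avoids misclassifying jargon-heavy English, at the cost of missing every non-English language written in Latin characters.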

Both of your suggestions have merit, but I'd lean toward #1 for the best user experience. Another thought I had was to include a "translate" link next to descriptions that are determined to be non-English and link out to Google Translate with the data pre-populated. What do you think of that idea?

jerodsanto commented 5 years ago

@boyeborg polling the tweeters to see what they think about this. Feel free to vote there as well:

https://twitter.com/changelog/status/1125758373859876865

boyeln commented 5 years ago

@jerodsanto Good thinking with the poll! I also think a translate link would be a great addition.

I have given it some more thought, and I'm not that sure an English-only version is such a great idea after all. I believe that code and programming isn't (and shouldn't be) connected to one single spoken language or country, but is rather a global and international way of exchanging thoughts, ideas and solutions (and even problems). To create a product (Changelog Nightly English-only edition™) that removes a lot of important contributions to GitHub, and the programming community, just because they aren't in our desired language, is bad. We might say, or think, that by allowing users to choose to subscribe to an interlingual version, we include all repos of all languages, but I don't think this necessarily holds true. A lot of people would just subscribe to the English-only version, and never be exposed to non-English contributions. Hence, I believe we should strive towards helping people understand the non-English repos rather than just filtering them out. There might be some repos that aren't readable to everyone, but I believe the occasional "noise" is a fair price to pay for the diversity gained by including repos of all languages.

boyeln commented 5 years ago

An example of this is the attention 996.ICU got. It would be very bad to have a Changelog Nightly version with that filtered out. Although the main repo is in English, I believe a big part of Changelog Nightly is about picking up trends and upcoming repos (i.e. before they become popular and are translated to English), hence including non-English repos makes a lot of sense.

jerodsanto commented 5 years ago

@boyeborg I agree with you and will look into what it'll take to add translate links for non-English repos ✊

jaraddowning commented 5 years ago

I just wanted to add my $0.02. I came here to request the same as @boyeborg and have repos that are in languages that don't use Latin characters filtered out. I see valid points made already and think having a translation available would be the perfect solution (in theory), as the value of the repos I cannot read would not be completely lost. Thanks @jerodsanto for doing what you do. I've got some (read: very little) experience in translating and I'll see if I can help.

jerodsanto commented 5 years ago

Ok keep your 👀 peeled tonight for translate links next to non-English repos 🤞

boyeln commented 5 years ago

@jerodsanto: Seems like there are a lot of English repos with translation links. Not that this is a problem for me, since I just don't click the translate link if it's in English. Perhaps all repos should have translation links? That would make it easy for people who aren't fluent in English to translate English repos as well. It's quite easy to just change language once you are on the Google Translate website.

Another improvement could be to include the whole README in the translation, if this is possible? I tried to URL-encode the README of wepe/efficient-decision-tree-notes and paste the whole thing in the Google Translate URL. It seems to work fine, you can see the result here.

jerodsanto commented 5 years ago

Yeah, I noticed that the whatlanguage gem isn't really doing a great job at detecting English vs non-English. Passing the entire README to the translator is a cool idea, but my concern is that certain email clients (namely Gmail) will truncate emails that are too "long", which I believe they measure by total page weight. Stuffing 20ish READMEs into anchor tags might push us over that edge, which isn't something I'd like to deal with.

Having the translation link on every repo isn't a bad compromise, but maybe we need to move it to an icon or something less obtrusive in that case...
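To make the page-weight concern concrete, here is a minimal sketch that estimates how many bytes URL-encoded READMEs would add to the email's anchor tags. The function names are hypothetical, and the 102 KB figure is only the clipping threshold commonly cited for Gmail, used here as an assumption rather than a documented limit.

```python
from urllib.parse import quote

# Assumption: commonly cited size at which Gmail clips a message.
ASSUMED_CLIP_BYTES = 102 * 1024


def href_weight(readme: str) -> int:
    """Bytes a README occupies once percent-encoded into an href."""
    return len(quote(readme, safe="").encode("ascii"))


def total_link_weight(readmes: list[str]) -> int:
    """Sum of encoded sizes for every README embedded as a link."""
    return sum(href_weight(r) for r in readmes)


def would_risk_clipping(readmes: list[str], other_bytes: int = 0) -> bool:
    """True when links plus the rest of the email exceed the assumed limit."""
    return total_link_weight(readmes) + other_bytes > ASSUMED_CLIP_BYTES
```

Non-Latin text is the worst case here: each CJK character is three UTF-8 bytes, and every byte becomes a three-character `%XX` escape, so one character costs nine bytes in the link.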

boyeln commented 5 years ago

I see the problem with having emails that are too "heavy". Doesn't the fact that the URLs are shortened to something like https://email.changelog.com/t/... help?

I agree with having a symbol or icon. Perhaps something like this Font Awesome icon would suffice? It would be nice if it somehow also indicated that it was a URL. Perhaps the Google Translate logo would be a better option?

jerodsanto commented 5 years ago

That's a good point on Campaign Monitor rewriting those URLs for us. That definitely helps, but it assumes that we will have rewrites for those in perpetuity, which might not be 100% the case.

I like the idea of using the Google Translate logo, but curious what @codyjames thinks on the matter.

codyjames commented 5 years ago

Maybe a one-color version of the Google Translate logo if we can find a good one.

Also, here is what Noun Project has: https://thenounproject.com/search/?q=translate

jerodsanto commented 5 years ago

I like this one quite a bit:

https://thenounproject.com/search/?q=translate&i=1436334#

However, it is a bit hard to read when it's just a tiny icon like this

[Screenshot: 2019-05-30 at 12:12 PM]

codyjames commented 5 years ago

Yeah, I was worried about that with any of the staggered ones, including the Google Translate icon.

codyjames commented 5 years ago

Maybe something simple like this: https://thenounproject.com/search/?q=translate&i=41100

jerodsanto commented 5 years ago

How does this look?

[Screenshot: 2019-05-30 at 2:28 PM]

codyjames commented 5 years ago

The "A" looks a bit messed up and doesn't quite read as an "A". I think if we clean that up this would be fine.

jerodsanto commented 5 years ago

Okay I cleaned up both glyphs a bit so they're more clear:

[Screenshot: 2019-05-31 at 9:59 AM]

The title and alt attribute are set to 'Translate' so if you're not sure and hover your mouse long enough you'll get the picture.

Also kinda neat (I think): I previously had the fl (from language) param set to auto so that Google Translate will guess what the source is. In addition to that, I changed the tl (to language) param to also be auto (previously it was en) so that it will make its best guess of what language you want it translated to. It's hard to test how well that works since for me it always goes to English anyhow...

This might help non-English speakers quickly translate the descriptions to their own locale as well. 🤞
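For illustration, a pre-populated translate link with both languages set to auto could be built roughly as below. This is a sketch, not Nightly's actual code: the base URL, the `text` parameter, and the helper name are assumptions, and the source-language parameter is written as `sl`, the name commonly seen in Google Translate web URLs, even though the comment above calls it `fl`.

```python
from urllib.parse import urlencode

# Assumption: base URL of the Google Translate web interface.
TRANSLATE_BASE = "https://translate.google.com/"


def translate_url(text: str, source: str = "auto", target: str = "auto") -> str:
    """Build a link that opens the translator with `text` pre-filled."""
    query = urlencode({
        "sl": source,   # source language; "auto" asks the service to detect it
        "tl": target,   # target language; "auto" leaves the choice to the service
        "text": text,   # percent-encoded by urlencode
    })
    return f"{TRANSLATE_BASE}?{query}"
```

Passing only the short description keeps each link small, whereas passing a whole README would make it grow with the README's length.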

jerodsanto commented 5 years ago

I think I'm pretty happy with this solution for now. Closing ✊