Open willb opened 7 years ago
gotcha, i just kinda went "off the dome" for that one. i'm open to a better idea though.
Hey guys, hope all is well!
Have you considered building a model for language detection (either human or programming languages)? You could train it on Twitter or Wikipedia data, both of which are labeled by language, or on projects mined from GitHub. Users could then submit some text and get a predicted language back.
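For a rough sense of scale, here is a minimal sketch of such a detector. It assumes character-trigram counts with add-one smoothing, which is one common approach and not something this thread settled on. The two toy sentences stand in for the Twitter or Wikipedia data mentioned above, and the names `char_ngrams`, `train`, and `predict` are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of the lowercased, space-padded text."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(labeled_texts, n=3):
    """Build one n-gram count table per language from (language, text) pairs."""
    profiles = {}
    for language, text in labeled_texts:
        profiles.setdefault(language, Counter()).update(char_ngrams(text, n))
    return profiles


def predict(profiles, text, n=3):
    """Return the language whose profile gives the text the highest
    add-one-smoothed log-likelihood."""
    # Shared vocabulary, so unseen n-grams get a small nonzero probability.
    vocab = set()
    for counts in profiles.values():
        vocab.update(counts)

    grams = char_ngrams(text, n)
    best_language, best_score = None, -math.inf
    for language, counts in profiles.items():
        total = sum(counts.values()) + len(vocab) + 1
        score = sum(math.log((counts[g] + 1) / total) for g in grams)
        if score > best_score:
            best_language, best_score = language, score
    return best_language


# Toy training data (an assumption for illustration, not a real corpus).
TOY_DATA = [
    ("en", "the quick brown fox jumps over the lazy dog and the cat"),
    ("es", "el rápido zorro marrón salta sobre el perro perezoso y el gato"),
]

profiles = train(TOY_DATA)
print(predict(profiles, "the dog and the cat"))
print(predict(profiles, "el perro y el gato"))
```

The same `(language, text)` input shape would work for programming languages if the training pairs came from labeled source files, though a real demo would need far more data than this.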
Thanks, RJ! This is a cool idea and would make a good (and relatively self-contained) demo.
I don't think you'd actually get a lot of benefit from using Spark's prediction and lookup routines with a word2vec model; maybe I'll file a PR when I think of a better alternative exercise!