Concern re: first suggested exercise

ophicleide / ophicleide.github.io

Documentation and related files for the Ophicleide project

https://ophicleide.github.io


Concern re: first suggested exercise #2

Open willb opened 7 years ago

willb commented 7 years ago

I don't think you'd actually get a lot of benefit from using Spark's prediction and lookup routines with a word2vec model; maybe I'll file a PR when I think of a better alternative exercise!

elmiko commented 7 years ago

gotcha, i just kinda went "off the dome" for that one. i'm open to a better idea though.

rnowling commented 7 years ago

Hey guys, hope all is well!

Have you considered building a model for language detection (either human or programming languages)? You could train the model on Twitter or Wikipedia data, both of which are labeled by language, or maybe on projects mined from GitHub. Users could then submit some text and get a predicted language back.
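To make the idea concrete, here is a minimal sketch of one way such a detector could work: a character-trigram naive Bayes classifier in plain Python. Nothing here comes from the Ophicleide code base. The `LanguageDetector` class, the `char_ngrams` helper, and the four toy training sentences are all hypothetical stand-ins for a real model trained on labeled Twitter or Wikipedia text.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class LanguageDetector:
    """Hypothetical toy classifier: naive Bayes over character n-grams."""

    def __init__(self, n=3):
        self.n = n
        self.counts = defaultdict(Counter)  # language -> n-gram counts
        self.vocab = set()                  # every n-gram seen in training

    def train(self, samples):
        """samples is an iterable of (text, language_label) pairs."""
        for text, lang in samples:
            grams = char_ngrams(text, self.n)
            self.counts[lang].update(grams)
            self.vocab.update(grams)

    def predict(self, text):
        """Return the label with the highest add-one-smoothed log-likelihood."""
        grams = char_ngrams(text, self.n)
        best_lang, best_score = None, -math.inf
        for lang, counts in self.counts.items():
            total = sum(counts.values()) + len(self.vocab)
            score = sum(math.log((counts[g] + 1) / total) for g in grams)
            if score > best_score:
                best_lang, best_score = lang, score
        return best_lang


# Toy training data (an assumption for illustration only); a real demo
# would use a much larger labeled corpus.
samples = [
    ("the quick brown fox jumps over the lazy dog", "en"),
    ("this is a short sentence written in english", "en"),
    ("el rápido zorro marrón salta sobre el perro perezoso", "es"),
    ("esta es una frase corta escrita en español", "es"),
]

detector = LanguageDetector()
detector.train(samples)
print(detector.predict("the dog is in the house"))
print(detector.predict("el perro está en la casa"))
```

The same shape (train on labeled text, then serve a `predict` call for submitted text) should carry over to a Spark-based pipeline, though the feature extraction and model would likely differ there.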

willb commented 7 years ago

Thanks, RJ! This is a cool idea and would make a good (and relatively self-contained) demo.