SteamedPears / CodeReviewClientMaster

Language Auto-detection #18

Open spratt opened 12 years ago

spratt commented 12 years ago

As the user enters code in the code submission window, we should make a reasonable guess at which language they are using, but still let the user pick their language, and stop trying to auto-detect once they've chosen.

Gankra commented 12 years ago

This doesn't seem tractable, as the ubiquity of "C-like" as a catch-all category suggests. Also, if the code contains a string with another language in it, how would you possibly resolve that? Is it HTML with embedded JavaScript, or JavaScript with embedded HTML?

spratt commented 12 years ago

Our first idea is a Bayesian classifier. Basically, build a score for each language, maybe based on the number of keywords matched, and calculate the probability of each language from those scores. Once one language's probability passes a certain threshold, make that guess.
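
To make the idea concrete, here is a rough sketch of a keyword-based naive Bayes guesser, written in Python for brevity rather than in the client's own language. Everything in it is an assumption for illustration: the keyword lists, the `hit` and `miss` likelihoods, the 0.9 threshold, and the function names `language_probabilities` and `guess_language` are all hypothetical, and a real version would want likelihoods learned from sample code.

```python
import math
import re

# Hypothetical keyword lists, chosen only for illustration.
KEYWORDS = {
    "python": {"def", "import", "elif", "lambda", "self", "None"},
    "javascript": {"function", "var", "let", "const", "typeof", "undefined"},
    "c": {"int", "char", "void", "struct", "include", "printf"},
}


def language_probabilities(code, hit=0.2, miss=0.01):
    """Return P(language | tokens) under naive Bayes with a uniform prior.

    `hit` and `miss` are assumed per-token likelihoods: how likely a token is
    under a language when it is, or is not, one of that language's keywords.
    """
    tokens = re.findall(r"[A-Za-z_]\w*", code)
    log_scores = {}
    for lang, words in KEYWORDS.items():
        # Tokens are treated as independent, so log-likelihoods add up.
        log_scores[lang] = sum(
            math.log(hit if tok in words else miss) for tok in tokens
        )
    # Normalise in a numerically stable way so the values sum to 1.
    top = max(log_scores.values())
    weights = {lang: math.exp(s - top) for lang, s in log_scores.items()}
    total = sum(weights.values())
    return {lang: w / total for lang, w in weights.items()}


def guess_language(code, threshold=0.9, user_choice=None):
    """Return a language name, or None if no guess is confident enough."""
    if user_choice is not None:
        # The user has picked a language, so stop auto-detecting.
        return user_choice
    probs = language_probabilities(code)
    best = max(probs, key=probs.get)
    return best if probs[best] >= threshold else None


if __name__ == "__main__":
    print(guess_language("def f(self):\n    import os"))
    print(guess_language("int x; var y;"))
```

With these made-up numbers, a snippet that matches keywords from two languages equally (such as `int x; var y;`) stays below the threshold and yields no guess, which is one way to handle the mixed-language case: keep asking the user rather than guessing wrong.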