Corpus Selection Algorithm Concept

This is potentially a good idea for a Corpus Selection Algorithm (CSA). A CSA is different from a Word Selection Algorithm (WSA). A WSA selects which problem will come up next. A CSA would decide which words to add to the learner's corpus of words to be learned. Without a CSA, a user builds that corpus by choosing word sets or by searching for new words.

Prior to reading through this issue, please consider its priority. This concept almost certainly cannot be implemented in the near future. However, the pieces necessary to implement it appear to be present, so it should receive some consideration.

Here is the concept: We will have users log in via Facebook (@sandeshkat figured out how to do this in 2013). If we have access to a user's Facebook postings, we could use that data to determine which L1 words the learner commonly uses. For example, I post (or message) about 'species' and 'plants' with my biology colleagues. If we were able to explore this data for each user, we could have the user learn L2 words in order of how frequently the user uses the corresponding L1 words.

A word's rank within the observed user corpus may be a better guide to which words to add next than its rank among the most common words in the language overall.
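
To make the idea concrete, here is a minimal sketch of the ranking step in Python. It is not an existing Openwords implementation. It assumes the user's posts have already been retrieved as plain strings (the Facebook retrieval itself is not shown), and the names `rank_l1_words`, `select_corpus`, the `l1_to_l2` dictionary, and the stopword list are all hypothetical, chosen only for illustration.

```python
import re
from collections import Counter


def rank_l1_words(posts, stopwords=frozenset()):
    """Return the L1 words found in `posts`, most frequently used first."""
    counts = Counter()
    for post in posts:
        # Simple lowercase tokenization; a real system would need
        # language-aware tokenization and lemmatization.
        for word in re.findall(r"[a-z']+", post.lower()):
            if word not in stopwords:
                counts[word] += 1
    # Sort by descending frequency; break ties alphabetically so the
    # ordering is deterministic.
    return sorted(counts, key=lambda w: (-counts[w], w))


def select_corpus(posts, l1_to_l2, known=frozenset(), limit=10,
                  stopwords=frozenset()):
    """Pick up to `limit` (L1, L2) pairs to add to the learner's corpus.

    `l1_to_l2` is a hypothetical L1 -> L2 dictionary, and `known` holds
    L2 words that are already in the learner's corpus.
    """
    selected = []
    for l1 in rank_l1_words(posts, stopwords):
        l2 = l1_to_l2.get(l1)
        if l2 is None or l2 in known:
            continue  # no translation available, or already being learned
        selected.append((l1, l2))
        if len(selected) == limit:
            break
    return selected


# Made-up example data: three posts by a biologist learning Spanish.
posts = [
    "New species of plants found near the river",
    "These plants are a rare species",
    "Plants need light",
]
l1_to_l2 = {"plants": "plantas", "species": "especies",
            "river": "río", "light": "luz"}
stop = {"of", "the", "are", "a", "these", "new", "near",
        "found", "need", "rare"}

ranked = select_corpus(posts, l1_to_l2, stopwords=stop, limit=3)
print(ranked)
```

In this sketch the CSA only decides which words enter the corpus; choosing which of those words to quiz next would remain the job of the WSA.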

Marc-Bogonovich / Openwords

Corpus Selection Algorithm Concept #67