Purpose of implementation request:
To learn more JavaScript and more design patterns in software development, and
to contribute Iteration 2 of a general-purpose tool for field linguists.
When implementing the request, please focus on these
steps/functions/components:
* Turn the word list from an array into a word count map, keep it sorted by frequency, and increment each word's count for every new document it scans (effectively learning from the documents). (3 hours)
* Start storing the unknown words in a second word count map so that the script can start learning new words: once an unknown word's count reaches a certain threshold, simply add it to the list of known words. (2 hours)
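
The two steps above could be sketched roughly as follows. This is only a minimal sketch, not the tool's actual code: every name here (`toCountMap`, `tokenize`, `scanDocument`, `sortedByFrequency`, `LEARN_THRESHOLD`) is hypothetical, the threshold of 3 is an arbitrary placeholder, and it assumes a JavaScript engine with the built-in `Map` and Unicode property escapes in regular expressions.

```javascript
// Hypothetical threshold: an unknown word seen this many times becomes known.
const LEARN_THRESHOLD = 3;

// Convert the existing array of known words into a word count map.
function toCountMap(words) {
  const counts = new Map();
  for (const word of words) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

// Very rough tokenizer (an assumption): lowercase the text, then split on
// anything that is not a letter, a combining mark, or an apostrophe.
function tokenize(text) {
  return text.toLowerCase().split(/[^\p{L}\p{M}']+/u).filter(Boolean);
}

// Scan one document: bump counts of known words, count unknown words
// separately, and promote an unknown word once it reaches the threshold.
function scanDocument(text, known, unknown) {
  for (const word of tokenize(text)) {
    if (known.has(word)) {
      known.set(word, known.get(word) + 1);
      continue;
    }
    const seen = (unknown.get(word) || 0) + 1;
    if (seen >= LEARN_THRESHOLD) {
      unknown.delete(word);
      known.set(word, seen);
    } else {
      unknown.set(word, seen);
    }
  }
}

// Return [word, count] pairs, most frequent first.
function sortedByFrequency(counts) {
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Usage: start from a tiny word list and scan one document.
const known = toCountMap(["the", "cat"]);
const unknown = new Map();
scanDocument("The dog saw the cat. The dog ran; the dog slept.", known, unknown);
console.log(sortedByFrequency(known));
```

Sorting on demand, rather than keeping the map sorted after every update, keeps each scan cheap; whether that is fast enough for real documents is something to measure.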
When implementing the request, watch out for the following potential
hiccups (security, lack of access, lack of data, formatting, etc.):
* How to do maps in JavaScript.
* Watch out if it starts running too slowly; you might have to ask a software engineer for help optimizing your script (JavaScript can be slow at regular expressions). Make a note of slow sections, as they might need to be refactored into an external Java library.
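
One way to make a note of slow sections is to wrap each one in a small timing helper. The sketch below is only an illustration: `timeSection` and the `timings` array are hypothetical names, and `Date.now()` only gives millisecond resolution, so it is meant for spotting sections that are obviously slow, not for precise benchmarking.

```javascript
// Run fn, record how long it took under the given label, and return its result.
function timeSection(label, fn, timings) {
  const start = Date.now();
  const result = fn();
  timings.push({ label: label, ms: Date.now() - start });
  return result;
}

// Usage: time a regular-expression split over a made-up document.
const timings = [];
const sampleText = "the dog saw the cat ".repeat(10000);
const tokens = timeSection("tokenize", function () {
  return sampleText.split(/\s+/).filter(Boolean);
}, timings);

// List the recorded sections, slowest first.
timings.sort((a, b) => b.ms - a.ms);
console.log(tokens.length, timings);
```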
Expected next steps: see Iteration 3 - Adding rules to "recognize" more words
Original issue reported on code.google.com by gina.c.c...@gmail.com on 25 Nov 2011 at 6:55