Compare current performance with an alternative vocab generation method that does not require re-associating the technical words with their origins (comments, messages, etc.). The alternative method aggregates text fields into groups by developer and code review, then runs these groups through the NLTK scripts one by one and defines their relationships.
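A minimal sketch of the grouping step, to make the comparison concrete. It assumes each record is a dict with hypothetical `developer`, `review_id` and `text` fields, and that a group is keyed by the (developer, code review) pair; the real schema and grouping may differ. `extract_vocab` is only a stand-in for the existing NLTK scripts, not their actual logic.

```python
from collections import defaultdict


def group_text_fields(records):
    """Aggregate text fields per (developer, review_id) pair (assumed grouping)."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["developer"], rec["review_id"])
        groups[key].append(rec["text"])
    # Join each group's texts into one blob so it can be processed in one pass.
    return {key: "\n".join(texts) for key, texts in groups.items()}


def extract_vocab(text):
    """Stand-in for the NLTK step: unique lowercase whitespace tokens, sorted."""
    return sorted({token.lower() for token in text.split()})


def vocab_by_group(records):
    """Process groups one by one; the group key already records the origin."""
    grouped = group_text_fields(records)
    return {key: extract_vocab(text) for key, text in grouped.items()}
```

Because the vocabulary is produced per group, each word is already tied to its developer and code review, so no separate re-association pass is needed.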
Possible Issues:
IO - because Python and Ruby communicate through files, and each group would be passed to the NLTK scripts separately, this change would significantly increase the number of reads and writes.
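To illustrate the IO concern, the sketch below contrasts writing one exchange file per group with writing all groups to a single file. The file names, the JSON layout and both helper functions are assumptions made for illustration; they are not the existing Python/Ruby exchange format.

```python
import json
from pathlib import Path


def write_group_files(groups, out_dir):
    """Write one JSON file per group; the file count grows with the group count."""
    out_dir = Path(out_dir)
    paths = []
    for index, (key, text) in enumerate(sorted(groups.items())):
        path = out_dir / f"group_{index}.json"  # hypothetical naming scheme
        payload = {"key": list(key), "text": text}
        path.write_text(json.dumps(payload), encoding="utf-8")
        paths.append(path)
    return paths


def write_batched_file(groups, out_dir):
    """Write every group to a single JSON Lines file, one group per line."""
    path = Path(out_dir) / "groups.jsonl"  # hypothetical file name
    with path.open("w", encoding="utf-8") as handle:
        for key, text in sorted(groups.items()):
            handle.write(json.dumps({"key": list(key), "text": text}) + "\n")
    return path
```

With N groups the first function produces N files to be read on the other side, while the second produces one. Whether batching is workable depends on how the Ruby side consumes the output, which these notes do not specify.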
Advantages: