sashafrey closed 10 years ago
Fixed in https://github.com/sashafrey/topicmod/tree/alfrey_auto_discover_tokens
New behaviour: if the processor observes a token that is not part of the token-topic matrix, it stores the token in a list of newly "discovered" tokens and returns this list as part of the processor output. The merger picks up all such tokens and initializes a new row in the token-topic matrix for each of them. As a result, the dictionary is gathered automatically during the first scan over the collection.
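A rough sketch of this flow in Python, for illustration only; it is not the code from the branch. The names `TopicModel`, `process_batch`, `merge` and the `discovered_tokens` key are hypothetical, and initializing new rows with random weights is an assumption, since the comment does not say how rows are initialized.

```python
import random


class TopicModel:
    """Toy token-topic matrix: one row of topic weights per known token."""

    def __init__(self, num_topics, seed=0):
        self.num_topics = num_topics
        self.rows = {}  # token -> list of per-topic weights
        self._rng = random.Random(seed)

    def has_token(self, token):
        return token in self.rows

    def add_token(self, token):
        # Assumption: a new row starts with random weights.
        if token not in self.rows:
            self.rows[token] = [self._rng.random() for _ in range(self.num_topics)]


def process_batch(model, documents):
    """Processor step: collect tokens that the model does not know yet."""
    discovered = []
    seen = set()
    for doc in documents:
        for token in doc:
            if not model.has_token(token) and token not in seen:
                seen.add(token)
                discovered.append(token)
    # The list of discovered tokens travels with the rest of the output.
    return {"discovered_tokens": discovered}


def merge(model, processor_output):
    """Merger step: add one row per discovered token."""
    for token in processor_output["discovered_tokens"]:
        model.add_token(token)


model = TopicModel(num_topics=3)
batch = [["cat", "dog"], ["dog", "fish"]]

first_pass = process_batch(model, batch)   # model is empty, so every token is new
merge(model, first_pass)
second_pass = process_batch(model, batch)  # all tokens are known by now
```

After the first pass the model holds a row for each of the three distinct tokens, and the second pass over the same batch discovers nothing new.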
Currently the merger resets the whole model whenever it sees a new generation. There are many better ways to do this.