sashafrey / topicmod

This project has been moved to https://github.com/bigartm/bigartm

Speedup merger and support several mergers per instance #82

Open sashafrey opened 10 years ago

sashafrey commented 10 years ago

We don't have enough data yet to say for sure, but the Merger is likely to be a bottleneck in some important scenarios. For example:

set inner_iterations_count = 1, or
have 16 concurrent processors, or
merging increments on master_component in network mode.

We should design and implement an option to run multiple merger threads per instance.
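One possible shape for this, as a rough sketch only: split each increment by token id so that every merger thread owns a disjoint set of rows and never needs a lock on the shared matrix. Everything below (`ShardedMerger`, `submit`, `row`, the `token_id % num_mergers` partitioning) is hypothetical and not taken from the existing code. It is written in Python for brevity, so it illustrates the partitioning rather than real parallel speedup.

```python
import queue
import threading
from collections import defaultdict


class ShardedMerger:
    """Hypothetical sketch: several merger threads, each owning one shard of tokens.

    A token with id t always goes to shard t % num_mergers, so no two threads
    ever update the same row and the per-shard tables need no locking.
    """

    def __init__(self, num_mergers=4, num_topics=3):
        self.num_mergers = num_mergers
        self.num_topics = num_topics
        self.queues = [queue.Queue() for _ in range(num_mergers)]
        # One private token-topic table per merger thread.
        self.shards = [
            defaultdict(lambda: [0.0] * num_topics) for _ in range(num_mergers)
        ]
        self.threads = [
            threading.Thread(target=self._run, args=(i,), daemon=True)
            for i in range(num_mergers)
        ]
        for thread in self.threads:
            thread.start()

    def submit(self, increment):
        """Split an increment (token_id -> per-topic counts) across the shards."""
        parts = [{} for _ in range(self.num_mergers)]
        for token_id, row in increment.items():
            parts[token_id % self.num_mergers][token_id] = row
        for shard_queue, part in zip(self.queues, parts):
            if part:
                shard_queue.put(part)

    def _run(self, shard_index):
        shard_queue = self.queues[shard_index]
        table = self.shards[shard_index]
        while True:
            part = shard_queue.get()
            if part is None:  # sentinel: no more increments for this shard
                break
            for token_id, row in part.items():
                target = table[token_id]
                for topic, value in enumerate(row):
                    target[topic] += value

    def close(self):
        """Drain all queues and stop the merger threads."""
        for shard_queue in self.queues:
            shard_queue.put(None)
        for thread in self.threads:
            thread.join()

    def row(self, token_id):
        """Return a copy of the merged counts for one token (zeros if unseen)."""
        table = self.shards[token_id % self.num_mergers]
        return list(table.get(token_id, [0.0] * self.num_topics))
```

Processors would call `submit` with each increment, and `close` would be called before the merged rows are read. Whether sharding by token is the right split, as opposed to sharding by topic or by model, is exactly the kind of question the profiling below should answer.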

In parallel, we should investigate what the current bottlenecks in the merger are:

Is it expensive to always look up each token in the token_to_tokenid map?
Is it expensive to always send the entire token-topic matrix from the nodes to the master? Most likely yes, so each node should build its model increment carefully and send only what the master needs.
Is it now time to look at the sparsity of the Phi matrix? Exploiting it should improve the throughput of the Merger component.
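To make the last three questions concrete, here is a minimal sketch of what a cheaper increment could look like. It resolves token ids once per batch instead of once per matrix entry, and it ships only the non-zero entries. The function names, the `(token_id, topic, value)` triple layout and the `eps` threshold are assumptions made for illustration, not the existing increment format.

```python
def resolve_token_ids(batch_tokens, token_to_token_id):
    """Look up (or assign) the global id of each token once per batch.

    The returned list maps local token index -> global token id, so later
    code can index by position instead of hitting the map for every entry.
    """
    return [
        token_to_token_id.setdefault(token, len(token_to_token_id))
        for token in batch_tokens
    ]


def make_sparse_increment(dense_rows, token_ids, eps=0.0):
    """Keep only entries whose magnitude exceeds eps.

    dense_rows[i][k] is the count for local token i and topic k.
    Returns a list of (token_id, topic, value) triples.
    """
    return [
        (token_ids[i], topic, value)
        for i, row in enumerate(dense_rows)
        for topic, value in enumerate(row)
        if abs(value) > eps
    ]


def apply_sparse_increment(phi, increment, num_topics):
    """Add a sparse increment into phi (token_id -> per-topic counts) in place."""
    for token_id, topic, value in increment:
        row = phi.setdefault(token_id, [0.0] * num_topics)
        row[topic] += value


# Usage: a 3-token, 3-topic batch with only two non-zero entries.
vocab = {"cat": 0}
ids = resolve_token_ids(["dog", "cat", "fish"], vocab)
dense = [[0.0, 2.0, 0.0], [0.0, 0.0, 0.0], [1.5, 0.0, 0.0]]
increment = make_sparse_increment(dense, ids)
phi = {}
apply_sparse_increment(phi, increment, num_topics=3)
```

In this toy batch the dense form carries nine values and the sparse form carries two triples. The saving on real data depends on how sparse the increments actually are, which we would need to measure before committing to this layout.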