qe-team / marmot

MARMOT - the open source framework for feature extraction and machine learning, designed to estimate the quality of Machine Translation output
ISC License

parallelization is extremely slow for sequence models #20

Open chrishokamp opened 9 years ago

chrishokamp commented 9 years ago

The parallelization in preprocessing_utils is extremely slow for the sequence representation, because the parallelized function gets called once for every sequence. For this case the parallelization code is wrong: we should be parallelizing across sequences, not across the contexts within a single sequence.
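A rough sketch of the difference, not MARMOT's actual code: `contexts_for_sequence` and `process_sequences` are hypothetical names, and the per-token context dict is a stand-in for whatever preprocessing_utils really builds. The point is only that the pool is created once for the whole dataset and each task is an entire sequence, so the worker start-up cost is not paid again for every sequence.

```python
from multiprocessing import Pool


def contexts_for_sequence(sequence):
    """Hypothetical per-sequence work: build one context dict per token."""
    return [
        {"index": i, "token": token, "sequence_length": len(sequence)}
        for i, token in enumerate(sequence)
    ]


def process_sequences(sequences, func=contexts_for_sequence, workers=1, chunksize=64):
    """Apply func to every sequence, creating at most one pool per dataset."""
    if workers <= 1:
        # Serial path: no process start-up or pickling overhead at all.
        return [func(sequence) for sequence in sequences]
    # One pool for all sequences; each task is a whole sequence, and
    # chunksize batches several sequences into a single message to a worker.
    with Pool(processes=workers) as pool:
        return pool.map(func, sequences, chunksize=chunksize)
```

Something like `process_sequences(data, workers=4)` would then spread whole sequences over four processes. `func` has to be a module-level function so it can be pickled, and for short sequences the serial path may still be the faster one.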

chrishokamp commented 9 years ago

The function gets called via call_for_each_element.

chrishokamp commented 9 years ago

Using 1 worker is currently 100x faster.