qe-team / marmot

MARMOT - the open source framework for feature extraction and machine learning, designed to estimate the quality of Machine Translation output
ISC License
21 stars 7 forks source link

the alignment feature extractor should throw an error if it gets a context object without both source and target fields #4

Closed chrishokamp closed 9 years ago

chrishokamp commented 9 years ago

if we include the alignment feature extractor in the config for a context creator which doesn't have both source and target fields, we'll get this error:

...
File "/home/chris/projects/quality_estimation/marmot/marmot/util/alignments.py", line 54, in align_sentence
    align_str = aligner.align(' '.join(src_line)+u' ||| '+' '.join(tg_line))
TypeError: can only join an iterable

This is because one or both of the fields [src_line, tg_line] is None, so the context object doesn't have it. The feature extraction should fail at this point, because it means that the user is trying to extract an alignment from an object that doesn't have both fields.

chrishokamp commented 9 years ago

you can reproduce this error:

cd marmot/experiment
nosy

One test will fail because of this error.

varvara-l commented 9 years ago

I've fixed it, will push soon. Maybe we should check if all contexts have the same set of features, and that all feature extractors we want to call have all data they need. If they don't we can just drop some feature extractors and print a warning (instead of raising an error or printing a warning for every example).

chrishokamp commented 9 years ago

Great! I think the experiment should fail completely if the user is asking for feature extractors that are incompatible with their data. Otherwise they may not understand why features are being dropped (a warning isn't useful if they're using marmot as a component in another application).

Maybe we should check if all contexts have the same set of features, and that all feature extractors we want to call have all data they need.

Yes we should check if all feature extractors are compatible with all contexts, and if they're not, we should error out (with a good pointer on how to fix the problem).

On Thu, Jan 15, 2015 at 3:03 PM, varvara-l notifications@github.com wrote:

I've fixed it, will push soon. Maybe we should check if all contexts have the same set of features, and that all feature extractors we want to call have all data they need. If they don't we can just drop some feature extractors and print a warning (instead of raising an error or printing a warning for every example).

— Reply to this email directly or view it on GitHub https://github.com/qe-team/marmot/issues/4#issuecomment-70098212.

varvara-l commented 9 years ago

Done.