allenai / mmda

multimodal document analysis
Apache License 2.0

Kylel/2022 12/debug dwp #183

Closed by kyleclo 1 year ago

kyleclo commented 1 year ago

Resolve this issue:

  File "/usr/local/lib/python3.8/site-packages/mmda/predictors/heuristic_predictors/dictionary_word_predictor.py", line 149, in predict
    token_id_to_word_id, word_id_to_text = self._predict_tokens(
  File "/usr/local/lib/python3.8/site-packages/mmda/predictors/heuristic_predictors/dictionary_word_predictor.py", line 402, in _predict_tokens
    assert None not in token_id_to_word_id.values()
AssertionError

In short, singleton tokens with a hyphen (-) were breaking the row logic, which left some tokens without a word ID and tripped the assertion. Those tokens are now classified correctly.
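
For readers unfamiliar with the invariant behind the assertion in the traceback, here is a minimal sketch of the idea. It is not the code in `dictionary_word_predictor.py`: the function `assign_word_ids`, its row-of-strings input, and the merging rule are all hypothetical and only illustrate how a lone "-" token can be given its own word ID so that no token maps to `None`.

```python
from typing import Dict, List, Tuple


def assign_word_ids(rows: List[List[str]]) -> Tuple[Dict[int, int], Dict[int, str]]:
    """Toy stand-in (hypothetical, not mmda's implementation).

    A token ending in "-" at the end of a row is merged with the first token
    of the next row. A token that is only "-" is kept as its own word rather
    than being treated as the start of a hyphenated word.
    """
    token_id_to_word_id: Dict[int, int] = {}
    word_id_to_text: Dict[int, str] = {}
    token_id = 0
    word_id = 0
    pending = None  # (word_id, text_prefix) carried over from a hyphenated row end

    for row in rows:
        for i, token in enumerate(row):
            is_last = i == len(row) - 1
            if pending is not None and i == 0:
                # First token of the row completes the word started on the previous row.
                carried_word_id, prefix = pending
                token_id_to_word_id[token_id] = carried_word_id
                word_id_to_text[carried_word_id] = prefix + token
                pending = None
            elif is_last and token.endswith("-") and token != "-":
                # Row ends mid-word: record the token now, finish the text on the next row.
                pending = (word_id, token[:-1])
                token_id_to_word_id[token_id] = word_id
                word_id_to_text[word_id] = token  # provisional text
                word_id += 1
            else:
                # Ordinary token, including a singleton "-": one token, one word.
                token_id_to_word_id[token_id] = word_id
                word_id_to_text[word_id] = token
                word_id += 1
            token_id += 1

    return token_id_to_word_id, word_id_to_text


rows = [["a", "multi-"], ["modal", "docs"], ["-"], ["end"]]
token_id_to_word_id, word_id_to_text = assign_word_ids(rows)

# The same invariant the traceback's assertion checks: every token has a word.
assert None not in token_id_to_word_id.values()
```

In this sketch the row containing only "-" falls through to the ordinary branch, so it never starts a dangling hyphenated word that the next row would have to complete.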