Closed MartinKl closed 6 years ago
refex-001 à propos du refex-001 cadavre refex-001 qu' refex-001 on refex-001 a refex-001 retrouvé refex-001 au refex-001 quatrième refex-001 étage
This is a perfect example of why a minimal segmentation could be irritating. It might be useful to look at people's search habits and stick to how things are usually done in ANNIS. But even if we had a minimal segmentation that splits "à propos du" into "à", "propos" and "du", we still would not be able to derive from the data that the referring expression span should only cover "du" (which is, by the way, still "too much").
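To make the problem concrete, here is a minimal Python sketch. It assumes the example line can be read as (span label, token) pairs; the `rows` list and the `minimal_segments` helper are hypothetical and not part of any existing importer. It only shows that splitting on whitespace makes every piece inherit the label, since the data gives no hint which piece the span should start at.

```python
# Hypothetical (label, token) pairs mirroring the start of the example line.
rows = [
    ("refex-001", "à propos du"),
    ("refex-001", "cadavre"),
    ("refex-001", "qu'"),
]


def minimal_segments(rows):
    """Split every token on whitespace; each piece inherits the token's label."""
    out = []
    for label, token in rows:
        for piece in token.split():
            out.append((label, piece))
    return out


segments = minimal_segments(rows)
for label, piece in segments:
    print(label, piece)
```

With this naive splitting, "à" and "propos" end up inside refex-001 just like "du", so the span is no more precise than before.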
See also #1
Related to #2: it is important to know whether the markable annotations are ALWAYS provided in linear order, e.g. whether the annotations of a token consisting of multiple subtokens ("à propos du"), namely "à propos de" (...) and "le" (det), are also listed in linear order (in the example, first the a-propos-de annotation and then the det annotation). If this is not guaranteed, there might be some difficulties.
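A small sketch of why the order matters, assuming an importer would pair underlying units and annotations purely by position. The `align_by_order` helper, the `units` list, and the "a-propos-de" label are placeholders for illustration (the actual tag is not given above).

```python
def align_by_order(units, annotations):
    """Pair the i-th underlying unit with the i-th annotation (order-based)."""
    if len(units) != len(annotations):
        raise ValueError("unit/annotation count mismatch")
    return list(zip(units, annotations))


# Hypothetical underlying units of the token "à propos du".
units = ["à propos de", "le"]

in_order = ["a-propos-de", "det"]  # annotations listed in linear order
swapped = ["det", "a-propos-de"]   # same annotations, order not guaranteed

ok = align_by_order(units, in_order)
bad = align_by_order(units, swapped)
print(ok)
print(bad)
```

If the order is not guaranteed, positional pairing silently attaches "det" to "à propos de", so some other key (for example, explicit offsets) would be needed to align the annotations.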