Closed vmonakhov closed 1 month ago
Here we get annotations for composite words in the timarkh_uniparser function: https://github.com/ispras/lingvodoc/blob/faf43c03332934e3b6d8f8062bb81a95b9aad026/lingvodoc/utils/doc_parser.py#L111 After that, we find the composite words whose annotations came back empty, split them into simple words, and try to get annotations for the new set once more. So it looks like we don't store the non-empty annotations for composite words, which seems like a logic error.
Actually, all the words are processed, but composite words are processed twice. This is not a bug, just an inefficiency.
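
To make the two-pass flow easier to follow, here is a minimal sketch of it. This is not the code from doc_parser.py: `annotate_two_pass`, `toy_analyze`, `LEXICON` and the made-up tokens are hypothetical. It also assumes that a composite word is one containing a hyphen and that the annotations of the parts are stored under the parts themselves.

```python
from typing import Callable, Dict, List


def annotate_two_pass(
    words: List[str],
    analyze: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Annotate every word, then retry empty composite words part by part."""
    # Pass 1: every word, simple or composite, is analyzed as a whole.
    annotations = {word: analyze(word) for word in words}

    # Pass 2: only composite words that came back empty are split and retried.
    # Assumption: "composite" means "contains a hyphen".
    for word in words:
        if "-" in word and not annotations[word]:
            for part in word.split("-"):
                if part and part not in annotations:
                    annotations[part] = analyze(part)
    return annotations


# Toy stand-in for the real analyzer (hypothetical data, not real parser output).
LEXICON = {"foo": ["N"], "bar": ["V"], "foo-bar": ["N+V"]}
calls: List[str] = []


def toy_analyze(word: str) -> List[str]:
    calls.append(word)  # record every lookup to show what is analyzed and when
    return LEXICON.get(word, [])


result = annotate_two_pass(["foo", "foo-bar", "bar-baz"], toy_analyze)
```

In this sketch "foo-bar" keeps the non-empty annotation from the first pass, while "bar-baz" is analyzed once as a whole and then again as "bar" and "baz", which is the double processing mentioned above.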