Open GillesJ opened 2 days ago
Hello,
I think the problem is with special characters. In the image, this part did get highlighted: `ENGIE outlook for 2023 to 2025`, but the `:` right after it doesn't get highlighted because it's on a different line. It's similar to the other one where you have `####`.
Thank you, Abu
Comment by Abubakar Saad
Describe the bug
I use a sentence splitter to divide the plaintext I want to annotate into regions, so that choices can be annotated per region (my corpus is multilingual, so I need language-specific sentence splitting for good results).
For nearly every document there are intermittent regions which are not displayed and are not selectable. The only consistently shared property of the missing regions is that they contain newlines `\n` and/or punctuation. However, similar regions containing newlines internally or at boundaries are displayed correctly.
`data.text` and newlines `\n` count as 1 char length.
`\n\n` at region boundaries or inside regions. But this does not seem to be the case, because many regions that have multiple newlines internally or at boundaries are displayed correctly.
To Reproduce
Steps to reproduce the behavior:
Labeling Interface code:
Generate text regions in the format below. Note that the `value.text` key is not present in my production files; it is included here for illustrative purposes (the problem still occurs with `value.text` omitted from the task JSON).
I attached an example of a full JSON task file with many missing regions: example-missing-regions.json
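To rule out offset errors in a task file like the one attached, a check along these lines may help. It is only a sketch: it assumes each region is a dict with `value.start` and `value.end` (plus an optional `value.text`) indexing into the task's `data.text` string, and the helper names `check_regions`, `regions_with_newlines` and `make_region` are hypothetical, not part of any library. Python string indices count each `\n` as one character, which matches the offset convention described above.

```python
def check_regions(text, regions):
    """Return (index, problem) pairs for regions whose offsets look wrong."""
    problems = []
    for i, region in enumerate(regions):
        value = region["value"]  # assumed layout: {"start": int, "end": int, "text": str?}
        start, end = value["start"], value["end"]
        if not (0 <= start < end <= len(text)):
            problems.append((i, "offsets out of range"))
            continue
        expected = value.get("text")  # optional debugging key
        if expected is not None and text[start:end] != expected:
            problems.append((i, "slice does not match value.text"))
    return problems


def regions_with_newlines(text, regions):
    """Return indices of regions whose covered text contains a newline."""
    return [
        i
        for i, region in enumerate(regions)
        if "\n" in text[region["value"]["start"]:region["value"]["end"]]
    ]


def make_region(text, snippet):
    """Build a toy region covering the first occurrence of snippet in text."""
    start = text.index(snippet)
    return {"value": {"start": start, "end": start + len(snippet), "text": snippet}}


# Toy data, not taken from the attached files.
text = "ENGIE outlook for 2023 to 2025:\n\nFirst sentence.\nSecond one."
regions = [
    make_region(text, "ENGIE outlook for 2023 to 2025:"),
    make_region(text, "2025:\n\nFirst"),
    make_region(text, "Second one."),
]

print(check_regions(text, regions))
print(regions_with_newlines(text, regions))
```

If `check_regions` reports nothing for a task whose regions are still missing in the interface, the offsets themselves are consistent with `data.text`, and the list from `regions_with_newlines` can be compared against the regions that fail to display.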
Here is the full example with `value.text` on the regions for debugging: example-with-text-missing-regions.json
Expected behavior
Text regions are highlighted and shown correctly.
Environment (please complete the following information):