cboulanger / excite-docker

Docker image with tools for the annotation of ML training docs for reference extraction based on the EXparser tools
https://cboulanger.github.io/excite-docker
GNU General Public License v3.0

Tag-Fragments during Segmentation #10

Open FabianReinold opened 2 years ago

FabianReinold commented 2 years ago

Dear contributors, I encountered two cases of tag fragments during segmentation that I could not edit, delete, or otherwise interact with.

  1. After Auto-segmentation runs, it tends to leave a fragment of the Last Page tag when the reference gives its pages in the format “S. Start Page – “. The fragment appears after the hyphen.

  2. Sometimes, when I use the Correct selected text function to add new characters and part of the selected text is already tagged, a fragment of that tag is left next to the newly added characters. I have seen this several times but unfortunately could not find a reliable way to reproduce it.

You can find examples of these two cases in the screenshots I added.

[Screenshots: Tag-FragmentLastPage, Tag-FragmentTextEdit]

cboulanger commented 2 years ago

Thanks for reporting. This is an annoying bug that has to do with the complexities of HTML encoding and the Selection Web API. Needs to be fixed, but it is not easy.
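
As a stopgap until the root cause is addressed, one option might be a cleanup pass over the annotated text that drops tags with nothing in them. The sketch below is only an illustration, not the tool's actual code: it assumes the annotation is stored as a plain string with XML-like tags, that a fragment shows up as a tag pair with empty or whitespace-only content, and that the page tags are named `fpage` and `lpage`. The function name `stripEmptyTags` and all of these assumptions are hypothetical.

```typescript
// Hypothetical tag names; the tag set used by the real annotation tool may differ.
const TAG_NAMES: string[] = ["fpage", "lpage"];

/**
 * Removes tag pairs whose content is empty or whitespace-only.
 * Any whitespace between the opening and closing tag is kept in the text.
 * Assumes tags carry no attributes and are not nested inside themselves.
 */
function stripEmptyTags(markup: string, tagNames: string[] = TAG_NAMES): string {
  let result = markup;
  for (const name of tagNames) {
    // Matches e.g. "<lpage></lpage>" or "<lpage>  </lpage>".
    const pattern = new RegExp(`<${name}>(\\s*)</${name}>`, "g");
    result = result.replace(pattern, "$1");
  }
  return result;
}

// Example: an empty Last Page tag left behind after the hyphen.
const cleaned = stripEmptyTags("S. <fpage>12</fpage>–<lpage></lpage>");
console.log(cleaned); // S. <fpage>12</fpage>–
```

This would only hide the symptom for fragments that really are empty tag pairs; it does not touch the selection handling that produces them.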