DS4SD / docling

Get your documents ready for gen AI
https://ds4sd.github.io/docling
MIT License
10.48k stars, 507 forks

Document normalization: warning on `checkbox-unselected` #399

Open pierre-sigwalt opened 1 day ago

pierre-sigwalt commented 1 day ago

Question

Hello,

Thank you for the awesome work. I am getting a warning when converting a PDF and was wondering two things:

  1. What does this warning mean?
  2. Is there a way to disable it so it doesn't pollute my logs?

Warnings:

2024-11-21 12:59:17.367 (   6.660s) [        92BD3300]    doc_normalisation.h:448   WARN| found new `other` type: checkbox-unselected
2024-11-21 12:59:17.367 (   6.660s) [        92BD3300]    doc_normalisation.h:448   WARN| found new `other` type: checkbox-unselected

Thank you

PeterStaar-IBM commented 1 day ago

@pierre-sigwalt I think this has to do with the underlying layout model. It should not hurt, but it is not great either.

@cau-git @sh-gupta let's get to the bottom of this!
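
Until the root cause is tracked down, one possible stopgap for the second question is to filter the message at the process level. The sketch below is not a docling feature. It assumes the warning is written straight to the process's stderr by native code (the `doc_normalisation.h:448` prefix suggests this, but it is not confirmed in this thread), which would put it out of reach of Python's `logging` filters. The names `NOISY`, `drop_noisy_lines`, and `filter_native_stderr` are made up for illustration.

```python
import os
import re
import sys
import tempfile
from contextlib import contextmanager

# Assumption: every line to hide contains this fragment, as in the log above.
NOISY = re.compile(r"found new `other` type: ")


def drop_noisy_lines(text: str, pattern: re.Pattern = NOISY) -> str:
    """Return `text` without the lines that match `pattern`."""
    return "".join(
        line for line in text.splitlines(keepends=True) if not pattern.search(line)
    )


@contextmanager
def filter_native_stderr(pattern: re.Pattern = NOISY):
    """Capture file descriptor 2 for the duration of the block, then re-emit
    everything except the lines matching `pattern`.

    Redirecting the descriptor (rather than replacing sys.stderr) also catches
    output written by native extensions.
    """
    sys.stderr.flush()
    saved_fd = os.dup(2)  # keep a handle on the real stderr
    with tempfile.TemporaryFile(mode="w+b") as capture:
        os.dup2(capture.fileno(), 2)  # fd 2 now points at the temp file
        try:
            yield
        finally:
            sys.stderr.flush()
            os.dup2(saved_fd, 2)  # restore the real stderr
            os.close(saved_fd)
            capture.seek(0)
            captured = capture.read().decode("utf-8", errors="replace")
            sys.stderr.write(drop_noisy_lines(captured, pattern))
            sys.stderr.flush()


if __name__ == "__main__":
    with filter_native_stderr():
        # Run the PDF conversion here; these writes stand in for native output.
        os.write(2, b"WARN| found new `other` type: checkbox-unselected\n")
        os.write(2, b"WARN| some other message that should stay visible\n")
```

Note the trade-off: stderr output is held back until the `with` block exits, and a crash inside native code could lose the captured lines, so this only hides the symptom while the layout-model question is investigated.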