DS4SD / docling

Get your documents ready for gen AI
https://ds4sd.github.io/docling
MIT License

Fix documentation for DocumentStream in usage.md #331

Closed: capsenz closed this issue 1 week ago

capsenz commented 1 week ago

Bug

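The DocumentStream example in the "Convert from binary PDF streams" section of usage.md uses the wrong keyword argument name: it passes `filename=` where DocumentStream expects `name=`. The example should read: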

from io import BytesIO
from docling.datamodel.base_models import DocumentStream
from docling.document_converter import DocumentConverter

buf = BytesIO(your_binary_stream)
source = DocumentStream(name="my_doc.pdf", stream=buf)
converter = DocumentConverter()
result = converter.convert(source)

instead of the currently documented version:

from io import BytesIO
from docling.datamodel.base_models import DocumentStream
from docling.document_converter import DocumentConverter

buf = BytesIO(your_binary_stream)
source = DocumentStream(filename="my_doc.pdf", stream=buf)
converter = DocumentConverter()
result = converter.convert(source)

...

Steps to reproduce

Running the documented example raises a Pydantic error. ...
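The failure can be reproduced without docling using a small stand-in model. This is a minimal sketch, assuming pydantic v2 and assuming that DocumentStream is a Pydantic model with a required `name` field. `StreamStandIn` and `missing_fields` are hypothetical names used only for illustration and are not part of docling.

```python
from io import BytesIO

from pydantic import BaseModel, ConfigDict, ValidationError


class StreamStandIn(BaseModel):
    """Hypothetical stand-in for DocumentStream: a required `name` plus a stream."""

    # BytesIO is not a type pydantic validates natively, so allow arbitrary types.
    model_config = ConfigDict(arbitrary_types_allowed=True)

    name: str
    stream: BytesIO


def missing_fields(**kwargs) -> list[str]:
    """Return the required fields that validation reports as missing."""
    try:
        StreamStandIn(**kwargs)
    except ValidationError as exc:
        return [str(err["loc"][0]) for err in exc.errors() if err["type"] == "missing"]
    return []


buf = BytesIO(b"%PDF-1.4 placeholder bytes")

# Documented keyword: `filename` is not a field, so the required `name` is missing.
wrong = missing_fields(filename="my_doc.pdf", stream=buf)

# Corrected keyword: validation succeeds and nothing is reported missing.
right = missing_fields(name="my_doc.pdf", stream=buf)

print("filename= ->", wrong)
print("name=     ->", right)
```

With pydantic's default settings the unknown `filename` keyword is silently ignored, so the error surfaces as a missing `name` field rather than as an unexpected argument.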

Docling version

Docling version: 2.5.2 ...

Python version

Python 3.12.7 ...

cau-git commented 1 week ago

Well spotted and thanks for the fix.