pymupdf/RAG
RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF
https://pymupdf.readthedocs.io/en/latest/pymupdf4llm
GNU Affero General Public License v3.0
302 stars, 57 forks
issues
Number | Title | Author | Status | Age | Comments
#46 | Improve types for to_markdown function | dantetemplar | closed | 3 months ago | 1
#45 | Accept Path object in to_markdown function | dantetemplar | closed | 3 months ago | 10
#44 | Instead of file path in string would also like to pass fitz object | narsandu | closed | 3 months ago | 2
#43 | Draft: updates logo. | jamie-lemon | closed | 3 months ago | 0
#42 | Embedded links inside the table are not extracted | narsandu | opened | 3 months ago | 1
#41 | Unable to parse 2-column documents | rahul-dhir-0047 | closed | 3 months ago | 4
#40 | Unable to parse double column pdf | smallzhao | closed | 3 months ago | 2
#39 | typing errors | fareshan | closed | 3 months ago | 2
#38 | Error when loading pdf using python BytesIO: object has no attribute 'page_count' | danmb1979 | closed | 3 months ago | 1
#35 | Moves examples into dedicated folder with README. | jamie-lemon | closed | 4 months ago | 0
#33 | Suggestion: possibility to have a callback on table | papipsycho | closed | 3 months ago | 1
#32 | Integration into langchain | zymbuzz | closed | 4 months ago | 1
#31 | adding dpi setting and fixing remaining image output | jakubkovac | closed | 3 months ago | 0
#30 | issue with Heading since version 0.0.5 | papipsycho | closed | 3 months ago | 5
#29 | Bug in pymupdf_rag | SergioG-M | closed | 4 months ago | 1
#28 | Update image inclusion | JorjMcKie | closed | 4 months ago | 0
#27 | Update pymupdf_rag.py to fix compatibility issue with Python 3.9 | shenyimings | closed | 4 months ago | 1
#26 | Remove headers & Footers | G-Slient | closed | 4 months ago | 1
#25 | Update api.rst | JorjMcKie | closed | 4 months ago | 0
#24 | Adds more info to homepage for latest version and starts API doc. | jamie-lemon | closed | 4 months ago | 2
#23 | Many text block became image since the version 0.0.3 | papipsycho | closed | 4 months ago | 13
#22 | No attribute to_markdown | aman-vink | closed | 4 months ago | 4
#21 | Images in table | chillyoung4679 | opened | 4 months ago | 1
#20 | [Llama] Fixed Async load | YanSte | closed | 4 months ago | 0
#19 | Table text without new lines | papipsycho | closed | 3 months ago | 3
#18 | Image background can cause text extraction to fail | maxjeblick | closed | 4 months ago | 3
#17 | 'pymupdf4llm' has no attribute 'to_markdown' | zzzcccxx | closed | 4 months ago | 2
#16 | add expected type and some typings | fareshan | closed | 4 months ago | 1
#13 | [Llama] Improved int page structure and naming | YanSte | closed | 4 months ago | 2
#12 | [Hotfix] Fixed import and method used for version 0.0.2 (Import issue) | YanSte | closed | 4 months ago | 10
#15 | No module named 'get_text_lines' in pymupdf4llm | vzegna | closed | 4 months ago | 4
#11 | Preserving page information when creating markdown file from pdf | SamGalanakis | closed | 4 months ago | 2
#10 | improve typings to accept range and None | fareshan | closed | 4 months ago | 2
#9 | Update logic to set same_line flag based on case of first character | tamdao | closed | 4 months ago | 0
#8 | to_markdown() for two-column pdf | MahtabF | closed | 4 months ago | 3
#7 | [PDFMardownReader] LlamaIndex Reader | YanSte | closed | 4 months ago | 8
#6 | Ignore the header and footer of PDF | difonjohaiv | closed | 4 months ago | 2
#5 | Avoid add header to an empty line | Louis-udm | closed | 3 months ago | 2
#4 | [Suggestion] PDFReader with LlamaIndex BaseReader and insertion in Llama Hub | YanSte | closed | 4 months ago | 4
#3 | Not all PDFs have fontsizes | bartdegoede | closed | 5 months ago | 1
#2 | Generation Speed | fareshan | closed | 5 months ago | 2
#1 | Specified encoding when opening to fix undefined characters | dragoa | closed | 5 months ago | 1