useblocks / libpdf
Extract structured data from PDFs
MIT License
8 stars
2 forks
issues
#42 Mh update pillow 2024 - Fix tests and extraction of invalid bboxes (kreuzberger, opened 3 weeks ago, 0 comments)
#41 Update pillow to 10.2.0 (ubmarco, opened 7 months ago, 0 comments)
#40 Ruff fixes (ubmarco, closed 7 months ago, 0 comments)
#39 Color and font information for chars, words and boxes (kreuzberger, closed 6 months ago, 5 comments)
#38 Color information for text / words characters (kreuzberger, closed 6 months ago, 2 comments)
#37 Tests for the new Rects class (ubmarco, closed 6 months ago, 1 comment)
#36 Adding tests for rect extractions (kreuzberger, closed 7 months ago, 5 comments)
#35 Bump actions/checkout from 2 to 4 (dependabot[bot], closed 8 months ago, 0 comments)
#34 Bump actions/setup-python from 2 to 5 (dependabot[bot], closed 8 months ago, 0 comments)
#33 Preparation for 0.1.0 (ubmarco, closed 8 months ago, 0 comments)
#32 add links to rects (juiwenchen, opened 8 months ago, 0 comments)
#31 Run ruff format check (ubmarco, closed 8 months ago, 0 comments)
#30 Rect model (juiwenchen, closed 8 months ago, 9 comments)
#29 Reformat the codebase with ruff (ubmarco, closed 8 months ago, 0 comments)
#28 Added ruff for linting and formatting (ubmarco, closed 8 months ago, 0 comments)
#27 Fix the CI and expand Python versions (ubmarco, closed 8 months ago, 0 comments)
#26 non-functional change (juiwenchen, closed 8 months ago, 1 comment)
#25 Color Information for Paragraphs (kreuzberger, closed 8 months ago, 11 comments)
#24 fix dependencies from pillow branch and fix processing of pdf files (kreuzberger, closed 8 months ago, 2 comments)
#23 Raised pillow dependency (ubmarco, closed 6 months ago, 1 comment)
#22 defined virtual number and ghost chapter (juiwenchen, closed 2 years ago, 0 comments)
#21 Fixed table cell word boundaries (haiyangToAI, closed 2 years ago, 0 comments)
#20 Extracted table cells have no word boundaries (ubmarco, closed 2 years ago, 0 comments)
#19 fixed table id for issue#18 (juiwenchen, closed 2 years ago, 1 comment)
#18 Duplicate table IDs (ubmarco, closed 2 years ago, 0 comments)
#17 Removed Python 3.6 and updated deps (ubmarco, closed 2 years ago, 0 comments)
#16 test (juiwenchen, closed 2 years ago, 1 comment)
#15 Flag to disable annotation extraction (juiwenchen, closed 2 years ago, 1 comment)
#14 Extract text from images using tesseract-ocr (ubmarco, opened 2 years ago, 1 comment)
#13 Detect headlines in PDFs without outline (ubmarco, opened 2 years ago, 1 comment)
#12 Updated dependencies (ubmarco, closed 2 years ago, 0 comments)
#11 Adapted changelog for bugfix (haiyangToAI, closed 2 years ago, 0 comments)
#10 Fix outline title (haiyangToAI, closed 2 years ago, 0 comments)
#9 Chapter recognition on AUTOSAR requirements (juiwenchen, closed 3 years ago, 0 comments)
#8 fixed chapter recognition bugs (juiwenchen, closed 3 years ago, 0 comments)
#7 Added example for poppler & pillow (ubmarco, opened 3 years ago, 0 comments)
#6 Catch wand/ImageMagick policy error exception (ubmarco, opened 3 years ago, 0 comments)
#5 Pos chars words (juiwenchen, closed 3 years ago, 1 comment)
#4 Resolve pdfplumber fork (ubmarco, opened 3 years ago, 0 comments)
#3 Resolve pdfminer fork (ubmarco, opened 3 years ago, 0 comments)
#2 Position of words and characters (ubmarco, closed 3 years ago, 0 comments)
#1 Fixed catalog annotation (haiyangToAI, closed 3 years ago, 0 comments)