issues
ad-freiburg/pdftotext-plus-plus
A fast and accurate command line tool for extracting text from PDF files.
https://pdftotext.cs.uni-freiburg.de
Apache License 2.0
15 stars, 0 forks
APT repository broken?
#31
culmat
opened
4 months ago
0
pdftotext ignores words for no discernible reason
#30
DGollings
opened
10 months ago
2
In the help message, add a description for each serialization format.
#29
ckorzen
opened
1 year ago
0
Add extra GitHub workflow for E2E testing.
#28
ckorzen
closed
1 year ago
0
Implement more output formats (TSV, JSON, etc.) and add E2E tests for each.
#27
ckorzen
opened
1 year ago
0
Round coordinates to one decimal place
#26
ckorzen
closed
1 year ago
0
Implement an E2E testing framework
#25
ckorzen
closed
1 year ago
0
Merge PdfDocument.h, PdfFontInfo.h and Types.h
#24
ckorzen
closed
1 year ago
0
Allow disabling each module of the extraction pipeline via command line options.
#23
ckorzen
closed
1 year ago
0
Merge PdfDocument.h, PdfFontInfo.h and Types.h
#22
ckorzen
closed
1 year ago
0
#12 - Merge Constants.h and Config.h
#21
ckorzen
closed
1 year ago
0
Refactor the command line options related to enabling/disabling specific modules of the extraction pipeline
#20
ckorzen
closed
1 year ago
0
Round coordinates to one decimal place
#19
ckorzen
closed
1 year ago
0
Refactor all pipeline modules
#18
ckorzen
opened
1 year ago
0
Use "using std::*" consistently.
#17
ckorzen
closed
1 year ago
0
Could not load model "src/./models/2021-08-30_model-3K-documents"
#16
hannahbast
closed
1 year ago
2
Fix GitHub Actions.
#15
ckorzen
closed
1 year ago
0
Get rid of config.yml and yq.
#14
ckorzen
closed
1 year ago
0
Create prebuilt Docker containers, with poppler, tensorflow, etc. preinstalled
#13
ckorzen
opened
1 year ago
0
Merge Constants.h and Config.h
#12
ckorzen
closed
1 year ago
0
'make release' doesn't work.
#11
ckorzen
closed
1 year ago
0
Fix checkstyle issues
#10
ckorzen
closed
1 year ago
0
#2 create options for printing text in different formats
#9
ckorzen
closed
1 year ago
0
Merge the semantic roles from Types.h and the semantic roles used by the learning model
#8
ckorzen
opened
1 year ago
0
Prefix our namespaces with "ppp::"
#7
ckorzen
closed
1 year ago
0
Some refactoring here and there.
#6
ckorzen
closed
1 year ago
0
Write a technical documentation
#5
ckorzen
opened
1 year ago
0
Make the help message more readable.
#4
ckorzen
closed
1 year ago
0
I-1: avoid hyphen in command line usage
#3
ckorzen
closed
1 year ago
0
Add options to output the extracted text in different formats, for example: XML or JSON
#2
ckorzen
closed
1 year ago
0
Avoid having to specify a "-" to print the extracted text on the command line.
#1
ckorzen
closed
1 year ago
0