documentcloud/docsplit
Break Apart Documents into Images, Text, Pages and PDFs
http://documentcloud.github.io/docsplit/
Other
831 stars
214 forks
issues
[docsplit image] Horizontal image gets thumbnailed on an A4 page at the bottom.
#60
Natim
opened
12 years ago
1
Uses crop box on PDF image creation. Fixes #58
#59
jamesalmond
closed
12 years ago
0
Passing options to GraphicsMagick
#58
jamesalmond
closed
12 years ago
1
extract_pages does not use page range "pages" parameter
#57
rajington
opened
12 years ago
3
HTML/CSS to PDF
#56
lalith-b
closed
12 years ago
1
clean_ocr method removes accents
#55
tdesvenain
closed
12 years ago
1
language parameter is invalid
#54
tdesvenain
closed
11 years ago
6
how can i do multiple pdf extraction processes concurrently?
#53
quyen
closed
11 years ago
5
How can I convert an input image to another image without converting it to PDF first?
#52
luccasmaso
closed
12 years ago
1
TextCleaner garbles German umlauts in recognized text
#51
marcboeker
closed
10 years ago
3
Various improvements
#50
trevorturk
closed
7 years ago
0
Spaces in installation path
#49
ineiti
closed
11 years ago
1
Detect PDF files without .pdf extension using magic number
#48
jeremybmerrill
closed
10 years ago
4
Can't run the tests
#47
dentarg
closed
11 years ago
1
Added ability to extract all metadata at once
#46
rajington
closed
11 years ago
1
Adds recommendation to install poppler-data
#45
alindeman
closed
12 years ago
0
Won't work if docsplit is used by multiple unix users
#44
oelmekki
opened
12 years ago
3
confusing file location when extract_images from pdf
#43
michelson
closed
12 years ago
1
Use the extract_text data in Ruby rather than a file
#42
geomic
closed
12 years ago
2
DocSplit fails to extract text on Windows when filenames have spaces
#41
overview
opened
12 years ago
1
PDFs with rotated pages are clipped
#40
rajington
closed
12 years ago
2
extract_text ignores new lines
#39
mattvv
closed
11 years ago
2
Java chokes on paths with spaces
#38
ineiti
closed
11 years ago
7
Image -> OCR -> PDF
#37
gaiottino
closed
12 years ago
1
White border in output when the input are images
#36
luccasmaso
opened
12 years ago
1
Accept non-ascii characters in pdf headers
#35
stuartf
closed
11 years ago
6
Timeout for PDF extraction from OpenOffice supported document format.
#34
vrybas
opened
12 years ago
4
Need extract_html
#33
ManikandanK
closed
12 years ago
1
Ignore non-ascii chars in extracted PDF info.
#32
efroese
closed
11 years ago
4
Make tests compatible with Ruby 1.9.2
#31
rmoriz
closed
10 years ago
2
Docsplit and CarrierWave
#30
shlomizadok
closed
13 years ago
0
qpdf decryption
#29
palewire
closed
5 years ago
1
OWNER PASSWORD REQUIRED ERROR
#28
palewire
closed
1 year ago
7
Distorted PNG image when using docsplit images
#27
f0urfingeredfish
closed
13 years ago
5
undefined method normalize_range for Docsplit:Module
#26
sandstrom
closed
10 years ago
1
How to use with paperclip?
#25
shlomizadok
closed
13 years ago
3
Feature multiple languages
#24
crutch
closed
13 years ago
1
Make the tests portable
#23
kremso
closed
13 years ago
0
Support libre office as office home param for java
#22
crutch
closed
13 years ago
0
File --mime-type option unrecognized on CentOS
#21
simeonwillbanks
closed
13 years ago
0
Extracting text from PDFs
#20
runa
opened
13 years ago
1
Add `brew` command to installation part of gh-page.
#19
edtsech
closed
13 years ago
2
Option for binary blobs/file handles?
#18
thsig
closed
13 years ago
1
At least on my version of Ubuntu (Natty Narwhal) the tesseract library is
#17
palewire
closed
13 years ago
1
Expose density arg in ImageExtractor
#16
zagraves
closed
13 years ago
1
jodconverter --external
#15
gregtap
closed
13 years ago
1
Command line docsplit displays pdftotext usage when inputting a PDF filename that has spaces
#14
matthewmueller
closed
13 years ago
2
Inspect file mime-type
#13
simeonwillbanks
closed
13 years ago
4
11 escape incoming file names
#12
vrybas
closed
13 years ago
3
shell escaping needed, e.g. for filenames
#11
rmoriz
closed
13 years ago
1