documentcloud/docsplit
Break Apart Documents into Images, Text, Pages and PDFs
http://documentcloud.github.com/docsplit/
Other
833 stars, 214 forks
issues
Newest
Number | Title | Author | Status | Age | Comments
#110 | Docsplit images command - Added the ability to specify the page number delimiter via a command line option | BrandonNoad | opened | 10 years ago | 0
#109 | libreoffice path in FreeBSD | danniculescu | opened | 10 years ago | 0
#108 | making magic number-based detection of PDFs encoding-friendly, with tests | jonoterc | closed | 10 years ago | 3
#107 | undefined method `strip' for nil:NilClass | singhkishan | closed | 9 years ago | 1
#106 | "Invalid byte sequence error" on master. | KurtPreston | closed | 10 years ago | 4
#105 | Create frases | Merinoowe | closed | 10 years ago | 2
#104 | Add office search path to check vendor folder for use with Heroku and libreoffice buildpack | serene | closed | 10 years ago | 2
#103 | Minor changes | tmaier | opened | 10 years ago | 0
#102 | Check if file is PDF by magic number. Closes #98 | tmaier | closed | 10 years ago | 3
#101 | Add Gemfile | tmaier | closed | 10 years ago | 5
#100 | Allow use of imagemagick with docsplit | augustf | closed | 10 years ago | 0
#99 | Rubygems release no longer working with recent openoffice versions on Debian/Ubuntu | augustf | closed | 10 years ago | 2
#98 | TransparentPDFs should not only check for file extension but also for Mime Type | tmaier | closed | 10 years ago | 0
#97 | Fix for Issue #83: Leading Zeros | theredcoder | opened | 10 years ago | 2
#96 | Extracting images from PDF hogs 100% CPU | tvsignal | closed | 10 years ago | 2
#95 | conversion to PDF mangles non-ASCII characters in docx on Linux | bobmyers | closed | 9 years ago | 4
#94 | Issues with Powerpoint OLE Objects | omsoft | opened | 10 years ago | 2
#93 | Error converting to images, in 0.7.2 , but works in 0.6.3 | michelson | closed | 9 years ago | 1
#92 | Enable specification of a config file, and generate hocr output if option set | jhosteny | opened | 11 years ago | 4
#91 | Pad them digits | dannguyen | closed | 11 years ago | 5
#90 | Extract image or pdf on windows platform bugfix | eastxing | opened | 11 years ago | 1
#89 | Extract images on win 7 platform error | eastxing | opened | 11 years ago | 0
#88 | Add /usr/lib64 to office_search_paths | elia | closed | 11 years ago | 0
#87 | Can't covert nil into string in ensure_pdfs on server, but works fine locally | chintanparikh | opened | 11 years ago | 0
#86 | extract_text doesn't work for pdf files with Tesseract | chintanparikh | closed | 11 years ago | 12
#85 | Default 64-bit installation paths | vanderhoorn | closed | 9 years ago | 0
#84 | Detect page orientation and rotate when necessary | lukerosiak | closed | 9 years ago | 5
#83 | Use leading zeros for appended page numbers in extract_images | willmcclellan | closed | 10 years ago | 2
#82 | Clean text without Iconv to support Ruby 2.0 | leknarf | closed | 11 years ago | 1
#81 | Add option to generate hOCR output instead of raw text when performing OCR via tesseract | jhosteny | closed | 11 years ago | 4
#80 | Add option to generate hOCR output from tesseract | jhosteny | closed | 11 years ago | 5
#79 | Timeout on large xlsx files (with many pages in print preview) | alxndrmlr | opened | 11 years ago | 3
#78 | Pseudo password protected xlsx files can't be converted | alxndrmlr | opened | 11 years ago | 2
#77 | Deploy to heroku | josal | opened | 11 years ago | 1
#76 | Unable to extract images using docsplit 0.7.2 in cygwin | bjayaram | opened | 11 years ago | 1
#75 | Add another possible LibreOffice executable path | va7map | closed | 9 years ago | 1
#74 | typo fix for win | sumkincpp | closed | 11 years ago | 2
#73 | Couldn't open file '/tmp/docsplit/filename.pdf': No such file or directory. | luccasmaso | closed | 11 years ago | 5
#72 | Not saving Unicode (UTF8) characters (accents in other languages) | robertour | closed | 10 years ago | 4
#71 | Unable to use LibreOffice on version 0.7.2 -> Could not find or load main class .usr.lib.libreoffice | aponsin | closed | 10 years ago | 18
#70 | No error output | patroy | closed | 10 years ago | 2
#69 | Bug where strange text is being overlaid to extracted image (pptx to png) | avlakin | opened | 11 years ago | 3
#68 | new libreoffice has --version | senner | closed | 10 years ago | 4
#67 | Notifications on error | thunderz14enator | closed | 11 years ago | 0
#66 | Notifications on error | thunderz14enator | opened | 11 years ago | 2
#65 | Accept non-ascii characters in pdf headers | amalagaura | closed | 11 years ago | 4
#64 | PDF to SVG | shlomizadok | closed | 11 years ago | 4
#63 | Fix issue 62 | hderms | opened | 11 years ago | 11
#62 | Determining the paths of images created as a result of Docsplit.extract_images | hderms | opened | 11 years ago | 3
#61 | Ghostscript is needed to use docsplit with PDF files | evanj | closed | 11 years ago | 2