klippa-app / pdfium-cli

Easy to use PDF CLI tool powered by PDFium and go-pdfium
MIT License

pdfium-cli

:rocket: Easy to use PDF CLI tool powered by PDFium and go-pdfium :rocket:

Features

PDFium & Wazero

This project uses the PDFium C++ library by Google (https://pdfium.googlesource.com/pdfium/) to process PDF documents.

We use a WebAssembly version of PDFium that is compiled with Emscripten and runs in the Wazero Go runtime.

Getting started

From binary

Download the binary for your platform from the latest release and save it as pdfium.

You can use the install tool to copy it into place:

sudo install pdfium-webassembly-linux-x64 /usr/local/bin/pdfium
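To confirm that the installed binary resolves, a quick check along these lines can help. This is a minimal sketch: the helper name is_on_path is hypothetical and not part of pdfium-cli, and it assumes /usr/local/bin is on your PATH.

```shell
# Hypothetical helper (not part of pdfium-cli):
# returns 0 when the named command resolves on PATH, 1 otherwise.
is_on_path() {
    command -v "$1" >/dev/null 2>&1
}

if is_on_path pdfium; then
    # --help is the documented flag for printing the usage text.
    pdfium --help
else
    echo "pdfium not found on PATH; check the install step" >&2
fi
```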

Release types

The following release types are available:

WebAssembly: A single binary that includes everything you need to run pdfium-cli, but is a lot slower than native because of the WebAssembly runtime. Most useful when easy distribution matters more than speed.

Native: A native build that requires pdfium and libjpeg-turbo to be available on your system.

Native + MUSL: Same as native, but built with MUSL so that it does not require a system libc, which allows it to be used in Alpine Docker containers.
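The trade-offs above can be sketched as a small decision helper. This is only an illustration: the function name choose_release_type and the labels it prints are hypothetical and do not correspond to actual release asset names.

```shell
# Hypothetical helper: prints a release type label for three yes/no answers.
#   $1  is speed a concern?                                  (yes/no)
#   $2  are pdfium and libjpeg-turbo available on the system? (yes/no)
#   $3  is the target musl-based, such as Alpine?             (yes/no)
choose_release_type() {
    speed_matters="$1"
    has_native_libs="$2"
    is_musl="$3"

    # Without the native libraries, or when speed is not a concern,
    # the self-contained WebAssembly build is the simplest choice.
    if [ "$speed_matters" != "yes" ] || [ "$has_native_libs" != "yes" ]; then
        echo "webassembly"
    elif [ "$is_musl" = "yes" ]; then
        echo "native-musl"
    else
        echo "native"
    fi
}

# Example: a speed-sensitive job in an Alpine container with the libraries installed.
choose_release_type yes yes yes
```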

From source

Make sure you have a working Go development environment.

Clone the repository:

git clone https://github.com/klippa-app/pdfium-cli.git

Move into the directory:

cd pdfium-cli

Run pdfium-cli directly:

go run main.go

Or compile it and run the resulting binary:

go build -o pdfium main.go
./pdfium -h

Output:

pdfium-cli is a CLI tool that allows you to use pdfium from the CLI

Usage:
  pdfium [command]

Available Commands:
  attachments Extract the attachments of a PDF
  completion  Generate the autocompletion script for the specified shell
  explode     Explode a PDF into multiple PDFs
  help        Help about any command
  images      Extract the images of a PDF
  info        Get the information of a PDF
  javascripts Extract the javascripts of a PDF
  merge       Merge multiple PDFs into a single PDF
  render      Render a PDF into images
  text        Get the text of a PDF
  thumbnails  Extract the attachments of a PDF

Flags:
  -h, --help   help for pdfium

Use "pdfium [command] --help" for more information about a command.
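Since each command documents its own flags through --help, a short loop can print them all at once. This is a sketch: the function name show_all_help is hypothetical, and the command list is copied from the output above, leaving out the generic completion and help commands.

```shell
# Subcommands copied from the "Available Commands" listing.
PDFIUM_COMMANDS="attachments explode images info javascripts merge render text thumbnails"

# Hypothetical helper: print the help text of every subcommand
# using the given binary (defaults to "pdfium" on PATH).
show_all_help() {
    bin="${1:-pdfium}"
    for cmd in $PDFIUM_COMMANDS; do
        echo "== $cmd =="
        "$bin" "$cmd" --help
    done
}
```

For a binary built from source in the current directory, call it as show_all_help ./pdfium.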

The following build tags are available to control different build types:

About Klippa

Founded in 2015, Klippa's goal is to digitize & automate administrative processes with modern technologies. We help clients enhance the effectiveness of their organization by using machine learning and OCR. Since 2015, more than a thousand happy clients have used Klippa's software solutions. Klippa currently has an international team of 50 people, with offices in Groningen, Amsterdam and Brasov.

License

The MIT License (MIT)

Wazero and PDFium are licensed under the Apache License 2.0.