dimi2 / DyAnnotationExtractor

DyAnnotationExtractor is software for extracting annotations (highlighted text and comments) from e-documents like PDF.
Apache License 2.0

DyAnnotationExtractor

DyAnnotationExtractor is software for extracting annotations (highlighted text and comments) from e-documents such as PDF files. The extracted parts can be used to build a summary of the document.

Usage

Imagine you have an e-book (PDF) that is 100 pages long. While reading the book, you highlight the important parts in your favorite reader.

Then use the DyAnnotationExtractor tool to get just the highlighted parts.

On the command line, execute the following command.
For Windows:

DyAnnotationExtractor -input "Getting Started with Ubuntu 16.04.pdf"

For Linux:

./DyAnnotationExtractor.sh -input "Getting Started with Ubuntu 16.04.pdf"

This creates a file with the same name in the same directory, with an added '.md' suffix.
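The naming rule can be sketched in shell. This is only an illustration: `expected_output` is a hypothetical helper, not part of the tool, and it assumes the '.md' suffix is appended to the full input file name (including the '.pdf' part) rather than replacing the extension.

```shell
# Hypothetical helper (not part of DyAnnotationExtractor): derive the path
# of the file the tool is expected to write for a given input.
# Assumption: '.md' is appended to the whole input name, '.pdf' included.
expected_output() {
  printf '%s.md\n' "$1"
}

input="Getting Started with Ubuntu 16.04.pdf"
output="$(expected_output "$input")"
echo "Expecting extracted annotations in: $output"
```

If the tool instead replaces the extension, only the body of `expected_output` would need to change.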

Now you have an extract of the book that is 5-6 pages long instead of 100, so you can skim just the exported text instead of re-reading the entire book.
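If you have several annotated PDFs, the same command can be run once per file. The following is a minimal sketch for Linux, using only the `-input` option shown above. The `extract_all` function and the `EXTRACTOR` variable are hypothetical names introduced here, and the sketch assumes the launcher script sits in the current directory.

```shell
# Assumption: the launcher script is in the current directory.
# Set EXTRACTOR beforehand to point somewhere else if needed.
EXTRACTOR="${EXTRACTOR:-./DyAnnotationExtractor.sh}"

# Hypothetical helper: run the extractor once for every PDF found in the
# given directory (defaults to the current directory).
extract_all() {
  dir="${1:-.}"
  for pdf in "$dir"/*.pdf; do
    # Skip the literal pattern when the directory contains no PDFs.
    [ -e "$pdf" ] || continue
    "$EXTRACTOR" -input "$pdf"
  done
}
```

Calling `extract_all ~/books` would then process every PDF in that directory, one invocation per file. Quoting `"$pdf"` keeps file names with spaces intact.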

Supported Input Formats

Supported Output Formats

Requirements

Download

Get the latest release.

Separate files are provided for the distribution, binary, and sources.
End users only need to download the distribution.

Installation

Extract the downloaded archive into a local directory.
Run the provided 'DyAnnotationExtractor' script to perform the extraction.

Build

To build the project from sources, you will need the Gradle build tool. Go to the project home directory (PROJ_HOME) and execute the command:

gradle

The result will appear in the directory PROJ_HOME/build/distributions.

Dependencies