kriegaex / Galileo-Openbook-Cleaner

HTML cleaner for Galileo Computing's openbooks, implemented in Java

HTML cleaner for Rheinwerk (ex-Galileo) Openbooks

This is a tool for cleaning up Rheinwerk Openbooks (formerly known as Galileo Openbooks) before converting them to EPUB or PDF format.

Current state of development: v1.2.0-SNAPSHOT is feature-complete, i.e. it can download, MD5-verify, unpack and convert all 37 Openbooks that were available at release time.

History: For details about what changed in which version, please take a look at the change log.

Download: A precompiled, executable JAR file is available here.

Usage:

# JDK 8 to 15
$ java -jar openbook_cleaner-1.2.0-SNAPSHOT.jar --help

# JDK 16+ needs '--add-opens' because of the XStream library
$ java --add-opens java.base/java.util=ALL-UNNAMED -jar openbook_cleaner-1.2.0-SNAPSHOT.jar --help

OpenbookCleaner usage: java ... [options] <book_id>*

Option                       Description
------                       -----------
-?, --help                   Display this help text
-c, --check-avail            Check Galileo homepage for available books,
                               compare with known ones
-d, --download-dir <File>    Download directory for openbooks; must exist
                               (default: .)
-l, --log-level <Integer>    Log level (0=normal, 1=verbose, 2=debug, 3=trace)
                               (default: 0)
-m, --check-md5              Download all known books without storing them,
                               verifying their MD5 checksums (slow! >1 Gb
                               download)
-t, --threading <Integer>    Threading mode (0=single, 1=multi); single is
                               slower, but better for diagnostics) (default: 1)
-w, --write-config           Write editable book list to config.xml, enabling
                               you to update MD5 checksums or add new books

book_id1 book_id2 ...        Books to be downloaded & converted

Legal book IDs:
  all (magic value: all books), actionscript_1_und_2, actionscript_einstieg,
  apps_iphone_ios6, asp_net, c_von_a_bis_z, dreamweaver_8, excel_2007,
  hdr_fotografie, it_handbuch, javascript_ajax, java_7, java_insel, joomla_1_5,
  linux, linux_unix_prog, microsoft_netzwerk, oop, photoshop_cs2, photoshop_cs4,
  php_pear, ruby_on_rails_2, shell_prog, ubuntu_10_04, ubuntu_11_04,
  ubuntu_12_04, unix_guru, vb_2008, vb_2012_einstieg, vcsharp_2012, vmware,
  windows_server_2012
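
Several options above revolve around MD5 checksums (--check-md5 verifies them, --write-config lets you update them). If you want to compute or double-check a checksum yourself before editing config.xml, the following is a minimal sketch of such a check using only the JDK. It is not taken from the Openbook cleaner sources; the class name Md5Check and its methods are hypothetical and chosen for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, NOT part of Openbook cleaner: computes an MD5
// checksum with the JDK's MessageDigest and compares it to an expected value.
public class Md5Check {

    /** Returns the MD5 digest of the given bytes as a lowercase hex string. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // %02x prints each byte as two hex digits (unsigned)
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to support MD5
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Case-insensitive comparison against an expected checksum string. */
    static boolean matches(byte[] data, String expectedMd5) {
        return md5Hex(data).equalsIgnoreCase(expectedMd5.trim());
    }

    public static void main(String[] args) {
        // "abc" is the well-known test vector from RFC 1321
        byte[] sample = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(md5Hex(sample));
        System.out.println(matches(sample, "900150983CD24FB0D6963F7D28E17F72"));
    }
}
```

For a real archive you would pass the file's bytes instead of the sample string, e.g. via java.nio.file.Files.readAllBytes, or feed a stream into the digest in chunks for large downloads.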

Dependencies: Openbook cleaner needs at least JDK 8 to run and JDK 17 to build. It also uses a few open source libraries:

Development environment:

Because I might later use this Git repository as a refactoring showcase in my developer workshops, I am doing any refactoring step by step and documenting progress in small, fine-grained Git changesets, so that I can review the evolution of the code with others later on.

As you can see, I am mostly doing this little project for myself, but I like to share the results and receive some user feedback. I hope the Openbook cleaner is useful to you. Enjoy! :-)

Alexander Kriegisch