PAQ8PX README
Copyright (C) 2008 Matt Mahoney, Serge Osnach, Alexander Ratushnyak,
Bill Pettis, Przemyslaw Skibinski, Matthew Fite, wowtiger, Andrew Paterson,
Jan Ondrus, Andreas Morphis, Pavel L. Holoborodko, Kaido Orav, Simon Berger,
Neill Corlett, Márcio Pais, Andrew Epstein, Mauro Vezzosi, Zoltán Gotthardt,
Sebastian Lehmann, Moisés Cardona, Byron Knoll
We would like to express our gratitude for the endless support of many
contributors who encouraged PAQ8PX development with ideas, testing,
compiling, debugging:
LovePimple, Skymmer, Darek, Stephan Busch, m^2, Christian Schneider,
pat357, Rugxulo, Gonzalo, a902cd23, pinguin2, Luca Biondi - and many others.
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of
the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details at
Visit <http://www.gnu.org/copyleft/gpl.html>.
For a summary of your rights and obligations in plain language visit
https://tldrlegal.com/license/gnu-general-public-license-v2
PAQ is a series of experimental lossless data compression software that have gone through collaborative development to top rankings on several benchmarks measuring compression ratio (although at the expense of speed and memory usage).
PAQ8PX (this branch) is one of the longest living branch of the PAQ series started by Jan Ondrus in 2009.
PAQ8PX is suitable and recommended for file entropy estimation. However - being an experimental compressor - it is not recommended for production use or for file archival purposes.
For history and current state of development visit https://encode.su/threads/342-paq8px
Data compression is carried out by compressing the input file(s) bit by bit using context mixing i.e. mixing predictions from many models. For a longer (technical) explanation see the DOC file.
A PAQ8PX archive stores a file or multiple files in a highly compressed format.
You can recognize a PAQ8PX archive from its file extension. The file extension reflects the exact PAQ8PX version that created the archive. You may also examine the first few bytes of the file. If it reads "paq8px" then it is (most probably) a PAQ8PX archive. Exact version information cannot be inferred from the archive content.
A PAQ8PX archive may contain multiple files, but once created, you cannot add to or remove files from the archive.
In single file mode (i.e. when compressing just one file) only file contents are stored. File paths, file names, creation or modification dates, file attributes, permissions and other metadata are not preserved. In multiple file mode however, you may store such metadata in the archive (see the help screen for more details).
Notes on pre-training: (1) The exe pre-training function (see the command line flag "e") uses the paq8px.exe file itself to pre-train the exe prediction model, so expect the resulting archives to be different created by two different paq8px.exe executables - even when built with the same command line options from the same source. When the exe is different, the predictions will be slightly different, thus the archives will be (completely) different. As a result you must have the very executable that created an archive in order to extract its contents when you use this pre-training option.
(2) If you use the text pre-training function (see the command line flag "t"), the word list (english.dic) and expression list (english.exp) files are only used for pre-training the prediction model before the actual file compression would begin. However their content is not stored in the archive. Since these files are required to pre-train the model before decompression as well, you will need to have them when extracting files from your archive.
More notes: Files and archives larger than 2 GB are not (yet) supported. PAQ8PX archives are not compatible with previous or future PAQ8PX releases.
This software is portable, you don't need to install it.
You may copy paq8px.exe to the folder where your files to be compressed are located and launch PAQ8PX from the command line (cmd.exe in Windows, a terminal of your choice in Linux, macOS and Android).
A graphical user interface is not provided, PAQ8PX runs from the command line. See the help screen for command line options. For the help screen just launch PAQ8PX from the command line without any options.
Building PAQ8PX requires a C++17 capable C++ compiler: https://en.cppreference.com/w/cpp/compiler_support#cpp17
Windows If you are a Windows user you don't need to compile the source. Just grab the latest executable from the https://encode.su/threads/342-paq8px thread. If you would like to build an executable yourself you may use the Visual Studio solution file or in case of Mingw-w64 see the "build-mingw-w64-generic-publish.cmd" batch file in the build subfolder.
Linux/macOS gcc/clang users on Linux/macOS may use the following commands to build:
sudo apt-get install build-essential zlib1g-dev cmake make cd build cmake .. make
The following compilers were tested and verified to compile/work correctly in this release or in an earlier release:
Note:
We build and test 64-bit releases, 32-bit releases are seldom built or tested. A known limitation of 32-bit releases is the 2 GB memory barrier. As a consequence, compression and decompression with 32-bit releases may not work ("out of memory") on level 8 and above.
When you make a new release: