Tools for ultra-fast protein sequence classification.

#####################################
UProC Version 1.2.0 2015-03-09 README
#####################################

.. This document is formatted using reStructuredText (http://docutils.sourceforge.net/rst.html). You can either view it as plain text or use the python docutils package to render it to a prettier format, e.g. HTML. An HTML version is also available at http://uproc.gobics.de

.. contents::
.. sectnum::

=============
General Notes
=============

Contact
-------

Please send bug reports, suggestions or questions to uproc@gobics.de or create an issue at `UProC on GitHub`_.

.. _UProC on GitHub: https://github.com/gobics/uproc

Version numbering
-----------------

UProC uses `semantic versioning`_. This means that the version number has the format MAJOR.MINOR.PATCH and an increment in one of the parts has the following meaning:

  1. MAJOR: The user interface (input, output and/or command-line arguments) of the programs or the format of the supplementary files was changed in a backwards-incompatible manner.
  2. MINOR: Some functionality was added.
  3. PATCH: One or more bugs were fixed.

If MINOR or PATCH is incremented, all modifications are backwards-compatible. See the `semantic versioning`_ homepage for details.
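
As an illustration of the rules above, the following shell sketch checks whether an installed version can stand in for a required one. The function ``is_compatible`` is hypothetical and not part of UProC; it assumes plain MAJOR.MINOR.PATCH strings without any suffixes.

```shell
#!/bin/sh
# Hypothetical helper, not part of UProC: succeeds if the installed
# version ($1) is a backwards-compatible replacement for the required
# version ($2).
is_compatible() {
    # Split both versions at the dots, giving six positional parameters:
    # $1 $2 $3 = installed MAJOR MINOR PATCH, $4 $5 $6 = required.
    old_ifs=$IFS
    IFS=.
    set -- $1 $2
    IFS=$old_ifs
    [ "$#" -eq 6 ] || return 2

    # A different MAJOR may mean a backwards-incompatible interface.
    [ "$1" -eq "$4" ] || return 1
    # A newer MINOR only adds functionality.
    [ "$2" -gt "$5" ] && return 0
    # Same MINOR: require at least the same set of bug fixes.
    [ "$2" -eq "$5" ] && [ "$3" -ge "$6" ]
}

is_compatible 1.2.0 1.1.3 && echo "1.2.0 satisfies 1.1.3"
```

Treating an older PATCH as incompatible is a deliberately strict choice of this sketch, since the required release may depend on a bug fix.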

.. _semantic versioning: http://semver.org

License
-------

The code for the UProC executables is available under the `GNU GPLv3`_ (see the COPYING file). However, most of the functionality is separated into a library called libuproc, which is licensed under the `GNU LGPLv3`_ (see the COPYING.LESSER file).

.. _GNU GPLv3: https://www.gnu.org/licenses/gpl.html
.. _GNU LGPLv3: https://www.gnu.org/licenses/lgpl.html

============
Installation
============

Obtaining UProC
---------------

Please visit the `UProC homepage`_ to make sure you're using the latest release. If you are interested in contributing, please get a clone of `UProC on GitHub`_ and base your work off the ``devel`` branch.

.. _UProC homepage: http://uproc.gobics.de

Binary packages
---------------

We currently offer binary packages for Windows XP and later, for 32-bit (i686) and 64-bit (x86-64, also known as amd64) architectures. To install, just download and extract the corresponding zip file. There is currently no graphical user interface, so you need to run UProC from PowerShell or cmd.exe.

Please note that UProC is primarily developed for (and works best on) GNU/Linux.

Installation from source package
--------------------------------

UProC uses the `GNU Autotools`_, so installation should work on any system that has a POSIX compatible ``/bin/sh``, ``make`` and a C99 compiler.

For additional features, UProC has the following optional dependencies:

zlib (highly recommended)
    Used for on-the-fly gzip (de-)compression of files. zlib is probably already installed on your computer, but you need to make sure that the development headers are also available.

OpenMP
    UProC uses OpenMP for parallelization if the compiler supports it. The configure script automatically detects whether this is the case.

Doxygen (for developers, version 1.8.0 or later recommended)
    If Doxygen is available, it will be used to generate the API documentation of libuproc (which is also available at the `UProC homepage`_).

.. _GNU Autotools: http://www.gnu.org/software/automake/manual

To install UProC from a source tarball, simply extract it and run the commands ::

    ./configure
    make
    make install  # requires appropriate privileges in PREFIX (see below)

By default, the executables are built in a way that you can omit the make install step and just run them from wherever you wish or copy them to a desired location (it is recommended to utilize the --prefix option mentioned below, though).

You can pass the following options (and more) to the configure script:

--prefix=PREFIX
    Install all files below PREFIX (see the "Installation Names" section of the INSTALL file for details).

--disable-openmp
    Do not use OpenMP (i.e. disable multi-threading). This will usually cause longer running times. You probably don't want this.

--disable-shared
    Link all executables statically and do not build and install libuproc as a shared library.

--enable-mmap
    Enable the mmap database format. This is enabled by default only on GNU/Linux, because it can degrade performance on other operating systems if the database is stored on an HDD (we experienced this while running UProC on OSX 10). If you have an SSD, enabling this feature can reduce startup time significantly.

See the INSTALL file for a more detailed description of the installation process for projects using `GNU Autotools`_.
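
For example, a build that installs into a per-user directory, so that ``make install`` needs no elevated privileges, might look like the sketch below. The location ``$HOME/uproc`` is an arbitrary choice for illustration, and the ``bin`` subdirectory is the usual Autotools default rather than something specific to UProC.

```shell
# Run from the extracted source directory.
# $HOME/uproc is an example prefix; any writable directory works.
./configure --prefix="$HOME/uproc"
make
make install

# Assumption: executables land in PREFIX/bin (the Autotools default),
# so add that directory to PATH for the current shell session.
export PATH="$HOME/uproc/bin:$PATH"
```

Further options from the list above, such as ``--enable-mmap``, can be appended to the same ``./configure`` call.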

Installation from Git
---------------------

On a fresh clone of the git repo, run ``autoreconf -i`` to generate the configure script and then follow the instructions in the previous section.
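
Put together, a build from Git could look roughly like this. The sketch assumes that ``git`` and the GNU Autotools (autoconf, automake and most likely libtool) are installed, and it checks out the ``devel`` branch recommended above for contributors.

```shell
# Clone the repository and switch to the development branch.
git clone https://github.com/gobics/uproc.git
cd uproc
git checkout devel

# Generate the configure script, then build as for a source tarball.
autoreconf -i
./configure
make
```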

===================
Supplementary files
===================

UProC needs certain files at runtime. These files are split into two categories, usually available as two distinct directories in the file system.

Database
--------

The database consists of files representing a set of known protein subsequences that map to given families, e.g. extracted from PFAM.

There are two ways to obtain a database:

  1. You can download a database from the `UProC homepage`_ and import it with the ``uproc-import`` program.
  2. Alternatively, you can create your own database with the ``uproc-makedb`` program.

Detailed instructions for these programs can be found by passing the -h option when running them.

Model
-----

The model consists of files containing certain parameters that are not tied to a particular database. You can download the newest model files from the `UProC homepage`_.

=============
Running UProC
=============

UProC consists of the following command-line programs:

uproc-prot
    Protein sequence classifier.

uproc-dna
    DNA/RNA sequence classifier.

uproc-orf
    Command-line interface to the ORF translation mechanism used by uproc-dna.

uproc-import
    Import database.

uproc-export
    Export database.

uproc-makedb
    Create a new database.

You can pass the -h option to find out how they are used.
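
As a quick check after installation, the following sketch prints the built-in help of each program listed above. It relies only on the ``-h`` option and assumes the programs are expected on ``PATH``; programs that cannot be found are reported instead of aborting the loop.

```shell
# Print the usage information of every UProC program found on PATH.
for prog in uproc-prot uproc-dna uproc-orf uproc-import uproc-export uproc-makedb; do
    if command -v "$prog" >/dev/null 2>&1; then
        "$prog" -h
    else
        echo "$prog: not found on PATH" >&2
    fi
done
```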

.. vim: ft=rst