elmindreda / duff

Command-line utility for finding duplicate files
Other
94 stars 12 forks source link
c duplicate-files unix

duff - Duplicate file finder

  1. Introduction

Duff is a command-line utility for identifying duplicates in a given set of files. It attempts to be usably fast and uses the SHA family of message digests as a part of the comparisons.

Duff resides in public Git repository on GitHub:

https://github.com/elmindreda/duff

The version numbering scheme for duff is as follows:

  1. License and copyright

Duff is copyright (c) 2005 Camilla Löwy elmindreda@elmindreda.org

Duff is licensed under the zlib/libpng license. See the file `COPYING' for license details. The license is also included at the top of each source file.

Duff contains shaX-asaddi. Copyright (c) 2001-2003 Allan Saddi allan@saddi.com See the files src/sha*.c' andsrc/sha*.h' for license details.

Duff uses the gettext.h convenience header from GNU gettext. Copyright (C) 1995-1998, 2000-2002, 2004-2006, 2009 Free Software Foundation, Inc. See the `lib/gettex.h' for license details.

Duff comes with a number of files provided by the GNU autoconf, automake and gettext packages. See the individual files in question for license details.

  1. Project news

See the file `NEWS'.

  1. Building Duff

If you got this source tree from a Git repository then you will need to bootstrap the build environment using first gettextize --no-changelog' and then autoreconf -i'. Note that this requires that GNU autoconf, automake and the gettext development tools are installed.

If (or once) you have a `configure' script, go ahead and run it. No additional magic should be required. If it is, then that's a bug and should be reported.

This release of duff has been successfully built on the following systems:

Ubuntu Natty x86_64

Earlier releases have been successfully built on the following systems:

Arch Linux x86 Cygwin 1.7 i686 Darwin 7.9.0 powerpc Debian Etch powerpc Debian Etch x86 Debian Lenny x86 Debian Sarge alpha Debian Wheezy amd64 FreeBSD 4.11 x86 FreeBSD 5.4 x86 FreeBSD 8.2 i386 Mac OS X 10.3 powerpc Mac OS X 10.4 powerpc Mac OS X 10.6 i386 Mac OS X 10.6 x86_64 Mac OS X 10.6 x86_64 (with MacPorts gettext) Mac OS X 10.7 x86_64 NetBSD 1.6.1 sparc Red Hat Enterprise 4.0 x86 SunOS 5.9 sparc64 Ubuntu Breezy x86 Ubuntu Jaunty x86 Ubuntu Lucid amd64 Ubuntu Maverick amd64

The tools used were GCC and GNU or BSD make. However, it should build on most Unix systems without modifications.

  1. Installing Duff

See the file `INSTALL'.

  1. Using Duff

See the accompanying man page duff(1).

To read the man page before installation, use the following command:

groff -mdoc -Tascii man/duff.1 | less -R

On GNU/Linux systems, however, the following command may suffice:

man -l man/duff.1

  1. Hacking Duff

See the file `HACKING'.

  1. Bugs, feedback and patches

Please send bug reports, feedback, patches and cookies to:

Camilla Löwy elmindreda@elmindreda.org

  1. Credits and thanks

The following (alphabetically listed) people have contributed to duff, either by reporting bugs, suggesting new features or submitting patches:

Harald Barth Alexander Bostrom Magnus Danielsson Stephan Hegel Patrik Jarnefelt Rasmus Kaj Mika Kuoppala Richard Levitte Fernando Lopez Clemens Lucas Fries Kamal Mostafa Ross Newell Allan Saddi allan@saddi.com

...and everyone I forgot. Did I forget you? Drop me an email.

  1. Disambiguation

This is duff the Unix command-line utility, not DUFF the Windows program. If you wish to find duplicate files on Windows, use DUFF.

DUFF also has a SourceForge.net URL:

http://dff.sourceforge.net/

  1. Release history

Version 0.1 was named `duplicate' and was never released anywhere.

Version 0.2 was the first release named duff. It lacked a real checksumming algorithm, and was thus only released to a few individuals, during the first half of 2005.

Version 0.3 was the first official release, on November 22, 2005, after a long search for a suitably licensed implementation of SHA1.

Version 0.3.1 was a bugfix release, on November 27, 2005, adding a single feature (-z), which just happened to get included.

Version 0.4 was the second feature release, on January 13, 2006, adding a number of missing and/or requested features as well as bug fixes. It was the first release to be considered stable and safe enough for everyday use.

Version 0.5 was the third feature release, on April 11, 2011, adding a number of minor features and fixing a number of bugs. It was mostly intended to get the ball rolling again and thus low on features.

Version 0.5.1 was a bugfix release, on January 17, 2012, adding a single bugfix and a new default cluster header for thorough mode.

Version 0.5.2 was an minor release, on January 29, 2012, adding a number of optimizations, prefixing error and warning messages with the program name and modifying the default sampling limit.