SoundTouch library Copyright Olli Parviainen 2001-2014
SoundTouch is an open-source audio processing library that allows changing the sound tempo, pitch and playback rate parameters independently of each other.
Author email: oparviai 'at' iki.fi
SoundTouch WWW page: http://soundtouch.surina.net
Before compiling, note that you can choose the sample data format, for example to use floating point sample data instead of 16-bit integers. See section "sample data format" for more information.
Project files for Microsoft Visual C++ are supplied with the source code package. Microsoft Visual Studio Express can be downloaded for free from the Microsoft WWW page.
To build the binaries with the Visual C++ compiler, either run the "make-win.bat" script or open the appropriate project files in the source code directories with Visual Studio. The final executable will appear under the "SoundTouch\bin" directory. If you use the Visual Studio IDE instead of the make-win.bat script, the directories bin and lib may need to be created manually in the SoundTouch package root for the final executables. The make-win.bat script creates these directories automatically.
The SoundTouch library compiles on practically any platform that supports the GNU compiler (GCC) tools. SoundTouch requires GCC version 4.3 or later.
To build and install the binaries, run the following commands in /soundtouch directory:
| Command | Description |
|---|---|
| ./bootstrap | Creates "configure" file with local autoconf/automake toolset. |
| ./configure | Configures the SoundTouch package for the local environment. Notice that "configure" file is not available before running the "./bootstrap" command as above. |
| make | Builds the SoundTouch library & SoundStretch utility. |
| make install | Installs the SoundTouch & BPM libraries to /usr/local/lib and SoundStretch utility to /usr/local/bin. Please notice that 'root' privileges may be required to install the binaries to the destination locations. |
Bash shell, GNU C++ compiler, libtool, autoconf and automake tools are required for compiling the SoundTouch library. These are usually included with the GNU/Linux distribution, but if not, install these packages first. For example, Ubuntu Linux can acquire and install these with the following command:
**sudo apt-get install automake autoconf libtool build-essential**
At release time, the SoundTouch package has been tested to compile on the GNU/Linux platform. However, if you have problems getting the SoundTouch library compiled, try disabling the optimizations that are specific to x86 processors by running the ./configure script with the switch
--enable-x86-optimizations=no
Alternatively, if you don't use GNU Configure system, edit file "include/STTypes.h" directly and remove the following definition:
#define SOUNDTOUCH_ALLOW_X86_OPTIMIZATIONS 1
The GNU compilation does not automatically create a shared-library version of SoundTouch (.so or .dll). If such is desired, then you can create it as follows after running the usual compilation:
g++ -shared -static -DDLL_EXPORTS -I../../include -o SoundTouch.dll \
    SoundTouchDLL.cpp ../SoundTouch/.libs/libSoundTouch.a
sstrip SoundTouch.dll
Android compilation instructions are within the source code package, see file "source/Android-lib/README-SoundTouch-Android.html" in the package.
The sample data format can be either 16-bit signed integer or 32-bit floating point; the default is 32-bit floating point.
In Windows environment, the sample data format is chosen in file "STTypes.h" by choosing one of the following defines:
In GNU environment, the floating sample format is used by default, but integer sample format can be chosen by giving the following switch to the configure script:
./configure --enable-integer-samples
The sample data can have either a single (mono) or a double (stereo) audio channel. Stereo data is interleaved so that the data values alternate between the channels: the first value is for the left channel, the second for the right channel, and so on. Notice that while it would be possible in theory to process stereo sound as two separate mono channels, this isn't recommended, because processing the channels separately would lose the phase coherency between the channels, which would consequently ruin the stereo effect.
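The interleaved stereo layout described above can be sketched as follows. This is a minimal illustration only; the function name `interleaveStereo` is hypothetical and is not part of the SoundTouch API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of SoundTouch): builds an interleaved
// stereo buffer laid out as L0, R0, L1, R1, ...
std::vector<float> interleaveStereo(const std::vector<float>& left,
                                    const std::vector<float>& right)
{
    // Both channels must hold the same number of samples.
    assert(left.size() == right.size());

    std::vector<float> out;
    out.reserve(left.size() * 2);
    for (std::size_t i = 0; i < left.size(); ++i)
    {
        out.push_back(left[i]);   // even index: left channel
        out.push_back(right[i]);  // odd index: right channel
    }
    return out;
}
```

With this layout, sample frame i has its left value at index 2*i and its right value at index 2*i + 1, so both channels of a frame always travel through the processing together.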
Sample rates between 8000 and 48000 Hz are supported.
The processing and latency constraints of the SoundTouch library are:
SoundTouch provides three seemingly independent effects: tempo, pitch and playback rate control. These three controls are implemented as a combination of two primary effects, sample rate transposing and time-stretching.
Sample rate transposing affects both the audio stream duration and pitch. It's implemented simply by converting the original audio sample stream to the desired duration by interpolating from the original audio samples. In SoundTouch, linear interpolation with anti-alias filtering is used. Theoretically, a higher-order interpolation provides a better result than 1st order linear interpolation, but in audio applications linear interpolation together with anti-alias filtering performs subjectively about as well as higher-order filtering would.
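As a rough illustration of the interpolation step only, the following sketch resamples a mono stream with linear interpolation. It is not the SoundTouch implementation: the function name `transposeLinear` is hypothetical, and the anti-alias filtering mentioned above is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch (not SoundTouch code): resamples mono input by 'rate'
// using 1st order linear interpolation. A rate above 1.0 shortens the
// stream, a rate below 1.0 lengthens it. No anti-alias filter is applied.
std::vector<float> transposeLinear(const std::vector<float>& in, double rate)
{
    assert(rate > 0.0);
    if (in.size() < 2) return in;

    std::vector<float> out;
    double pos = 0.0;  // fractional read position in the input
    while (pos <= static_cast<double>(in.size() - 1))
    {
        const std::size_t i = static_cast<std::size_t>(pos);
        const double frac = pos - static_cast<double>(i);
        // At the very last sample there is no right-hand neighbor.
        const float next = (i + 1 < in.size()) ? in[i + 1] : in[i];
        out.push_back(static_cast<float>((1.0 - frac) * in[i] + frac * next));
        pos += rate;
    }
    return out;
}
```

Played back at the original sample rate, the shortened stream sounds both faster and higher, which is why this effect changes duration and pitch together.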
Time-stretching means changing the audio stream duration without affecting its pitch. SoundTouch uses WSOLA-like time-stretching routines that operate in the time domain. Compared to sample rate transposing, time-stretching is a much heavier operation and also requires a longer processing "window" of sound samples, which increases the algorithm input/output latency. Typical i/o latency for the SoundTouch time-stretch algorithm is around 100 ms.
Sample rate transposing and time-stretching are then used together to produce the tempo, pitch and rate controls:
The time-stretch algorithm has a few parameters that can be tuned to optimize sound quality for a particular application. The current default parameters have been chosen by iterative if-then analysis (read: "trial and error") to obtain the best subjective sound quality in pop/rock music processing, but in applications processing a different kind of sound the default parameter set may give a sub-optimal result.
The time-stretch algorithm default parameter values are set by the following #defines in file "TDStretch.h":
#define DEFAULT_SEQUENCE_MS AUTOMATIC
#define DEFAULT_SEEKWINDOW_MS AUTOMATIC
#define DEFAULT_OVERLAP_MS 8
These parameters affect the time-stretch algorithm as follows:
DEFAULT_SEQUENCE_MS: This is the default length of a single processing sequence in milliseconds, which determines how the original sound is chopped in the time-stretch algorithm. Larger values mean fewer sequences are used in processing. In principle, a larger value sounds better when slowing down the tempo but worse when increasing it, and vice versa.
By default, this setting value is calculated automatically according to tempo value.
DEFAULT_SEEKWINDOW_MS: The default length, in milliseconds, of the seeking window used by the algorithm that seeks the best possible overlapping location. This determines how wide a sample "window" the algorithm can search to find an optimal mixing location when the sound sequences are linked back together.
The bigger this window setting is, the higher the possibility to find a better mixing position becomes, but at the same time large values may cause a "drifting" sound artifact because neighboring sequences can be chosen at more uneven intervals. If there's a disturbing artifact that sounds as if a constant frequency was drifting around, try reducing this setting.
By default, this setting value is calculated automatically according to tempo value.
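The idea of searching a window for the best mixing position can be sketched as below. This is only an illustration under stated assumptions: normalized cross-correlation is assumed as the similarity measure, and the function name `findBestOffset` and its parameters are hypothetical rather than taken from the SoundTouch sources.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch (not SoundTouch code): scans the first 'seekLength'
// offsets of 'input' and returns the one where 'ref' matches best.
// Normalized cross-correlation is an assumed similarity measure.
std::size_t findBestOffset(const std::vector<float>& input,
                           const std::vector<float>& ref,
                           std::size_t seekLength)
{
    assert(!ref.empty() && input.size() >= ref.size());
    // Largest offset that still keeps the comparison inside the input.
    const std::size_t maxOffset = input.size() - ref.size();

    std::size_t best = 0;
    double bestScore = -1e300;
    for (std::size_t off = 0; off < seekLength && off <= maxOffset; ++off)
    {
        double corr = 0.0;
        double energy = 0.0;
        for (std::size_t i = 0; i < ref.size(); ++i)
        {
            const double x = input[off + i];
            corr += x * ref[i];
            energy += x * x;
        }
        // Normalize by the input energy so that loud passages are not
        // favored; the lower bound guards against division by zero.
        const double score = corr / std::sqrt(energy > 1e-12 ? energy : 1e-12);
        if (score > bestScore)
        {
            bestScore = score;
            best = off;
        }
    }
    return best;
}
```

A wider seek window means more candidate offsets to score, which matches the trade-off described above: a better chance of a good match, at the cost of more computation and less evenly spaced sequences.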
DEFAULT_OVERLAP_MS: Overlap length in milliseconds. When the sound sequences are mixed back together to form a continuous sound stream again, this parameter defines how much the ends of consecutive sequences overlap with each other.
This parameter shouldn't be very critical. If you reduce the DEFAULT_SEQUENCE_MS setting by a large amount, you might wish to try a smaller value for this one.
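To make the overlap setting more concrete, the sketch below converts a length in milliseconds to a sample count and mixes two overlapping sequence ends with a linear cross-fade. The cross-fade shape is an assumption made for illustration, and both function names are hypothetical, not SoundTouch API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: converts a duration in milliseconds to a number of
// samples at the given sample rate (fractional samples are truncated).
std::size_t msToSamples(double ms, unsigned sampleRate)
{
    assert(ms >= 0.0);
    return static_cast<std::size_t>(ms * sampleRate / 1000.0);
}

// Hypothetical sketch: mixes the tail of the previous sequence into the
// head of the next one. A linear fade is assumed here for simplicity.
std::vector<float> crossFade(const std::vector<float>& prevTail,
                             const std::vector<float>& nextHead)
{
    assert(prevTail.size() == nextHead.size());
    const std::size_t n = prevTail.size();

    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        // Weight moves from 0 (all previous) towards 1 (all next).
        const float w = static_cast<float>(i) / static_cast<float>(n);
        out[i] = (1.0f - w) * prevTail[i] + w * nextHead[i];
    }
    return out;
}
```

For example, the default overlap of 8 ms corresponds to `msToSamples(8.0, 44100)` = 352 samples at a 44100 Hz sample rate.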
Notice that these parameters can also be set during execution time with functions "TDStretch::setParameters()" and "SoundTouch::setSetting()".
The table below summarizes how the parameters can be adjusted for different applications:
| Parameter name | Default value magnitude | Larger value affects... | Smaller value affects... | Effect on CPU burden |
|---|---|---|---|---|
| SEQUENCE_MS | Default value is relatively large, chosen for slowing down music tempo | Larger value is usually better for slowing down tempo. Growing the value decelerates the "echoing" artifact when slowing down the tempo. | Smaller value might be better for speeding up tempo. Reducing the value accelerates the "echoing" artifact when slowing down the tempo | Increasing the parameter value reduces computation burden |
| SEEKWINDOW_MS | Default value is relatively large, chosen for slowing down music tempo | Larger value eases finding a good mixing position, but may cause a "drifting" artifact | Smaller value reduces the possibility of finding a good mixing position, but reduces the "drifting" artifact. | Increasing the parameter value increases computation burden |
| OVERLAP_MS | Default value is relatively large, chosen to suit the above parameters. | | If you reduce the "sequence ms" setting, you might wish to try a smaller value. | Increasing the parameter value increases computation burden |
General optimizations:
The time-stretch routine has a 'quick' mode that substantially speeds up the algorithm but may degrade the sound quality by a small amount. This mode is activated by calling SoundTouch::setSetting() function with parameter id of SETTING_USE_QUICKSEEK and value "1", i.e.
setSetting(SETTING_USE_QUICKSEEK, 1);
CPU-specific optimizations:
Intel MMX optimized routines are used with compatible CPUs when 16bit integer sample type is used. MMX optimizations are available both in Win32 and Gnu/x86 platforms. Compatible processors are Intel PentiumMMX and later; AMD K6-2, Athlon and later.
Intel SSE optimized routines are used with compatible CPUs when floating point sample type is used. SSE optimizations are currently implemented for Win32 platform only. Processors compatible with SSE extension are Intel processors starting from Pentium-III, and AMD processors starting from Athlon XP.
AMD 3DNow! optimized routines are used with compatible CPUs when floating point sample type is used, but SSE extension isn't supported. 3DNow! optimizations are currently implemented for Win32 platform only. These optimizations are used in AMD K6-2 and Athlon (classic) CPUs; better performing SSE routines are used with AMD processors starting from Athlon XP.
SoundStretch audio processing utility Copyright (c) Olli Parviainen 2002-2012
SoundStretch is a simple command-line application that can change tempo, pitch and playback rates of WAV sound files. This program is intended primarily to demonstrate how the "SoundTouch" library can be used to process sound in your own program, but it can as well be used for processing sound files.
SoundStretch Usage syntax:
soundstretch infilename outfilename [switches]
Where:
| Parameter | Description |
|---|---|
| "infilename" | Name of the input sound data file (in .WAV audio file format). Give "stdin" as filename to use standard input pipe. |
| "outfilename" | Name of the output sound file where the resulting sound is saved (in .WAV audio file format). This parameter may be omitted if you don't want to save the output (e.g. when only calculating BPM rate with '-bpm' switch). Give "stdout" as filename to use standard output pipe. |
| [switches] | Are one or more control switches. |
Available control switches are:
| Switch | Description |
|---|---|
| -tempo=n | Change the sound tempo by n percent (n = -95.0 .. +5000.0 %) |
| -pitch=n | Change the sound pitch by n semitones (n = -60.0 .. +60.0 semitones) |
| -rate=n | Change the sound playback rate by n percent (n = -95.0 .. +5000.0 %) |
| -bpm=n | Detect the Beats-Per-Minute (BPM) rate of the sound and adjust the tempo to meet 'n' BPMs. When this switch is applied, the "-tempo" switch is ignored. If "=n" is omitted, i.e. switch "-bpm" is used alone, then the BPM rate is estimated and displayed, but the tempo is not adjusted according to the BPM value. |
| -quick | Use quicker tempo change algorithm. Gains speed but loses sound quality. |
| -naa | Don't use anti-alias filtering in sample rate transposing. Gains speed but loses sound quality. |
| -license | Displays the program license text (LGPL) |
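The switch values are relative changes, so it can help to see them as plain multipliers. The sketch below shows the usual conversions, assuming a percentage change maps to a ratio of 1 + n/100 and a semitone change follows equal temperament (12 semitones per octave). The function names are hypothetical and are not part of SoundStretch or SoundTouch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: converts a percentage change, such as the value
// given to -tempo or -rate, to a multiplicative ratio.
// Example: +12.5 % gives 1.125, -25 % gives 0.75.
double percentToRatio(double percent)
{
    assert(percent > -100.0);  // -100 % or less would mean a non-positive ratio
    return 1.0 + percent / 100.0;
}

// Hypothetical helper: converts a pitch change in semitones to a frequency
// ratio, assuming equal temperament (one octave = 12 semitones = ratio 2).
double semitonesToRatio(double semitones)
{
    return std::pow(2.0, semitones / 12.0);
}
```

Under these assumptions, a pitch change of +12 semitones doubles the frequency and -12 semitones halves it.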
Notes:
Example 1
The following command increases tempo of the sound file "originalfile.wav" by 12.5% and stores result to file "destinationfile.wav":
soundstretch originalfile.wav destinationfile.wav -tempo=12.5
Example 2
The following command decreases the sound pitch (key) of the sound file "orig.wav" by two semitones and stores the result to file "dest.wav":
soundstretch orig.wav dest.wav -pitch=-2
Example 3
The following command processes the file "orig.wav" by decreasing the sound tempo by 25.3% and increasing the sound pitch (key) by 1.5 semitones. Resulting .wav audio data is directed to standard output pipe:
soundstretch orig.wav stdout -tempo=-25.3 -pitch=1.5
Example 4
The following command detects the BPM rate of the file "orig.wav" and adjusts the tempo to match 100 beats per minute. Result is stored to file "dest.wav":
soundstretch orig.wav dest.wav -bpm=100
Example 5
The following command reads .wav sound data from standard input pipe and estimates the BPM rate:
soundstretch stdin -bpm
1.8.0:
1.7.1:
1.7.0:
1.6.0:
Added automatic cutoff threshold adaptation to beat detection routine to better adapt BPM calculation to different types of music
Retired 3DNow! optimization support, as 3DNow! is now obsolete and the assembler code is a nuisance to maintain
Retired the "configure" file from the source code package due to autoconf/automake version conflicts; from now on it is generated by invoking the "bootstrap" script, which uses the locally available toolchain version to generate the "configure" file
Resolved namespace/label naming conflicts with other libraries by replacing global labels such as INTEGER_SAMPLES with more specific SOUNDTOUCH_INTEGER_SAMPLES etc.
Updated windows build scripts & project files for Visual Studio 2008 support
Updated SoundTouch.dll API for .NET compatibility
Added API for querying nominal processing input & output sample batch sizes
1.5.0:
1.4.1:
1.4.0:
1.3.1:
1.3.0:
1.2.1:
1.2.0:
1.1.1:
1.0.1:
1.0:
1.7.0:
1.5.0:
1.4.0:
1.3.0:
1.2.1:
1.2.0:
1.1.1:
1.1:
1.01:
Initial release
Kudos for these people who have contributed to development or submitted bugfixes since SoundTouch v1.3.1:
Moral greetings to all other contributors and users also!
SoundTouch audio processing library Copyright (c) Olli Parviainen
This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License version 2.1 as published by the Free Software Foundation.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
README.html file updated on 7-Jan-2014