tuffy / python-audio-tools

A collection of audio handling programs which work from the command line.
http://audiotools.sourceforge.net
GNU General Public License v2.0

Edit PCM data then save as new file #39

Open ghost opened 9 years ago

ghost commented 9 years ago

I worry this may be a slightly basic problem but I've been trying for days and cannot seem to find a solution in the documentation.

I am trying to use audiotools in Python to open audio files (at the moment just .wav), apply signal processing to the PCM data, then save the result as a new file.

I have managed to read the data and convert it to a FrameList of floating-point values, which is perfect for editing, but I cannot work out how to write the processed FrameList back to a .wav file. to_bytes() converts the data into a raw PCM string, but I am unsure how to get this raw data back into a wav file.

    from audiotools import *
    from argparse import ArgumentParser

    def get_info(audio_file, main_args):
        """
        create a dictionary of information for the audiofile object.

        """
        info = {}
        info["channels"] = audio_file.channels()
        info["channel_mask"] = audio_file.channel_mask()
        info["bits"] = audio_file.bits_per_sample()
        info["sample_rate"] = audio_file.sample_rate()
        info["frames"] = audio_file.total_frames()
        info["length"] = audio_file.seconds_length()
        info["seekable"] = audio_file.seekable()
        info["verified"] = audio_file.verify()
        info["chunks"] = audio_file.has_foreign_wave_chunks()
        info["available"] = audio_file.available(BIN)
        info["header"], info["footer"] = audio_file.wave_header_footer()

        if main_args.verbose:
            print "No. of Channels:\t\t", info["channels"]
            print "Channel mask:\t\t\t", info["channel_mask"]
            print "Bits per sample:\t\t", info["bits"], "BIT"
            print "Sample Rate:\t\t\t", (info["sample_rate"]/1000.0), "k"
            print "Number of Frames:\t\t", info["frames"]
            print "Audio Length:\t\t\t", info["length"], "seconds"
            print "Audio File Seekable?:\t\t", info["seekable"]
            print "File has foreign chunks?:\t", info["chunks"]
            print "Correct Binaries present?:\t", info["available"]
        return info

    def main():
        parser = ArgumentParser()
        parser.add_argument(
            "-v",
            "--verbose",
            help = "Run program verbosely",
            default = False,
            action = "store_true",
        )

        main_args = parser.parse_args()

        #open audio file as an AudioFile object
        #(open here is audiotools.open, brought in by the star import)
        audio_file = open("/Users/samperry/piano2.wav")
        file_info = get_info(audio_file, main_args)

        #Creates a WaveReader object from the AudioFile Object
        pcm_data = audio_file.to_pcm()

        #Creates a FrameList object from WaveReader object. Currently reads all
        #frames in file
        frame_list = pcm_data.read(file_info["frames"])

        #Convert samples to floats (-1.0 - +1.0)
        float_frame_list = frame_list.to_float()

        #eventually do some signal processing here...

        #Convert back to integer FrameList
        output_framelist = float_frame_list.to_int(file_info["bits"])

        #now back to raw bytes
        output_data = output_framelist.to_bytes(False, True)

    if __name__ == "__main__":
        main()

tuffy commented 9 years ago

At present, all the encoders are implemented by "pulling" data from file-like PCMReaders rather than having data "pushed" into them. So the implementation of a signal processor looks a lot like:

    class Processor:
        def __init__(self, pcmreader):
            "pcmreader is some other reader to be processed"

            # assuming the stream's parameters won't be changed
            self.sample_rate = pcmreader.sample_rate
            self.channels = pcmreader.channels
            self.channel_mask = pcmreader.channel_mask
            self.bits_per_sample = pcmreader.bits_per_sample

            # save reader to extract data from
            self.pcmreader = pcmreader

        def read(self, pcm_frames):
            # grab data from reader and convert it to floats between (-1.0 - +1.0)
            frame_list = self.pcmreader.read(pcm_frames).to_float()

            # perform signal processing here...

            # convert samples back to integers for the encoder
            return frame_list.to_int(self.bits_per_sample)

        def close(self):
            # close our contained reader when done
            self.pcmreader.close()
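
As a rough idea of what could go in the "perform signal processing here" step, here is a minimal gain stage in plain Python. apply_gain and its parameters are hypothetical names, not part of audiotools. The sketch assumes the float samples can be pulled out as a plain sequence of values between -1.0 and +1.0; rebuilding a float FrameList from the result before calling to_int() is not shown.

```python
def apply_gain(samples, gain):
    """Scale each float sample by gain, clipping to the -1.0..+1.0 range.

    Hypothetical helper for illustration; not part of audiotools.
    """
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


# halve the volume of a few samples
quiet = apply_gain([0.5, -0.25, 1.0, -1.0], 0.5)

# boost past full scale; out-of-range values are clipped rather than wrapped
loud = apply_gain([0.5, -0.75], 4.0)
```

Clipping before the conversion back to integers keeps boosted samples inside the range the integer conversion expects.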

So encoding is then a matter of calling:

    import audiotools
    old_wave = audiotools.open("some_file.wav")
    # filename is the path of the new file to be written
    new_wave = audiotools.WaveAudio.from_pcm(filename, Processor(old_wave.to_pcm()))

Encoders check those four stream parameters (sample_rate, channels, channel_mask and bits_per_sample, all of which are ints) to determine what the stream is. They then encode the data returned by .read() until an empty FrameList is returned. Finally, .close() is called to clean up any open file objects the stream might have.

Also, encoders aren't picky about how much data is returned by each .read() call; they will break the stream into chunks as needed. All that matters is that the last FrameList is empty.
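
To make the "pull" model concrete, the loop an encoder runs can be pictured roughly as below. This is a plain-Python sketch under stated assumptions, not the actual audiotools encoder code: ListReader and drain are hypothetical names, and ordinary lists stand in for FrameLists.

```python
class ListReader:
    """Stand-in for a PCMReader that hands out samples from a list.

    Plain lists play the role of FrameLists; this is not audiotools code.
    """

    def __init__(self, samples):
        self.samples = list(samples)
        self.position = 0
        self.closed = False

    def read(self, pcm_frames):
        # return up to pcm_frames samples; an empty list signals end of stream
        chunk = self.samples[self.position:self.position + pcm_frames]
        self.position += len(chunk)
        return chunk

    def close(self):
        self.closed = True


def drain(reader, request_size=4096):
    """Pull chunks from reader until an empty one arrives, then close it."""
    collected = []
    chunk = reader.read(request_size)
    while len(chunk) > 0:
        collected.extend(chunk)
        chunk = reader.read(request_size)
    reader.close()
    return collected


reader = ListReader(range(10))
result = drain(reader, request_size=3)
```

The consumer never cares how large each chunk is; it only stops once a read comes back empty, which is why a wrapper like Processor can pass through whatever its inner reader returns.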

I hope this is of some use!