Support for require 'audio'

willfrey commented 8 years ago

Defining custom get and processor functions using the torch audio library doesn't seem to work. This could be because I am unable to successfully do local audio = require 'audio' and instead can only do require 'audio'.

This would be very useful for loading audio file data using your tool!

zakattacktwitter commented 8 years ago

Hi,

Can you elaborate on what you would like to see from a custom 'get' function?

Also, if you turn on verbose mode when creating the sampledBatcher it should output better errors when the get or processor functions fail (they are wrapped in pcalls).

Here's an example of turning on verbose:

    local dataset = Dataset('http://d3jod65ytittfm.cloudfront.net/dataset/mnist/train.t7')
    local getBatch = dataset.sampledBatcher({
        batchSize = 1,
        inputDims = { 1 },
        verbose = true,
        processor = function(res, processorOpt, input)
            local audio = require 'audio' -- if this fails you should see an error
            error('halt!') -- this will definitely fail..
        end,
    })
    getBatch() -- you should see an error print...
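To illustrate why verbose mode matters when callbacks are wrapped in pcall, here is a minimal sketch in plain Lua. The helper `runGuarded` is hypothetical and is not torch-dataset's actual implementation; it only shows how an error raised inside a callback is caught and stays invisible unless something prints it.

```lua
-- Hypothetical helper (not part of torch-dataset): run a callback under
-- pcall and print the error only when verbose is set.
local function runGuarded(fn, verbose, ...)
   local ok, result = pcall(fn, ...)
   if not ok then
      if verbose then
         print('callback failed: ' .. tostring(result))
      end
      return false, result
   end
   return true, result
end

-- A processor that always fails, like the error('halt!') example.
local function badProcessor()
   error('halt!')
end

-- With verbose = true the error message is printed; without it the
-- failure would only be visible through the returned values.
local ok, err = runGuarded(badProcessor, true)

-- A callback that succeeds passes its result through unchanged.
local ok2, value = runGuarded(function(x) return x + 1 end, false, 41)
```

Without the verbose flag, the only trace of the failure is the `false` return value, which is why a failing `require` inside a processor can look like a silent no-op.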

soumith commented 8 years ago

@willfrey I just patched the audio package so that require returns the module. Reinstall it with: luarocks install audio

after that you can do in your code: local audio = require 'audio'

willfrey commented 8 years ago

My custom get function reads in 16-bit wav files to a ShortTensor and returns the signal for my processor to process.

Here's an example of what I'm trying to do:

    processor = function(res, processorOpt, input)
        require 'audio'
        input:zero()
        -- get signal and normalize
        local signal = torch.DoubleTensor(#res, 1)
        signal:storage():copy(res)
        local mean, std
        mean = torch.mean(signal)
        std = torch.std(signal)
        signal:add(-mean):div(std)
        -- get spectrogram
        local spect, numFrames
        numFrames = math.floor(1 + (#res - 320) / 160)
        spect = torch.DoubleTensor(numFrames, 161):zero()
        spect:copy(audio.spectrogram(signal, 320, 'hann', 160))
        input[1]:narrow(1, 1, numFrames):copy(spect)
        return true
    end,

If I run this with a batch size of 1, I get this error:

    *** Error in `/mnt/torch/install/bin/luajit': double free or corruption (fasttop): 0x00007f571003fd00 ***
    Aborted (core dumped)

If I run this with a batch size of 2 or more, I get this error:

Segmentation fault (core dumped)

willfrey commented 8 years ago

Thanks, Soumith!

Trying it out now.

zakattacktwitter commented 8 years ago

Cool,

The require of audio should work with or without the local; the processor function runs in a clean Lua environment.

OK, a couple things.

Can you include the code where you create the sampledBatcher? I'd like to see the args passed in.

From looking at your processor, your mini-batch items will all have tensors of different lengths. That's supported, but you have to handle it specially. You have two choices. First, keep batchSize == 1, which allows you to resize the input tensor in the processor function. Second, pad all your mini-batch items to a fixed size and fill the unused portions with zeros.

One last thing, not sure why you are doing input[1]:narrow, I'd need to see the args when creating the batcher to understand more.
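The padding option described above can be sketched in plain Lua. This is an illustration only: it uses nested tables instead of torch tensors so it runs without Torch, and `padFrames` is a hypothetical helper, not a torch-dataset function. With tensors, the same idea is to zero the fixed-size input and copy the item into a narrowed slice of it.

```lua
-- Hypothetical helper: copy a variable-length list of frames into a
-- fixed-size buffer, zero-filling the rows the item does not use.
local function padFrames(frames, maxFrames, frameWidth)
   assert(#frames <= maxFrames, 'item is longer than the fixed size')
   local padded = {}
   for i = 1, maxFrames do
      local row = {}
      for j = 1, frameWidth do
         -- Rows past the end of the item are filled with zeros.
         row[j] = frames[i] and frames[i][j] or 0
      end
      padded[i] = row
   end
   return padded
end

local item = { {1, 2, 3}, {4, 5, 6} }   -- 2 frames of width 3
local padded = padFrames(item, 4, 3)    -- padded out to 4 frames
```

Every item then has the same shape, so items of different lengths can sit in one mini-batch; the cost is that downstream code has to know which rows are padding.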

willfrey commented 8 years ago

Here's what I'm doing:

    function getWav(url, offset, length)
        local f = torch.DiskFile(url, 'r'):binary()
        if f ~= nil then
            local contents
            f:seek(offset + 1)
            contents = f:readShort(length)
            f:close()
            return contents
        end
    end

    getBatch, numBatches = dataset.sampledBatcher({
        samplerKind = 'linear',
        batchSize = 2,
        inputDims = { 1, 3000, 161 },
        verbose = true,
        inputTensorType = torch.DoubleTensor,
        get = getWav,
        processor = function(res, processorOpt, input)
            local audio = require 'audio'
            input:zero()
            -- get signal and normalize
            local signal = torch.DoubleTensor(#res, 1)
            signal:storage():copy(res)
            local mean, std
            mean = torch.mean(signal)
            std = torch.std(signal)
            signal:add(-mean):div(std)
            -- get spectrogram
            local spect, numFrames
            numFrames = math.floor(1 + (#res - 320) / 160)
            spect = torch.DoubleTensor(numFrames, 161):zero()
            spect:copy(audio.spectrogram(signal, 320, 'hann', 160))
            input[1][{1, numFrames}]
            return true
        end,
    })

zakattacktwitter commented 8 years ago

Can you try setting

    batchSize = 1,
    inputDims = { 1 },

And then at the end of your processor do:

    input:resize(spect:size()):copy(spect)
    return true

That should at least get you running. If it still crashes then I would add a bunch of print statements sprinkled throughout your processor to see which line is causing trouble.

willfrey commented 8 years ago

Yup, that works.

Thank you!

Do you think there's any hope for this being able to return a batch of multiple padded samples?

zakattacktwitter commented 8 years ago

Definitely, now that we have that working we can re-enable the rest. As a next step I would set the args to:

    batchSize = 2,
    inputDims = { 3000, 161 },

And then at the end of your processor function do:

    assert(spect:size(1) <= 3000)
    assert(spect:size(2) == 161)
    input:narrow(1, 1, spect:size(1)):copy(spect)
    return true
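As a rough check on the 3000-frame bound, the frame-count formula from the processor earlier in the thread can be evaluated on its own. This is a sketch in plain Lua under the assumption that the formula (window of 320 samples, stride of 160) matches what audio.spectrogram actually returns; the function names are hypothetical.

```lua
-- Frame-count formula as written in the processor:
-- math.floor(1 + (#res - 320) / 160)
local function numFrames(numSamples, window, stride)
   return math.floor(1 + (numSamples - window) / stride)
end

-- Largest sample count whose frame count does not exceed maxFrames.
local function maxSamples(maxFrames, window, stride)
   return window + stride * maxFrames - 1
end

local limit = maxSamples(3000, 320, 160)          -- 480319 samples
local atLimit = numFrames(limit, 320, 160)        -- 3000 frames
local overLimit = numFrames(limit + 1, 320, 160)  -- 3001 frames
```

Under that assumption, any clip longer than 480319 samples would trip the first assert, so it is worth checking the longest file in the dataset against this limit.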

willfrey commented 8 years ago

With that I get Segmentation fault (core dumped)

soumith commented 8 years ago

i really hope this is not going to be a bug in the audio package. if it is, i apologize in advance :)

zakattacktwitter commented 8 years ago

I doubt it is, just some tensor copy stuff needs debugging. I'll construct a test in the AM.

On Thursday, February 4, 2016, Soumith Chintala notifications@github.com wrote:

i really hope this is not going to be a bug in the audio package. if it is, i apologize in advance :)

— Reply to this email directly or view it on GitHub https://github.com/twitter/torch-dataset/issues/12#issuecomment-180090509 .

zakattacktwitter commented 8 years ago

Here's a little example I made of variable sized items in a mini-batch. Hope it helps!

https://gist.github.com/zakattacktwitter/2da0ae129f3c1c10bff5

willfrey commented 8 years ago

Your example works for me. I also got the following code to work for a batch size of 1, but it gives me Segmentation fault (core dumped) for batch size >= 2.

    function getWav(url, offset, length)
        local f = torch.DiskFile(url, 'r'):binary()
        if f ~= nil then
            local contents
            f:seek(offset + 1)
            contents = f:readShort(length)
            f:close()
            return contents
        end
    end

    getBatch, numBatches = dataset.sampledBatcher({
        samplerKind = 'linear',
        batchSize = 1,
        inputDims = { 1, 3000, 161 },
        verbose = true,
        inputTensorType = torch.DoubleTensor,
        get = getWav,
        processor = function(res, processorOpt, input)
            local audio = require 'audio'
            -- get signal
            local signal = torch.DoubleTensor(#res, 1)
            signal:storage():copy(res)
            -- normalize signal
            local mean = torch.mean(signal)
            local std = torch.std(signal)
            signal:add(-mean):div(std)
            -- get spectrogram
            local spect = audio.spectrogram(signal, 320, 'hann', 160):t()
            -- copy spectrogram
            input:fill(0)
            input[1]:narrow(1, 1, spect:size(1)):copy(spect)
            return true
        end,
    })

zakattacktwitter commented 8 years ago

Sorry you're having so much trouble. I'd keep commenting out code until your version works, then keep reducing until you can narrow down the error.


willfrey commented 8 years ago

I appreciate the help! I'm doing that now. I'll let you know what I narrow it down to.

willfrey commented 8 years ago

I can get it to work if I have a randomly generated signal and compute a spectrogram from that. That means the issue lies in how I'm copying over the storage from res to my signal variable in the processor.

willfrey commented 8 years ago

Yup. That was the problem. If I change my getWav function to be

    function getWav(url, offset, length)
        local audio = require 'audio'
        local contents = audio.load(url)
        return contents
    end

it works just fine. It's slower but it gets what I need and has the added bonus of being able to read any audio type that Sox can read – not just wav!

Edit: It works sometimes. I still get a segmentation fault error most of the time.

zakattacktwitter commented 8 years ago

Great, glad you got it sorted!


willfrey commented 8 years ago

Unfortunately it worked twice and then no more.

I'll dig more into what the problem is.

zakattacktwitter commented 8 years ago

Hi, did you figure out the problem?

willfrey commented 8 years ago

Yup. You can close the issue ticket!

Thanks for your help!


twitter-archive / torch-dataset

Support for require 'audio' #12