torch / torch7

http://torch.ch
Other
8.98k stars 2.38k forks source link

bugs in multinomial? #165

Open andresy opened 9 years ago

andresy commented 9 years ago

Hi @nicholas-leonard

Looking closely at multinomial, I see some strange behaviors...

> p = torch.rand(10)
> torch.multinomial(p, 3) -- ok
> torch.multinomial(p, 11) -- oops, error
> print(p:size())
  1
 10
[torch.LongStorage of size 2]

something wrong happened!

require 'torch'
require 'gnuplot'

torch.manualSeed(5555)

local C = 80
local N = 100000

-- some unbalanced data
local p = torch.Tensor(C)
for i=1,C do
   p[i] = i
end
local x = torch.multinomial(p, N, true)
gnuplot.axis{'','',0,''}
gnuplot.hist(x)

image

nicholas-leonard commented 9 years ago

@andresy I think there is a mistake concerning the default value of the replacement flag. Try this:

torch.multinomial(p, 11, true)
 1  8  3  2  1  3  4  3  3  4  3
[torch.LongTensor of dimension 1x11]

So it's actually trying to sample without replacement by default.

andresy commented 9 years ago

yes, i know, but what i mean if is that if an error occurs (like the one which occurs when forgetting the flag=true), then p is messed up. that should not happen.

[the second bug harder]

nicholas-leonard commented 9 years ago

This argcheck should have triggered : https://github.com/torch/torch7/blob/ef6a263b033831e0cd3435e6c412a809d0327c12/lib/TH/generic/THTensorRandom.c#L81

andresy commented 9 years ago

(in my example p should be still of size 10, not size 1x10)

andresy commented 9 years ago

yes, argcheck triggers, error occurs, but p is messed up. boo.

nicholas-leonard commented 9 years ago

Ah I see.

nicholas-leonard commented 9 years ago

I have to update TREPL to see it trigger...

andresy commented 9 years ago

:smile:

ok, for the second bug, i am not sure it is a bug in your code, or some subtlety in the random number generator.

nicholas-leonard commented 9 years ago

I am looking into that second part. Looks bad...

nicholas-leonard commented 9 years ago

@andresy Could it be a problem with gnuplot.hist?

require 'torch'
require 'gnuplot'
require 'torchx'
_ = require 'moses'

torch.manualSeed(5555)

local C = 80
local N = 1000000

-- some unbalanced data
local p = torch.range(1,C):double()
local x = torch.multinomial(p, N, true)

g = torch.group(x)
gc = _.map(g, function(k,v) return v.idx:size(1) end)
gnuplot.plot('Beuhp',p:type('torch.IntTensor'),torch.FloatTensor(gc),'|')

beuhp