hughperkins / pytorch

Python wrappers for torch and lua
BSD 2-Clause "Simplified" License

Numpy slicing affects torch Tensor? #20

Open Tushar-N opened 8 years ago

Tushar-N commented 8 years ago

EDIT: Sorry, I think this is already addressed in this issue. I'm not sure whether it's due to the same underlying reason or not. Apologies if this issue is a duplicate (I can't find a delete button!)

Hi,

I noticed a small issue when I slice np arrays and send them over to torch using pytorch - some of the values get set to inf on the torch side. Here's a small piece of code that replicates what I observed:

-- echo.lua : a simple class with a method to just echo back a tensor
require 'torch'
require 'nn'

local Echo = torch.class('Echo')

function Echo:__init()
    -- Dummy init
end

function Echo:echo(nparray)
    return nparray
end
# test.py: script to generate a bunch of random tensors,
# slice them, and check whether they're echoed back unchanged
import PyTorchHelpers
import numpy as np

Echo = PyTorchHelpers.load_lua_class('echo.lua', 'Echo')
net = Echo()

for i in range(1000):
    arr = np.random.rand(10,100)
    arr_slice = arr[:,1:] # arbitrary slice
    echo = net.echo(arr_slice)  
    print np.sum(arr_slice), np.sum(echo.asNumpyTensor())

My output looks like:

...
517.576931197 0.0
483.236528627 0.0
487.247049613 0.0
487.437043052 -4.98271150804e+291
503.993869064 0.0
497.493831614 0.0
...

Note that if I either (1) don't slice the array, or (2) slice but also multiply by 1.0 (arr_slice = 1.0*arr[:,1:]), the issue disappears. Any idea why? (I'm using Python 2.7.)
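For context on why the 1.0 trick might help: arr[:,1:] is a strided view into the original array rather than a contiguous copy, while multiplying by 1.0 forces NumPy to allocate a fresh contiguous buffer. If the wrapper assumes C-contiguous input when it builds the torch Tensor, that would explain the corrupted values. A minimal sketch of checking and working around this (np.ascontiguousarray is plain NumPy, not part of this repo's API; the Echo class is the one from above):

import numpy as np

arr = np.random.rand(10, 100)
arr_slice = arr[:, 1:]                      # a strided view, not a copy
print arr_slice.flags['C_CONTIGUOUS']       # False

# Hypothetical workaround: hand the Lua side an explicit contiguous copy
# instead of relying on the 1.0 * arr[:, 1:] trick.
arr_contig = np.ascontiguousarray(arr_slice)
print arr_contig.flags['C_CONTIGUOUS']      # True
# echo = net.echo(arr_contig)               # assuming the Echo class above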

PS: I've been juggling between fbtorch, several forks of lunatic-python, and putting up HTTP servers in Lua and querying them from Python. It's been a nightmare so far. Thank you so much for putting up this repo!

aosokin commented 7 years ago

Hi, I've recently observed something weird that might be related. I'm defining a network, and in forward() I have a slicing operation that takes the first half of my channels:

first_half = input[:, :self.n_input_ch // 2, :, :]

This operation completely breaks training: after two or three training iterations the loss goes to nan.

However, if I do first_half = input[:, :self.n_input_ch // 2, :, :] * 1.0 instead, then everything works fine.

Is this a bug, or is slicing not supposed to be used this way?
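If this is the same non-contiguity issue (a guess on my part), the * 1.0 probably works because it materialises a fresh contiguous copy of the slice, whereas the plain slice is only a view into the original storage. A standalone sketch of the difference, assuming the value crossing into the wrapper is a NumPy array (the shapes and n_input_ch value are made up for illustration):

import numpy as np

n_input_ch = 8                                   # hypothetical channel count
x = np.random.rand(4, n_input_ch, 16, 16)        # stand-in for the real input

view = x[:, :n_input_ch // 2, :, :]              # plain slice: a non-contiguous view
copy = x[:, :n_input_ch // 2, :, :] * 1.0        # the workaround: a fresh contiguous array

print view.flags['C_CONTIGUOUS'], np.shares_memory(view, x)    # False True
print copy.flags['C_CONTIGUOUS'], np.shares_memory(copy, x)    # True False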