fxia22 / stn.pytorch

pytorch version of spatial transformer networks

Bug with 'BCHW' version & Question about high CPU utilization when in GPU mode #15

Open huanghoujing opened 7 years ago

huanghoujing commented 7 years ago

Hi, fxia22,

  1. I used the 'BCHW' version of STN to apply an affine transformation (rotating an image), but the output image does not have the same size as the input. The input to the STN has shape (1, 3, 328, 582), but the returned tensor has shape (1, 3, 582, 2). Here is my minimal code:
import sys
import os.path as osp
sys.path.insert(0, osp.expanduser('~/Project/stn.pytorch/script'))

import torch
import numpy as np
from torch.autograd import Variable
from modules.stn import STN
from modules.gridgen import AffineGridGenV2
import matplotlib.pyplot as plt

img = plt.imread(osp.expanduser('~/Project/stn.pytorch/script/cat.jpg'))

# plt.imshow(img)
# plt.show()

img = img / 255.
# shape [3, H, W]
img_batch = img.transpose(2, 0, 1)
# shape [1, 3, H, W]
img_batch = np.expand_dims(img_batch, 0)
inputImages = Variable(torch.from_numpy(img_batch.astype(np.float32)))

print 'inputImages.size:', inputImages.size()

stn = STN(layout='BCHW')
grid_generator = AffineGridGenV2(328, 582)
trans_mat = Variable(torch.from_numpy(
  np.array([[[np.cos(45./180*np.pi), np.sin(45./180*np.pi), 0],
             [np.sin(-45./180*np.pi), np.cos(45./180*np.pi), 0]]],
           dtype=np.float32)),
  requires_grad = True)
grid = grid_generator(trans_mat)
res = stn(inputImages, grid)
res = res.data.cpu().numpy()

print 'res.shape:', res.shape

plt.imshow((res[0].transpose(1, 2, 0)*255).astype(np.uint8))
plt.show()
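
One possible reading of the reported (1, 3, 582, 2) shape, not confirmed against the repository's code: if the grid generator returns a channel-last grid of shape (B, H, W, 2) while the 'BCHW' sampler reads the output height and width from the last two grid dimensions, the sampler would infer (582, 2) instead of (328, 582). The NumPy sketch below only reproduces that shape arithmetic. `affine_grid_bhwc` and `output_hw` are hypothetical helpers written for this illustration, not functions from stn.pytorch, and the (x, y, 1) coordinate order is an arbitrary choice that may differ from `AffineGridGenV2`.

```python
import numpy as np


def affine_grid_bhwc(theta, height, width):
    """Hypothetical helper: build a normalized sampling grid of shape
    (B, H, W, 2) from a batch of 2x3 affine matrices of shape (B, 2, 3)."""
    ys = np.linspace(-1.0, 1.0, height, dtype=np.float32)
    xs = np.linspace(-1.0, 1.0, width, dtype=np.float32)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")            # each (H, W)
    base = np.stack([xx, yy, np.ones_like(xx)], axis=-1)   # (H, W, 3)
    # Apply every 2x3 matrix to every homogeneous coordinate -> (B, H, W, 2).
    return np.einsum("hwk,bjk->bhwj", base, theta)


def output_hw(grid, layout):
    """Hypothetical helper: the spatial size a sampler would infer from the
    grid if it assumed the given layout convention."""
    if layout == "BHWC":
        return grid.shape[1], grid.shape[2]
    if layout == "BCHW":
        return grid.shape[2], grid.shape[3]
    raise ValueError("unknown layout: %s" % layout)


angle = np.deg2rad(45.0)
theta = np.array([[[np.cos(angle), np.sin(angle), 0.0],
                   [-np.sin(angle), np.cos(angle), 0.0]]], dtype=np.float32)

grid = affine_grid_bhwc(theta, 328, 582)   # channel-last grid, (1, 328, 582, 2)
grid_chw = grid.transpose(0, 3, 1, 2)      # channel-first grid, (1, 2, 328, 582)

print(output_hw(grid, "BCHW"))       # size inferred from the channel-last grid
print(output_hw(grid_chw, "BCHW"))   # size inferred after the transpose
```

If this guess applies, passing a grid transposed to (B, 2, H, W) to the 'BCHW' STN would be the thing to try; whether the module actually expects that layout has to be checked in the repository.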
  2. When I used the GPU version of STN to transform features (I enabled GPU mode by moving the input features and transformation matrices to CUDA), I found that it spawned many threads (with different PIDs) and used nearly 800% CPU. I am sure that, on my side, it is the STN model that consumes these resources. Why does the GPU version use so much CPU? Is this intentional, or is it a bug?
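
One way to narrow down the CPU usage is to cap the CPU thread pools and see whether the load drops. This is a minimal sketch under an unverified assumption: that the extra threads come from OpenMP/MKL thread pools rather than from the STN CUDA kernels themselves. `limit_cpu_threads` is a hypothetical helper; `torch.set_num_threads` and `torch.get_num_threads` are existing PyTorch calls, and the environment variables only take effect if they are set before the numerical libraries are first imported.

```python
import os


def limit_cpu_threads(n=1):
    """Hypothetical helper: cap common CPU thread pools to n threads.

    Returns the thread count PyTorch reports afterwards, or None if
    PyTorch is not installed.
    """
    # Read by OpenMP and MKL when the libraries are first loaded.
    for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[var] = str(n)
    try:
        import torch
    except ImportError:
        return None
    # Caps the intra-op parallelism of PyTorch's CPU operators.
    torch.set_num_threads(n)
    return torch.get_num_threads()


# Call this at the very top of the script, before building the model.
limit_cpu_threads(1)
```

If CPU usage stays near 800% with a single thread configured, the assumption above is wrong and the cause lies elsewhere, for example in data loading or in host-side work done by the extension.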

I have been scratching my head over these two problems, as I am not familiar with the inner implementation. I hope to see your test results and your explanation. Thank you very much.