NVIDIA / vid2vid

PyTorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic video-to-video translation.

Resample2d_cuda_forward not found #15

Open martync opened 6 years ago

martync commented 6 years ago

Hi,

I'm running this command:

python test.py --name label2city_2048 --loadSize 2048 --n_scales_spatial 3 --use_instance --fg --use_single_G

and I got this error:

------------ Options -------------
aspect_ratio: 1.0
batchSize: 1
checkpoints_dir: ./checkpoints
dataroot: datasets/Cityscapes/
dataset_mode: temporal
debug: False
display_id: 0
display_winsize: 512
feat_num: 3
fg: True
fg_labels: [26]
fineSize: 512
gpu_ids: [0]
how_many: 300
input_nc: 3
isTrain: False
label_feat: False
label_nc: 35
loadSize: 2048
load_features: False
load_pretrain: 
max_dataset_size: inf
model: vid2vid
nThreads: 2
n_blocks: 9
n_blocks_local: 3
n_downsample_E: 3
n_downsample_G: 3
n_frames_G: 3
n_gpus_gen: 1
n_local_enhancers: 1
n_scales_spatial: 3
name: label2city_2048
ndf: 64
nef: 32
netE: simple
netG: composite
ngf: 128
no_first_img: False
no_flip: False
norm: batch
ntest: inf
output_nc: 3
phase: test
resize_or_crop: scaleWidth
results_dir: ./results/
serial_batches: False
tf_log: False
use_instance: True
use_real_img: False
use_single_G: True
which_epoch: latest
-------------- End ----------------
CustomDatasetDataLoader
dataset [TestDataset] was created
vid2vid
---------- Networks initialized -------------
-----------------------------------------------
Doing 560 frames
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f55f562fc50>>
Traceback (most recent call last):
  File "/home/studio/.virtualenvs/vid2vid/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 399, in __del__
    self._shutdown_workers()
  File "/home/studio/.virtualenvs/vid2vid/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 378, in _shutdown_workers
    self.worker_result_queue.get()
  File "/usr/lib/python3.6/multiprocessing/queues.py", line 337, in get
    return _ForkingPickler.loads(res)
  File "/home/studio/.virtualenvs/vid2vid/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 151, in rebuild_storage_fd
    fd = df.detach()
  File "/usr/lib/python3.6/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/usr/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 494, in Client
    deliver_challenge(c, authkey)
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 722, in deliver_challenge
    response = connection.recv_bytes(256)        # reject large message
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer
Traceback (most recent call last):
  File "test.py", line 43, in <module>
    generated = model.inference(A, B, inst)
  File "/home/studio/HAH/AI_libs/vid2vid/models/vid2vid_model_G.py", line 198, in inference
    fake_B = self.generate_frame_infer(real_A[self.n_scales-1-s], s)
  File "/home/studio/HAH/AI_libs/vid2vid/models/vid2vid_model_G.py", line 216, in generate_frame_infer
    self.fake_B_feat, self.flow_feat, self.fake_B_fg_feat, use_raw_only)    
  File "/home/studio/HAH/AI_libs/vid2vid/models/networks.py", line 173, in forward
    img_warp = self.resample(img_prev[:,-3:,...].cuda(gpu_id), flow).cuda(gpu_id)        
  File "/home/studio/.virtualenvs/vid2vid/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/studio/HAH/AI_libs/vid2vid/models/flownet2_pytorch/networks/resample2d_package/modules/resample2d.py", line 14, in forward
    return Resample2dFunction.apply(input1_c, input2, self.kernel_size)
  File "/home/studio/HAH/AI_libs/vid2vid/models/flownet2_pytorch/networks/resample2d_package/functions/resample2d.py", line 19, in forward
    resample2d.Resample2d_cuda_forward(input1, input2, output, kernel_size)
AttributeError: module 'models.flownet2_pytorch.networks.resample2d_package._ext.resample2d' has no attribute 'Resample2d_cuda_forward'

I've been trying to fix this for hours now, with no luck.

Here is my pip freeze:

tcwang0509 commented 6 years ago

Can you pull the latest code and try again?

martync commented 6 years ago

Sorry for the late answer. I've downgraded to Ubuntu 16.04 and CUDA 9.0 (still with torch 0.4.1), and I still get this error with the latest vid2vid source. I've also downloaded the latest flownet2 source (commit 532613d4fa46e544ddc309a8aa9e6b65dc91af21), and bash install.sh ran successfully. Don't know if that helps.
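
Roughly, that amounts to the following, run from the vid2vid repo root (the import check at the end is only a sanity test that the compiled extension actually exposes the function the traceback says is missing):

cd models/flownet2_pytorch
bash install.sh
cd ../..
# If the build worked, Resample2d_cuda_forward should appear in this listing.
python -c "from models.flownet2_pytorch.networks.resample2d_package._ext import resample2d; print(dir(resample2d))"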

dustinfreeman commented 6 years ago

I believe CUDA 8 is expected? At least that's what I got working in my pull request #23.
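
A quick way to check which toolkit you are actually building against (a mismatch between the system CUDA and the CUDA version PyTorch was compiled with is a common reason these extension builds fail or produce empty modules):

nvcc --version
python -c "import torch; print(torch.__version__, torch.version.cuda)"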

zhangboyang commented 5 years ago

I encountered the same problem when using the Dockerfile in this repo. I found it was because install.sh had not executed successfully (maybe because I didn't use nvidia-docker to build the image). I used the following commands to solve the problem:

cd models/flownet2_pytorch
bash install.sh
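
If the image was built without GPU access, one option is to start the container with nvidia-docker and rebuild the extensions from inside it (the image name and mount path below are just placeholders for your own setup):

nvidia-docker run -it -v "$(pwd)":/vid2vid <your_vid2vid_image> bash
# inside the container:
cd /vid2vid/models/flownet2_pytorch
bash install.sh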