jeong-tae / RACNN-pytorch

This is a third party implementation of RA-CNN in pytorch.

TypeError: expected Tensor as element 0 in argument 0, but got list #11

Closed fxle closed 4 years ago

fxle commented 5 years ago

I set batch_size=1 and now get the following error. Can you tell me how to solve it?

[] pre_apn_epoch[13], || pre_apn_iter 19980 || pre_apn_loss: 0.1223 || Timer: 0.1521sec
[] Swtich optimize parameters to Class
Traceback (most recent call last):
  File "/home/alex/.local/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 3265, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in <module>
    runfile('/home/alex/Datas/code/RACNN-pytorch/trainer.py', wdir='/home/alex/Datas/code/RACNN-pytorch')
  File "/usr/local/pycharm-2018.2.4/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
  File "/usr/local/pycharm-2018.2.4/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/alex/Datas/code/RACNN-pytorch/trainer.py", line 311, in <module>
    train()
  File "/home/alex/Datas/code/RACNN-pytorch/trainer.py", line 136, in train
    test(testloader, iteration)
  File "/home/alex/Datas/code/RACNN-pytorch/trainer.py", line 292, in test
    test_apn_losses = torch.stack(test_apn_losses).mean()
TypeError: expected Tensor as element 0 in argument 0, but got list
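
For context, `torch.stack` raises this TypeError when an element of the sequence it receives is itself a Python list rather than a tensor. The sketch below is only an illustration of one way to guard against that, not the repository's actual fix. The helper name `mean_of_losses` is hypothetical, and it assumes the nested entries are scalar losses that can simply be flattened.

```python
import torch


def mean_of_losses(losses):
    """Average per-batch losses, tolerating nested lists (hypothetical helper)."""
    flat = []
    for item in losses:
        # Assumption: a loss function sometimes returned a list of tensors
        # instead of a single tensor, so flatten one level.
        if isinstance(item, (list, tuple)):
            flat.extend(item)
        else:
            flat.append(item)
    if not flat:
        # Nothing to average: return a zero scalar instead of stacking an empty list.
        return torch.zeros(())
    # torch.stack needs every element to be a tensor of the same shape.
    return torch.stack([torch.as_tensor(x, dtype=torch.float32) for x in flat]).mean()


# A list containing a nested list would make a plain torch.stack call fail,
# while the helper flattens it first.
avg = mean_of_losses([torch.tensor(1.0), [torch.tensor(3.0)]])
print(avg.item())
```

Flattening hides the symptom. If the nested list comes from a loss function that should always return a tensor, fixing that function is the cleaner route.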

jeong-tae commented 5 years ago

I pushed an update. Can you check whether this error is solved?

fxle commented 5 years ago

I have checked the update, but there are still problems. Do you know what the cause is? Also, is your mailbox (make8286@naver.com) still in use? I suggest we use e-mail or some other channel so that we can communicate more promptly.

[] pre_apn_epoch[13], || pre_apn_iter 19980 || pre_apn_loss: 0.0925 || Timer: 0.1519sec
[] Swtich optimize parameters to Class
Traceback (most recent call last):
  File "/home/alex/Experiment/RACNN-pytorch/trainer.py", line 306, in <module>
    train()
  File "/home/alex/Experiment/RACNN-pytorch/trainer.py", line 129, in train
    new_apn_loss = pairwise_ranking_loss(preds)
  File "/home/alex/Experiment/RACNN-pytorch/models/Loss.py", line 20, in pairwise_ranking_loss
    return torch.zeros(1).type(preds.type())
AttributeError: 'list' object has no attribute 'type'

Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f065e408c50>>
Traceback (most recent call last):
  File "/home/alex/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 399, in __del__
    self._shutdown_workers()
  File "/home/alex/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 378, in _shutdown_workers
    self.worker_result_queue.get()
  File "/usr/lib/python3.5/multiprocessing/queues.py", line 345, in get
    return ForkingPickler.loads(res)
  File "/home/alex/.local/lib/python3.5/site-packages/torch/multiprocessing/reductions.py", line 151, in rebuild_storage_fd
    fd = df.detach()
  File "/usr/lib/python3.5/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/usr/lib/python3.5/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/usr/lib/python3.5/multiprocessing/connection.py", line 487, in Client
    c = SocketClient(address)
  File "/usr/lib/python3.5/multiprocessing/connection.py", line 614, in SocketClient
    s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused

Process finished with exit code 1

jeong-tae commented 5 years ago

Yes, the mail address is valid, but I can't respond to issues promptly. I am working on another project right now, and this repo is just a hobby.

fxle commented 5 years ago

Well, thank you very much. Looking forward to your update.

LXYTSOS commented 5 years ago

Have you solved this problem? I encountered the same problem.

fxle commented 5 years ago

I used vgg16 instead, but the loss does not converge. How are your experiment results now? @LXYTSOS

LXYTSOS commented 5 years ago

I ran into another problem: RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58

fxle commented 5 years ago

Try setting the batch size lower, and make sure enough CUDA memory is free.

LXYTSOS commented 5 years ago

I set the batch size to one. If I use only one GPU, I get this error at "Swtich optimize parameters to APN". If I use multiple GPUs, no matter how many, I get this error after one batch of training. See #7.

jeong-tae commented 5 years ago

@fxle I fixed the code, but I'm sorry to say that, for now, training is only supported on CUDA, not on CPU. To compute the loss I need to know the tensor type, but in some cases there is no prediction, so there is nothing to read the input type from. If you use a GPU, it will be fine.
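
The limitation described here (no prediction available, so no tensor to read the type from) could in principle be handled by passing an explicit fallback. The sketch below is an assumption, not the code in `models/Loss.py`. `zero_loss_like` and its `fallback_device` parameter are hypothetical names, and it assumes `preds` is either a tensor or a possibly empty list of tensors.

```python
import torch


def zero_loss_like(preds, fallback_device="cpu"):
    """Return a one-element zero tensor matching `preds` (hypothetical helper)."""
    if isinstance(preds, torch.Tensor):
        ref = preds
    elif len(preds) > 0 and isinstance(preds[0], torch.Tensor):
        # Use the first prediction as the reference for dtype and device.
        ref = preds[0]
    else:
        ref = None
    if ref is None:
        # No prediction to inspect: fall back to the caller-supplied device.
        return torch.zeros(1, device=fallback_device)
    return torch.zeros(1, dtype=ref.dtype, device=ref.device)


# With an empty list there is no `.type()` to call, so the fallback is used.
print(zero_loss_like([]).device)
print(zero_loss_like([torch.ones(2, dtype=torch.float64)]).dtype)
```

With a helper along these lines, the zero loss would follow whatever device the caller trains on, so the same code path could work on CPU as well as on GPU.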