TypeError: expected Tensor as element 0 in argument 0, but got int

OswaldoBornemann commented 1 year ago

    emd = torch.cat(emd_lst)
          │     │   └ [9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999]
          │     └ <built-in method cat of type object at 0x7f3b1185d100>
          └ <module 'torch' from '/~/anaconda3/envs/pytorch13/lib/python3.8/site-packages/torch/__init__.py'>

TypeError: expected Tensor as element 0 in argument 0, but got int

ZENGXH commented 1 year ago

Hi @PPGGG, can you provide more of the log, e.g. the lines above emd = torch.cat(emd_lst)? What command are you running?

OswaldoBornemann commented 1 year ago

@ZENGXH I just ran bash ./script/train_prior.sh 1.

ZENGXH commented 1 year ago

It seems emd_batch is an int instead of a tensor. This is weird; the value is also much larger than I expected.
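
To make the failure mode concrete: torch.cat only accepts a sequence of tensors, so a list of plain Python ints like the one in the traceback raises a TypeError. The snippet below is only a minimal sketch of that behaviour; the list contents mimic the traceback, and the tensor values and the non-tensor check are illustrative assumptions, not code from the LION repository.

```python
import torch

# Assumption: stand-in for the list shown in the traceback (plain Python ints).
emd_lst = [9999] * 18

# torch.cat rejects non-tensor elements with a TypeError.
try:
    torch.cat(emd_lst)
    err = None
except TypeError as e:
    err = str(e)
print(err)

# With one-element tensors (the shape seen in a healthy run) it works.
ok = torch.cat([torch.tensor([7.3357e-05]) for _ in range(3)])
print(ok.shape)

# Illustrative check: indices of the elements that are not tensors.
bad = [i for i, x in enumerate(emd_lst) if not torch.is_tensor(x)]
print(f"{len(bad)} of {len(emd_lst)} elements are not tensors")
```

If every element turns out to be an int, the problem is upstream of torch.cat, in whatever produces emd_batch.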

Are you training on the ShapeNet dataset?

Could you try to add this line:

    print(f'emd_batch: {emd_batch}; sample_batch: {sample_batch.shape} {ref_batch.shape}; batch_size={batch_size}; value range: {sample_batch.min()} {sample_batch.max()}; {ref_batch.min()} {ref_batch.max()}; ')

before line 213 of utils/evaluation_metrics_fast.py (link), and paste the output here?

For example, in my case I get the following output:

2023-04-01 16:10:31.379 | INFO     | trainers.base_trainer:set_writer:57 -
----------
[url]: none
../exp/0401/car/6e880fh_train_lion_B10
----------
2023-04-01 16:10:31.384 | INFO     | __main__:main:68 - not find any checkpoint: ../exp/0401/car/6e880fh_train_lion_B10/checkpoints, (exist=False), or snapshot ../exp/0401/car/6e880fh_train_lion_B10/checkpoints/snapshot, (exist=False)
2023-04-01 16:10:31.384 | INFO     | trainers.base_trainer:train_epochs:173 - [rank=0] Start epoch: 0 End epoch: 18000, batch-size=10 | Niter/epo=245 | log freq=245, viz freq 49000, val freq -10000
2023-04-01 16:10:35.574 | INFO     | utils.exp_helper:get_evalname:94 - git hash: 0ee19
2023-04-01 16:10:35.876 | INFO     | trainers.base_trainer:eval_nll:744 - eval: 1/36
2023-04-01 16:10:40.085 | INFO     | trainers.base_trainer:eval_nll:744 - eval: 31/36
emd_batch: tensor([7.3357e-05], device='cuda:0'); sample_batch: torch.Size([1, 2048, 3]) torch.Size([1, 2048, 3]); batch_size=1; value range: -0.4135565459728241 0.46055838465690613; -0.4139295518398285 0.46082451939582825;

nv-tlabs / LION

TypeError: expected Tensor as element 0 in argument 0, but got int #34