training - Githubissues

mafda commented 5 years ago

Hi,

I'm trying to reproduce this work.

I successfully created the dataset, but when I try to train, the following error appears:

Training Begins:
--------------------
 * EPOCH : 0
  0%|                                                                                                                           | 0/83 [00:00<?, ?it/s]/lib/python3.6/site-packages/torch/nn/functional.py:1320: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
  warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")

Traceback (most recent call last):
  File "train.py", line 155, in <module>
    train_loop_handler.loop(epochs, output_path, load_best_checkpoint=start_from_last, save_all_checkpoints=False)
  File "/src/pytorch_toolbox/pytorch_toolbox/train_loop.py", line 273, in loop
    train_loss = self.train()
  File "/src/pytorch_toolbox/pytorch_toolbox/train_loop.py", line 158, in train
    callback.batch_(self.training_state)
  File "/src/pytorch_toolbox/pytorch_toolbox/loop_callback_base.py", line 31, in batch_
    self.batch(state)
TypeError: batch() missing 2 required positional arguments: 'network_inputs' and 'targets'

It looks like a PyTorch version issue, or something similar. I currently have PyTorch 1.0.1 installed, which is the version listed in the requirements.txt file of the pytorch_toolbox project.

Any suggestions for getting past this problem?

MathGaron commented 5 years ago

Hey,

Yes, I changed the pytorch_toolbox framework recently. Try installing version 0.1. I will fix the docs, thanks!

mafda commented 5 years ago

v0.1 doesn't work!

Traceback (most recent call last):
  File "train.py", line 152, in <module>
    use_tensorboard=use_tensorboard, tensorboard_log_path=tensorboard_path)
TypeError: __init__() got an unexpected keyword argument 'use_tensorboard'

Commenting out the nonexistent tensorboard arguments throws another error:

Traceback (most recent call last):
  File "train.py", line 151, in <module>
    train_loop_handler = TrainLoop(model, train_loader, val_loader, optimizer, backend, gradient_clip ) #,
TypeError: __init__() takes 6 positional arguments but 7 were given

It looks like the TrainLoop initializer has no gradient_clip argument. Removing the gradient_clip argument produces one more error:

Traceback (most recent call last):
  File "train.py", line 155, in <module>
    train_loop_handler.loop(epochs, output_path, load_best_checkpoint=start_from_last, save_all_checkpoints=False)
  File "/src/pytorch_toolbox/pytorch_toolbox/train_loop.py", line 234, in loop
    self.train()
  File "/src/pytorch_toolbox/pytorch_toolbox/train_loop.py", line 130, in train
    losses.update(loss.data[0], data[0].size(0))
IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

I reverted those changes and tried v1.0.0. Training doesn't work either! Another error appears:

Traceback (most recent call last):
  File "train.py", line 155, in <module>
    train_loop_handler.loop(epochs, output_path, load_best_checkpoint=start_from_last, save_all_checkpoints=False)
  File "/src/pytorch_toolbox/pytorch_toolbox/train_loop.py", line 251, in loop
    train_loss = self.train(epoch + 1)
  File "/src/pytorch_toolbox/pytorch_toolbox/train_loop.py", line 147, in train
    callback.batch(y_pred, data, target, is_train=True, tensorboard_logger=self.tensorboard_logger)
  File "/src/6DOF_tracking_evaluation/ulaval_6dof_object_tracking/deeptrack/deeptrack_callback.py", line 19, in batch
    dof_loss = F.mse_loss(Variable(prediction), Variable(targets[0])).data[0]
IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

Is there another version that I could try?

MathGaron commented 5 years ago

Wow, I will try to update that code for the current version of pytorch_toolbox. The problem is that I was using PyTorch 0.4ish at that time. I am not sure when I will have time to do it, though. In the worst case, I will do it after CVPR. Meanwhile, you can either downgrade PyTorch or fix those errors by using .item(). I think it should work after that.
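
A minimal sketch of the .item() workaround, for anyone hitting the same IndexError. The helper name `scalar_value` is hypothetical and not part of pytorch_toolbox. It assumes the loss is either a 0-dim tensor (PyTorch 0.4 and later, where .item() exists) or an older Variable whose .data holds a one-element tensor:

```python
def scalar_value(loss):
    """Return a plain Python number from a scalar loss.

    Hypothetical helper, not part of pytorch_toolbox.
    """
    # PyTorch >= 0.4: scalar losses are 0-dim tensors that expose .item().
    item = getattr(loss, "item", None)
    if callable(item):
        return item()
    # Older PyTorch: the loss wraps a one-element tensor, so indexing works.
    return loss.data[0]


# The two failing calls from the tracebacks would then become, for example:
#   losses.update(scalar_value(loss), data[0].size(0))
#   dof_loss = scalar_value(F.mse_loss(Variable(prediction), Variable(targets[0])))
```

If only PyTorch 0.4 or later needs to be supported, replacing `.data[0]` with `.item()` directly at those two call sites should be enough.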

mafda commented 5 years ago

Thank you for your reply!

It seems to work with pytorch=0.4.1 and pytorch_toolbox=1.0.0. I'll run a complete test to check that everything works as expected!

I'll be waiting for the code to be updated to the current version of your toolbox!

MathGaron commented 5 years ago

Thanks for the feedback, it is really appreciated! I will post a note here once it is fixed.

lvsn / 6DOF_tracking_evaluation

training #8