RobotLocomotion / pytorch-dense-correspondence

Code for "Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation"
https://arxiv.org/pdf/1806.08756.pdf

Can't train tutorial on shoes. #204

Open DanielVlasic opened 5 years ago

DanielVlasic commented 5 years ago

I downloaded the shoe data (https://github.com/RobotLocomotion/pytorch-dense-correspondence/blob/master/config/dense_correspondence/dataset/composite/shoes_all.yaml) and tried going through the tutorial training with it.

I've received a "warning, empty mask b", followed by a "float division by zero" error.

Also, I'm not sure which training config is appropriate for the shoe data.

manuelli commented 5 years ago

Looks like one of the masks might be empty; one of the logs may be corrupted. Could you post the full error message? In general, training on the shoes can be done in the same way as the caterpillar in the tutorial if you want a class-consistent shoe network.

For some of the code used in the shoe experiments you can take a look at https://github.com/RobotLocomotion/pytorch-dense-correspondence/blob/master/dense_correspondence/experiments/shoes_consistent/training_shoes.ipynb.
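One quick way to find the offending log is to scan every mask image in the downloaded data and flag the ones that are all zeros. Below is a rough sketch; the logs_proto/<scene>/processed/image_masks layout, the *_mask.png filename pattern, and the ~/data/pdc location are assumptions about the local pdc data layout and may need adjusting.

```python
# Hypothetical diagnostic: flag scenes that contain all-zero (empty) masks.
import os
import glob
import numpy as np
from PIL import Image

logs_root = os.path.expanduser("~/data/pdc/logs_proto")  # assumed data location

for scene in sorted(os.listdir(logs_root)):
    mask_dir = os.path.join(logs_root, scene, "processed", "image_masks")
    if not os.path.isdir(mask_dir):
        continue
    for mask_file in sorted(glob.glob(os.path.join(mask_dir, "*_mask.png"))):
        mask = np.asarray(Image.open(mask_file))
        if np.count_nonzero(mask) == 0:
            print("empty mask: %s" % mask_file)
```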

DanielVlasic commented 5 years ago

Here are some details.

I executed the tutorial: dense_correspondence/training/training_tutorial.ipynb.

I set the config to: config_filename = os.path.join(utils.getDenseCorrespondenceSourceDir(), 'config', 'dense_correspondence', 'dataset', 'composite', 'shoes_all.yaml')
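For reference, that tutorial cell looks roughly like this (a sketch reconstructed from the tutorial notebook and the traceback below; the SpartanDataset import path, the getDictFromYamlFilename helper, and the training.yaml filename are assumptions):

```python
# Sketch of the tutorial training cell; exact imports and config keys may differ.
import os
import dense_correspondence_manipulation.utils.utils as utils
from dense_correspondence.dataset.spartan_dataset_masked import SpartanDataset  # assumed path
from dense_correspondence.training.training import DenseCorrespondenceTraining

config_filename = os.path.join(utils.getDenseCorrespondenceSourceDir(), 'config',
                               'dense_correspondence', 'dataset', 'composite',
                               'shoes_all.yaml')
config = utils.getDictFromYamlFilename(config_filename)  # assumed helper
dataset = SpartanDataset(config=config)

train_config_filename = os.path.join(utils.getDenseCorrespondenceSourceDir(), 'config',
                                     'dense_correspondence', 'training', 'training.yaml')
train_config = utils.getDictFromYamlFilename(train_config_filename)

d = 3  # descriptor dimension reported in the output below
print("training descriptor of dimension %d" % d)
train = DenseCorrespondenceTraining(dataset=dataset, config=train_config)
train.run()
print("finished training descriptor of dimension %d" % d)
```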

Here is the full output of the training cell:

training descriptor of dimension 3
using SINGLE_OBJECT_WITHIN_SCENE
logging_dir: /home/dvlasic/data/pdc/trained_models/tutorials/shoes_3
Downloading: "https://download.pytorch.org/models/resnet34-333f7ec4.pth" to /home/dvlasic/.cache/torch/checkpoints/resnet34-333f7ec4.pth
100.0%
/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py:2622: UserWarning: nn.functional.upsample_bilinear is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample_bilinear is deprecated. Use nn.functional.interpolate instead.")
/home/dvlasic/code/modules/dense_correspondence_manipulation/utils/utils.py:258: RuntimeWarning: invalid value encountered in arccos
  theta = 2*np.arccos(2*np.dot(q,r)**2 - 1)
/home/dvlasic/code/modules/dense_correspondence_manipulation/utils/utils.py:258: RuntimeWarning: invalid value encountered in arccos
  theta = 2*np.arccos(2*np.dot(q,r)**2 - 1)

empty data, continuing

/home/dvlasic/code/modules/dense_correspondence_manipulation/utils/utils.py:258: RuntimeWarning: invalid value encountered in arccos
  theta = 2*np.arccos(2*np.dot(q,r)**2 - 1)

empty data, continuing

empty data, continuing

empty data, continuing

warning, empty mask b

ZeroDivisionError                         Traceback (most recent call last)
in ()
      5 print "training descriptor of dimension %d" %(d)
      6 train = DenseCorrespondenceTraining(dataset=dataset, config=train_config)
----> 7 train.run()
      8 print "finished training descriptor of dimension %d" %(d)

/home/dvlasic/code/dense_correspondence/training/training.pyc in run(self, loss_current_iteration, use_pretrained)
    340                               masked_non_matches_a, masked_non_matches_b,
    341                               background_non_matches_a, background_non_matches_b,
--> 342                               blind_non_matches_a, blind_non_matches_b)
    343
    344

/home/dvlasic/code/dense_correspondence/loss_functions/loss_composer.pyc in get_loss(pixelwise_contrastive_loss, match_type, image_a_pred, image_b_pred, matches_a, matches_b, masked_non_matches_a, masked_non_matches_b, background_non_matches_a, background_non_matches_b, blind_non_matches_a, blind_non_matches_b)
     31                               masked_non_matches_a, masked_non_matches_b,
     32                               background_non_matches_a, background_non_matches_b,
---> 33                               blind_non_matches_a, blind_non_matches_b)
     34
     35     if (match_type == SpartanDatasetDataType.SINGLE_OBJECT_ACROSS_SCENE).all():

/home/dvlasic/code/dense_correspondence/loss_functions/loss_composer.pyc in get_within_scene_loss(pixelwise_contrastive_loss, image_a_pred, image_b_pred, matches_a, matches_b, masked_non_matches_a, masked_non_matches_b, background_non_matches_a, background_non_matches_b, blind_non_matches_a, blind_non_matches_b)
     82                               matches_a, matches_b,
     83                               masked_non_matches_a, masked_non_matches_b,
---> 84                               M_descriptor=pcl._config["M_masked"])
     85
     86     if pcl._config["use_l2_pixel_loss_on_background_non_matches"]:

/home/dvlasic/code/dense_correspondence/loss_functions/pixelwise_contrastive_loss.pyc in get_loss_matched_and_non_matched_with_l2(self, image_a_pred, image_b_pred, matches_a, matches_b, non_matches_a, non_matches_b, M_descriptor, M_pixel, non_match_loss_weight, use_l2_pixel_loss)
     83
     84
---> 85         match_loss, _, _ = PCL.match_loss(image_a_pred, image_b_pred, matches_a, matches_b)
     86
     87

/home/dvlasic/code/dense_correspondence/loss_functions/pixelwise_contrastive_loss.pyc in match_loss(image_a_pred, image_b_pred, matches_a, matches_b)
    163         matches_b_descriptors = matches_b_descriptors.unsqueeze(0)
    164
--> 165         match_loss = 1.0 / num_matches * (matches_a_descriptors - matches_b_descriptors).pow(2).sum()
    166
    167         return match_loss, matches_a_descriptors, matches_b_descriptors

ZeroDivisionError: float division by zero
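The traceback points at match_loss in pixelwise_contrastive_loss: with an empty mask no correspondences are found, so num_matches is 0 and the 1.0 / num_matches term divides by zero. A guard along these lines avoids it (a minimal sketch, assuming image_*_pred are (1, H*W, D) descriptor tensors and matches_* are 1-D index tensors; this is not necessarily the repo's exact fix):

```python
# Sketch of a zero-match guard in match_loss; not necessarily the actual fix.
import torch

def match_loss(image_a_pred, image_b_pred, matches_a, matches_b):
    # Assumption: image_*_pred are (1, H*W, D) descriptor tensors and
    # matches_* are 1-D LongTensors of flattened pixel indices.
    num_matches = matches_a.numel()
    if num_matches == 0:
        # An empty mask yields zero correspondences; return a zero loss
        # instead of dividing by zero.
        return torch.zeros(1), None, None

    matches_a_descriptors = torch.index_select(image_a_pred, 1, matches_a)
    matches_b_descriptors = torch.index_select(image_b_pred, 1, matches_b)
    loss = 1.0 / num_matches * (matches_a_descriptors - matches_b_descriptors).pow(2).sum()
    return loss, matches_a_descriptors, matches_b_descriptors
```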
peteflorence commented 5 years ago

Thanks, I can fix this

peteflorence commented 4 years ago

Hi Daniel, sorry to be so slow. Does this commit fix your issue? https://github.com/RobotLocomotion/pytorch-dense-correspondence/commit/ad541fca840f8a07c2bd42b08564bd645341faa8 We have fixed this issue in our private branch, and I think this should be all you need. Let me know if it doesn't work.

peteflorence commented 4 years ago

Also, I am working on getting the new code open-sourced; it should be available soon.