hand10ryo / PyTorchCML

PyTorchCML is a library of PyTorch implementations of matrix factorization (MF) and collaborative metric learning (CML), algorithms used in recommendation systems and data mining.
MIT License
20 stars, 2 forks

TwoStageSampler is giving Simplex() error #39

Open shivamtundele opened 2 years ago

shivamtundele commented 2 years ago

The following error occurs regardless of whether I use pos or neg weights:

epoch1 avg_loss:1.206:   0%|          | 1/256 [00:00<00:57,  4.46it/s]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-da8e4fbdcbb5> in <module>
     16 #sampler = samplers.BaseSampler(train_set = cml_train_set, neg_weight = neg_weight, n_user = n_user, n_item = n_item, device=device, strict_negative=True)
     17 trainer = trainers.BaseTrainer(cml_model_all_in, optimizer, criterion, sampler)
---> 18 trainer.fit(n_batch=256, n_epoch=10)

~/.local/lib/python3.7/site-packages/PyTorchCML/trainers/BaseTrainer.py in fit(self, n_batch, n_epoch, valid_evaluator, valid_per_epoch)
     75                         self.sampler.set_candidates_weight(dist, self.model.n_dim)
     76 
---> 77                     neg_items = self.sampler.get_neg_batch(users.reshape(-1))
     78 
     79                     # initialize gradient

~/.local/lib/python3.7/site-packages/PyTorchCML/samplers/TwoStageSampler.py in get_neg_batch(self, users)
    121             weight = self.candidates_weight
    122 
--> 123         neg_sampler = Categorical(probs=weight)
    124         neg_indices = neg_sampler.sample([self.n_neg_samples]).T
    125         neg_items = self.candidates[neg_indices]

~/site-packages/torch/distributions/categorical.py in __init__(self, probs, logits, validate_args)
     62         self._num_events = self._param.size()[-1]
     63         batch_shape = self._param.size()[:-1] if self._param.ndimension() > 1 else torch.Size()
---> 64         super(Categorical, self).__init__(batch_shape, validate_args=validate_args)
     65 
     66     def expand(self, batch_shape, _instance=None):

~/site-packages/torch/distributions/distribution.py in __init__(self, batch_shape, event_shape, validate_args)
     54                 if not valid.all():
     55                     raise ValueError(
---> 56                         f"Expected parameter {param} "
     57                         f"({type(value).__name__} of shape {tuple(value.shape)}) "
     58                         f"of distribution {repr(self)} "

ValueError: Expected parameter probs (Tensor of shape (256, 200)) of distribution Categorical(probs: torch.Size([256, 200])) to satisfy the constraint Simplex(), but found invalid values:
tensor([[7.2129e-03, 7.2939e-03, 0.0000e+00,  ..., 6.3333e-03, 0.0000e+00,
         0.0000e+00],
        [9.9370e-03, 0.0000e+00, 1.6256e-02,  ..., 5.3079e-03, 9.3441e-03,
         0.0000e+00],
        [9.9370e-03, 0.0000e+00, 1.6256e-02,  ..., 5.3079e-03, 9.3441e-03,
         0.0000e+00],
        ...,
        [0.0000e+00, 0.0000e+00, 0.0000e+00,  ..., 4.5067e-03, 4.2130e-03,
         1.3499e-02],
        [1.3212e-28, 0.0000e+00, 0.0000e+00,  ..., 0.0000e+00, 1.4386e-28,
         1.3498e-28],
        [0.0000e+00, 6.2382e-03, 7.7719e-03,  ..., 0.0000e+00, 0.0000e+00,
         1.5519e-02]], grad_fn=<DivBackward0>)

hand10ryo commented 2 years ago

Thanks for the report. Does the error also occur when you don't set the weights? If you know how, could you tell me how to reproduce the error on my end?

shivamtundele commented 2 years ago

I think this is an ongoing issue with the current implementation of PyTorch. There should be a better error or warning to help understand this. I am not sure how to reproduce this on a public dataset. I tried, but I couldn't.
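
Until a clearer error exists, one way to narrow this down is to inspect the weight rows just before they reach `Categorical(probs=weight)`. The following is only a minimal sketch and not part of PyTorchCML or PyTorch: the helper name `diagnose_probs` and the tolerance `tol=1e-6` are assumptions made for illustration. It is written in plain Python so it runs without PyTorch, and it reports which rows contain NaN/inf values, negative values, an all-zero sum, or a sum that is not close to 1.

```python
import math


def diagnose_probs(rows, tol=1e-6):
    """Classify rows of a would-be probability matrix (hypothetical helper).

    rows: list of lists of floats, one list per user in the batch.
    tol:  assumed tolerance for how far a row sum may drift from 1.
    Returns a dict mapping each problem type to the affected row indices.
    """
    report = {"nan_or_inf": [], "negative": [], "zero_sum": [], "bad_sum": []}
    for i, row in enumerate(rows):
        # NaN or inf makes every later check meaningless, so report and skip.
        if any(not math.isfinite(p) for p in row):
            report["nan_or_inf"].append(i)
            continue
        if any(p < 0 for p in row):
            report["negative"].append(i)
        total = math.fsum(row)
        if total == 0:
            # Dividing such a row by its own sum would give 0/0.
            report["zero_sum"].append(i)
        elif abs(total - 1.0) > tol:
            report["bad_sum"].append(i)
    return report


# Synthetic rows for illustration only; these are not values from this issue.
rows = [
    [0.25, 0.25, 0.5],         # valid
    [0.0, 0.0, 0.0],           # all zeros
    [0.7, -0.2, 0.5],          # contains a negative entry
    [float("nan"), 0.5, 0.5],  # contains NaN
    [0.3, 0.3, 0.3],           # sums to 0.9
]
report = diagnose_probs(rows)
print(report)
```

With the sampler from the traceback, the rows could be obtained with something like `weight.detach().cpu().tolist()`. Rows flagged as `zero_sum` or `nan_or_inf` would be the first thing to look at, since a row that sums to zero cannot be normalized into a valid distribution. That is only one possible cause and has not been confirmed in this thread.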

hand10ryo commented 2 years ago

I see. Let's keep the issue open until we learn more.