wouterkool / attention-learn-to-route

Attention based model for learning to solve different routing problems
MIT License

Exception in thread pool task: mutex lock failed: Invalid argument #47

Open raihan29s opened 2 years ago

raihan29s commented 2 years ago

Hello Wouter,

After epochs 0 and 1 finished, and before validation started, I got a `mutex lock failed: Invalid argument` error. I have PyTorch properly installed. What might be causing this?

```
| 0/10 [00:00<?, ?it/s][E thread_pool.cpp:113] Exception in thread pool task: mutex lock failed: Invalid argument
[E thread_pool.cpp:113] Exception in thread pool task: mutex lock failed: Invalid argument
[E thread_pool.cpp:113] Exception in thread pool task: mutex lock failed: Invalid argument
[E thread_pool.cpp:113] Exception in thread pool task: mutex lock failed: Invalid argument
Traceback (most recent call last):
  File "/Users/admin/Desktop/attention-learn-to-route-master/run.py", line 172, in
    run(get_options())
  File "/Users/admin/Desktop/attention-learn-to-route-master/run.py", line 158, in run
    train_epoch(
  File "/Users/admin/Desktop/attention-learn-to-route-master/train.py", line 116, in train_epoch
    avg_reward = validate(model, val_dataset, opts)
  File "/Users/admin/Desktop/attention-learn-to-route-master/train.py", line 22, in validate
    cost = rollout(model, dataset, opts)
  File "/Users/admin/Desktop/attention-learn-to-route-master/train.py", line 40, in rollout
    return torch.cat([
  File "/Users/admin/Desktop/attention-learn-to-route-master/train.py", line 41, in
    eval_model_bat(bat)
  File "/Users/admin/Desktop/attention-learn-to-route-master/train.py", line 37, in eval_model_bat
    cost, _ = model(move_to(bat, opts.device))
  File "/opt/homebrew/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/admin/Desktop/attention-learn-to-route-master/nets/attention_model.py", line 137, in forward
    _log_p, pi = self._inner(input, embeddings)
  File "/Users/admin/Desktop/attention-learn-to-route-master/nets/attention_model.py", line 252, in _inner
    log_p, mask = self._get_log_p(fixed, state)
  File "/Users/admin/Desktop/attention-learn-to-route-master/nets/attention_model.py", line 358, in _get_log_p
    log_p, glimpse = self._one_to_many_logits(query, glimpse_K, glimpse_V, logit_K, mask)
  File "/Users/admin/Desktop/attention-learn-to-route-master/nets/attention_model.py", line 460, in _one_to_many_logits
    compatibility = torch.matmul(glimpse_Q, glimpse_K.transpose(-2, -1)) / math.sqrt(glimpse_Q.size(-1))
RuntimeError: value < sizeINTERNAL ASSERT FAILED at "../aten/src/ATen/TensorIterator.cpp":1557, please report a bug to PyTorch.
0%| | 0/10 [00:01<?, ?it/s]
```