facebookresearch / hanabi_SAD

Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning

Illegal Move Error #20

Closed by YiqiJ 3 years ago

YiqiJ commented 3 years ago

Hi! I was able to build the project successfully. The only modification I made was to use pybind11 at @44105ca instead of @a1b71df. However, when I run dev.sh, I get the following error:

beginning of epoch:  0
available: 47.829 GB, used: 13.490 GB, free: 36.116 GB
EPOCH: 0
Speed: train: 1881.2, act: 8829.9, buffer_add: 677.2, buffer_size: 18432
Total Time: 0H 00M 27S, 27s
Total Sample: train: 51.2K, act: 240.32K
@@@Time
        sync and updating : 0 MS, 1.11%
        sample data       : 0 MS, 0.35%
        forward & backward: 60 MS, 89.45%
        update model      : 6 MS, 9.02%
        updating priority : 0 MS, 0.07%
@@@total time per iter: 68.00 ms
[0] Time spent = 27.22 s
0:grad_norm  [ 400]: avg:  53.6956, min:   0.1063[ 159], max: 21240.2188[ 242]
0:loss       [ 400]: avg:   0.7057, min:   0.3199[ 332], max:   3.2675[  28]
0:rl_loss    [ 400]: avg:   0.5711, min:   0.3402[ 396], max:   1.2437[   0]
epoch 0, eval score: 0.7420, perfect: 0.00, model saved: True
==========
beginning of epoch:  1
available: 43.467 GB, used: 17.852 GB, free: 31.755 GB
Error: move is not legal
UID: 1
legal move:
numStep: 2
legal_move: 5
legal_move: 6
legal_move: 7
legal_move: 8
legal_move: 9
legal_move: 11
legal_move: 12
legal_move: 13
legal_move: 14
legal_move: 17
legal_move: 18
legal_move: 19
python: /home/jyq/hanabi_modify_py/cpp/hanabi_env.cc:79: virtual std::tuple<std::unordered_map<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, at::Tensor, std::hash<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<const std::basic_string<char, std::char_traits<char>, std::allocator<char> >, at::Tensor> > >, float, bool> HanabiEnv::step(const TensorDict&): Assertion `false' failed.
Error: move is not legal
Error: move is not legal
Aborted (core dumped)

The illegal move error is triggered by optim.step(), and the assertion failure is raised in hanabi_env.cc. Any idea why this is happening?

Thanks a lot for your time and help!

Sincerely,

Yiqi

coreylowman commented 3 years ago

Also getting this error

hengyuan-hu commented 3 years ago

@YiqiJ Why did you switch to @44105ca instead of the commit suggested in the repo? Which PyTorch version are you using?

coreylowman commented 3 years ago

I fixed this by replacing

action = (greedy_action * (1 - rand) + random_action * rand).detach().long()

with

action = torch.where(rand < eps, random_action, greedy_action).detach()

on this line https://github.com/facebookresearch/hanabi_SAD/blob/master/pyhanabi/r2d2.py#L277

Not quite sure why the original line was causing me issues... so weird.
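
For reference, the torch.where variant can be sketched as a standalone function. This is a minimal sketch, not the repository's code: select_action and the demo tensors are hypothetical names, and it assumes eps has the same shape as greedy_action and that both action tensors are long tensors.

```python
import torch


def select_action(greedy_action, random_action, eps):
    # Hypothetical helper: epsilon-greedy choice between two action tensors.
    # Draw one uniform sample in [0, 1) per batch entry.
    rand = torch.rand(greedy_action.size(), device=greedy_action.device)
    # Boolean mask: True where the agent should take the random action.
    explore = rand < eps
    # Select elementwise instead of blending with integer arithmetic.
    return torch.where(explore, random_action, greedy_action).detach()


# Demo values are made up for illustration.
greedy = torch.tensor([5, 6, 7])
random_ = torch.tensor([11, 12, 13])
always_greedy = select_action(greedy, random_, torch.zeros(3))  # eps = 0
always_random = select_action(greedy, random_, torch.ones(3))   # eps = 1
```

Because the mask is boolean, each output element is exactly one of the two inputs, so the result does not depend on how 1 - rand is evaluated for long tensors.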

hengyuan-hu commented 3 years ago

Ah, now I remember. Are you using a newer version of PyTorch? I ran into similar problems when I tried to upgrade to 1.7.0. Casting all the tensors on this line to float seemed to work for me back then. I don't know the root cause; it seems to be a PyTorch issue.
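
The float-cast workaround described here might look like the following. This is a guess at the shape of that change, not the actual patch: mix_actions_float is a hypothetical name, and rand is assumed to be a mask tensor containing only 0s and 1s.

```python
import torch


def mix_actions_float(greedy_action, random_action, rand):
    # Hypothetical helper: do the blend in float, then cast back to long.
    g = greedy_action.float()
    r = random_action.float()
    m = rand.float()  # assumed to contain only 0s and 1s
    return (g * (1 - m) + r * m).detach().long()


# Demo values are made up for illustration.
greedy = torch.tensor([5, 6, 7])
random_ = torch.tensor([11, 12, 13])
mask = torch.tensor([0, 1, 0])
mixed = mix_actions_float(greedy, random_, mask)
```

With a 0/1 mask and small integer action indices, the float arithmetic is exact, so the cast back to long recovers the intended action index.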

coreylowman commented 3 years ago

Yeah I was using 1.8.

I wonder if it has something to do with negative values for the long dtype? Maybe newer versions parse 1 - rand as -rand + 1 and fail to produce a negative long value.