facebookresearch / hanabi_SAD

Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning

Import issue when trying to evaluate trained model #13

Closed akileshbadrinaaraayanan closed 4 years ago

akileshbadrinaaraayanan commented 4 years ago

Hi @hengyuan-hu, thanks for modifying the build process. I followed the new instructions and the build now succeeds, creating hanalearn.cpython-37m-x86_64-linux-gnu.so inside the build directory and rela.cpython-37m-x86_64-linux-gnu.so inside the build/rela directory.

However, when I try to evaluate the trained models using the provided instructions, i.e. python tools/eval_model.py --weight ../models/sad_2p_10.pthw --num_player 2, I get an import error. I think this has to do with setting PYTHONPATH so that the code finds the .so files. I tried calling sys.path.append with the .so file paths but still get the same error. It would be great if you could provide a fix for this. Thanks a lot in advance!


```
ImportError: generic_type: type "HanabiThreadLoop" referenced unknown base type "rela::ThreadLoop"
```
akileshbadrinaaraayanan commented 4 years ago

Hi @hengyuan-hu , did you get a chance to look at this?

hengyuan-hu commented 4 years ago

I cannot reproduce this specific problem. Normally, the fix for this error is to import rela before importing hanalearn, since the second depends on (inherits from) the first. For the same reason, torch must be imported before rela. However, this should all be taken care of here: https://github.com/facebookresearch/hanabi_SAD/blob/59f9263eea593d4ada276b81ac4eaeed08822267/pyhanabi/create.py#L17
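For anyone debugging this by hand, the import-order requirement can be sketched as below. This is only an illustration, not the code in create.py: the helper name `import_in_order` is made up here, and the `build` and `build/rela` paths are assumed from the build output described at the top of this issue. Note that `sys.path` entries should be the directories that contain the .so files, not the .so files themselves.

```python
import importlib
import sys
from pathlib import Path


def import_in_order(module_names, extra_dirs=()):
    """Import modules one by one, in the given order (hypothetical helper).

    extra_dirs are directories (not .so file paths) that get prepended
    to sys.path before any import is attempted.
    """
    for directory in extra_dirs:
        resolved = str(Path(directory).resolve())
        if resolved not in sys.path:
            sys.path.insert(0, resolved)

    loaded = {}
    for name in module_names:
        # A failure here shows which module in the chain is the problem.
        loaded[name] = importlib.import_module(name)
    return loaded


# Assumed usage for this repo (paths depend on where the build ran):
# import_in_order(["torch", "rela", "hanalearn"],
#                 extra_dirs=["build", "build/rela"])
```

If the order is right and the error still appears, the cause is probably elsewhere, for example a mismatched pybind11 version.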

However, I did find some issues with the evaluation script and updated it. Please check whether this version works for you.

zoeyuchao commented 4 years ago

Hi, I also encounter this issue when I try to train the model. Is there any way to solve it now? Thanks.

jdpena commented 4 years ago

I am receiving the exact same error. I suspect it is an issue with pybind11. Should this issue be reopened, or should I create a new one?

edit 1: may be related to https://github.com/pybind/pybind11/issues/633

edit 2: it turns out I had the same issue as https://github.com/facebookresearch/hanabi_SAD/issues/8#issuecomment-618176473. Switching the pybind11 repo to commit a1b71df137 and removing a previously installed version with pip uninstall solved my issue.
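A rough way to check for the two conditions from edit 2 is sketched below. The helper names are made up for illustration, and the submodule path in the usage comment is a placeholder, since the actual location of the pybind11 checkout in the repo is not stated in this thread.

```python
import importlib.util
import subprocess


def find_installed_module(name):
    """Return the file a module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def checkout_commit(repo_dir):
    """Return the HEAD commit hash of a git checkout, or None on failure."""
    try:
        result = subprocess.run(
            ["git", "-C", repo_dir, "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # git itself is not available on this machine.
        return None
    return result.stdout.strip() if result.returncode == 0 else None


# Assumed usage (the submodule path is a placeholder, adjust to your clone):
# print(find_installed_module("pybind11"))  # not None: a pip/conda copy exists
# commit = checkout_commit("path/to/pybind11")
# print(commit is not None and commit.startswith("a1b71df137"))
```

If `find_installed_module("pybind11")` returns a path inside site-packages, a separately installed pybind11 is present in the environment and may be the copy worth uninstalling.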

hengyuan-hu commented 4 years ago

re: edit 1. That is another way (and maybe a better way) to organize the pybind files. But the code in this repo should work fine if pybind modules are included in the correct order.

re: edit 2. The pybind submodule in this repo should be set to a1b71df137 automatically when you clone it. Was it at a different commit when you cloned? I wonder which dependency would cause another version of pybind to be installed. Did you start with a fresh conda environment?