fipeop closed this issue 2 years ago
Hi FELIPE,
Thanks for your interest in our work~
But I'm sorry that I cannot reproduce your error. Just to confirm, did you follow the commands in the requirements section for the install? i.e., this part:
NOTE: this implementation requires an old version of PyTorch (v1.0.0). You may want to start a new conda environment to run our code. The step-by-step guide is as follows (using torch-cpu for an example):
conda create --name mesa python=3.7.11
conda activate mesa
conda install pytorch-cpu==1.0.0 torchvision-cpu==0.2.1 cpuonly -c pytorch
pip install -r requirements.txt
These commands should help you to get ready for running mesa. If you have any further questions, please feel free to open an issue or drop me an email.
I just did a fresh install with these commands and the code seems to work as expected. Please try following this guide and see if it solves your problem.
PS: The meta-sampler used in MESA is not a large network. Its size depends only on the dimensionality of the meta-state (usually < 20), rather than the amount of data. So the advantage of pytorch-GPU is likely to be insignificant.
Yes, I followed those commands --- did you test on a Unix environment? Wondering if those steps only work on Windows, maybe?
Yes, my primary development environment is Windows, so I only tested these steps there. Maybe there are some magic inconsistencies between Anaconda on Windows and on Unix-based OSs that caused this problem?
But I'm sorry that I'm currently busy applying for a Ph.D., so I'm afraid I don't have time to fix this. You can see that the meta_sampler is just a Soft Actor-Critic network (see here), which is defined in this folder. The main classes are defined in sac.py and model.py, with fewer than 300 lines of code in total. I believe the problem is most likely in these 300 lines of SAC code. In the future, I may replace the SAC implementation with one based on a more modern version of PyTorch.
If you find a solution to this error, I would greatly appreciate a PR!
Thanks again for your interest~
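For anyone comparing a working Windows setup against a failing Unix one, a small report of the OS, Python, and package versions makes the difference easier to spot. The following is only a sketch: the function name environment_report is hypothetical and not part of MESA, and it assumes the packages of interest expose a __version__ attribute.

```python
import importlib
import platform


def environment_report(packages=("torch", "torchvision")):
    """Collect OS, Python, and package versions as a plain dict."""
    report = {
        "os": platform.system(),
        "python": platform.python_version(),
    }
    for name in packages:
        try:
            module = importlib.import_module(name)
            report[name] = getattr(module, "__version__", "unknown")
        except ImportError as err:
            # Covers both a missing package and a broken binary
            # (for example an undefined-symbol error at import time).
            report[name] = f"import failed: {err}"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running this inside the activated mesa environment on both machines and posting the two outputs side by side should show whether the installed versions actually match the pinned ones.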
Hi,
Thanks for the great work. I tried installing the dependencies as explained in the latest version of the README file and I got:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [50, 1]], which is output 0 of AsStridedBackward0, is at version 3; expected version 2 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
Alternatively, when installing the same PyTorch version (1.0.0) with GPU support, I hit a different issue: https://discuss.pytorch.org/t/undefined-symbol-cblas-sgemm-alloc/32497
Is there any other way to build the dependencies?
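For reference, the class of RuntimeError quoted above can be reproduced in a few lines, which may help when hunting for the offending operation. This is a generic sketch and not the MESA or SAC code: the function name backward_after_inplace_edit is hypothetical, and the actual cause in this repository is not established in this thread. It assumes a PyTorch version in which torch.autograd.set_detect_anomaly can be used as a context manager.

```python
import torch


def backward_after_inplace_edit(detect_anomaly=False):
    """Return the RuntimeError message raised by autograd, or None."""
    w = torch.ones(3, requires_grad=True)
    with torch.autograd.set_detect_anomaly(detect_anomaly):
        y = torch.exp(w)   # exp saves its output y for the backward pass
        loss = y.sum()
        y.add_(1.0)        # in-place edit bumps y's version counter
        try:
            loss.backward()  # autograd sees y changed since it was saved
        except RuntimeError as err:
            return str(err)
    return None


if __name__ == "__main__":
    # With anomaly detection on, PyTorch also prints a traceback of the
    # forward operation whose gradient computation failed.
    print(backward_after_inplace_edit(detect_anomaly=True))
```

Replacing the in-place call y.add_(1.0) with the out-of-place y = y + 1.0 avoids the error in this toy case, since the tensor saved by exp is then left untouched.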