Closed HYDesmondLiu closed 2 years ago
Hi Desmond, thank you for trying Beobench and sharing the error!
It is hard to tell exactly why you get this error. My first guess is that your docker system may not be fully set up. Did you follow the Linux post-installation steps described here: https://beobench.readthedocs.io/en/latest/guides/installation_linux.html?
That page describes two ways that might fix your docker permission issue:
- Always use sudo in front of beobench commands to grant the relevant privileges required for docker (note that this has not been tested)
- Recommended: follow the official post-installation steps to manage docker as a non-root user to enable running docker without sudo. As the linked documentation points out, this carries a certain security risk.
Did you try either of them?
Let me know if that helps or if you have any other questions/problems!
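One way to check for the permission problem before running Beobench is to test whether the current user can read and write the docker socket. The following is only a minimal sketch: it assumes the default socket path /var/run/docker.sock, and the helper name can_access_docker_socket is hypothetical and not part of Beobench.

```python
import os


def can_access_docker_socket(socket_path: str = "/var/run/docker.sock") -> bool:
    """Return True if the current user can read and write the docker socket."""
    # The docker client talks to the daemon through this socket. Without
    # read/write permission, clients fail with a "Permission denied" error.
    return os.path.exists(socket_path) and os.access(
        socket_path, os.R_OK | os.W_OK
    )


if __name__ == "__main__":
    if can_access_docker_socket():
        print("docker socket is accessible without sudo")
    else:
        print("no access: join the docker group or run with sudo")
```

If this reports no access right after you were added to the docker group, logging out and back in is usually needed for the new group membership to take effect.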
EDIT: I also just noticed that the output seems to indicate that you set the environment name config to MixedUseFanFCU-v0, but the gym framework is set to sinergym. MixedUseFanFCU-v0 is an energym environment, so I recommend changing the environment to a sinergym one like Eplus-5Zone-hot-continuous-v1. For example, your config.yaml file could contain the following (note that this issue is separate from the bug you shared):
env:
  # gym framework from which we want use an environment
  gym: sinergym
  # gym-specific environment configuration
  config:
    # sinergym environment name
    name: Eplus-5Zone-hot-continuous-v1
    # whether to normalise observations
    normalize: True
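The mismatch between environment name and gym framework can be caught with a small consistency check. This is a hedged sketch, not Beobench code: the lookup table KNOWN_ENV_FRAMEWORKS and the function check_env_config are hypothetical, and the table contains only the two environment names mentioned in this thread.

```python
# Hypothetical lookup table built only from the names in this thread;
# a real check would need the full environment list of each framework.
KNOWN_ENV_FRAMEWORKS = {
    "MixedUseFanFCU-v0": "energym",
    "Eplus-5Zone-hot-continuous-v1": "sinergym",
}


def check_env_config(env: dict) -> list:
    """Return a list of problems found in the `env` section of a config."""
    problems = []
    gym = env.get("gym")
    name = env.get("config", {}).get("name")
    expected = KNOWN_ENV_FRAMEWORKS.get(name)
    # Only complain about names we know; unknown names are left alone.
    if expected is not None and expected != gym:
        problems.append(
            f"environment {name!r} belongs to {expected!r}, "
            f"but gym is set to {gym!r}"
        )
    return problems


if __name__ == "__main__":
    env = {"gym": "sinergym", "config": {"name": "MixedUseFanFCU-v0"}}
    for problem in check_env_config(env):
        print(problem)
```

The env dictionary here stands for the parsed env section of config.yaml; how the file is loaded is left out on purpose.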
@rdnfn Thank you for the prompt reply. Indeed, I was previously not in the docker user group. However, after being added to the group, I got another error.
Also, I use the same agent.py and config.yaml as the examples provided in the readme page. The env name is Eplus-5Zone-hot-continuous-v1. I am not sure where that environment is set.
@HYDesmondLiu Glad to hear that the first error is fixed.
And you're absolutely right, the wrong env name in the log is not your fault but rather a bug in the logging code in Beobench (it wouldn't affect your experiment itself though). Sorry about the confusion. I fixed it already in the dev/general branch and that fix will be added in the next release.
About the docker unknown shorthand flag error: I am having a difficult time reproducing it. Would you be able to share your docker version? You can find it with the command docker --version. Thanks!
@rdnfn
Thanks for the prompt reply.
My docker version is listed below:
Docker version 19.03.12, build 48a66213fe
Thanks for your patience and for sharing the version! Good news: I have now been able to reproduce the error and, hopefully, to fix it. Please update to the latest Beobench version (v0.5.2) using pip install beobench --upgrade. Let me know if this update resolves your problem!
Background: The problem appears to be that your version of docker (v19.03) does not support the docker buildx subcommand that Beobench (v0.5.1) uses by default to build experiment container images. This problem is not easily visible because for some reason it gets hidden behind this unknown flag error (see output below). As buildx is not required for all use-cases, I have now disabled it where possible (if not on ARM64 architecture).
For future reference, output from the tests I ran:
/ # docker --version
Docker version 19.03.15, build 99e3ed8
/ # docker buildx build
docker: 'buildx' is not a docker command.
See 'docker --help'
/ # docker buildx build -t
unknown shorthand flag: 't' in -t
See 'docker --help'.
Usage: docker [OPTIONS] COMMAND
A self-sufficient runtime for containers
Options:
--config string Location of client config files (default "/root/.docker")
<more help output>
Run inside the following test container: docker run -it --entrypoint /bin/sh docker:19.03-dind
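The decision described in the background note (use buildx only where it is needed, and check that it exists first) could be sketched roughly as follows. This is not Beobench's actual implementation: the function names are hypothetical, and the sketch assumes that a missing buildx subcommand makes docker buildx version exit with a non-zero status, which is consistent with the output above.

```python
import platform
import subprocess


def buildx_available() -> bool:
    """Return True if `docker buildx version` runs successfully."""
    try:
        result = subprocess.run(
            ["docker", "buildx", "version"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        # docker itself is missing or cannot be executed
        return False
    return result.returncode == 0


def needs_buildx(machine: str) -> bool:
    """Assumed rule from the note above: keep buildx only on ARM64."""
    return machine.lower() in ("arm64", "aarch64")


if __name__ == "__main__":
    machine = platform.machine()
    if needs_buildx(machine) and not buildx_available():
        print("buildx is needed on this architecture but is not available")
    else:
        print("plain `docker build` should be sufficient")
```

Probing the subcommand directly avoids relying on the misleading unknown shorthand flag message that appears when flags are passed to a missing subcommand.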
Hi @rdnfn,
Thanks for the quick fix. After upgrading to beobench v0.5.2 and running the same command beobench run --config config.yaml, I got another error, shown in the snapshot below, which seems to be related to pip install.
Can confirm, same problem here.
@david-woelfle and @HYDesmondLiu thanks for raising this problem! I am working on it ... will release a fix soon.
Thanks again @HYDesmondLiu and @david-woelfle for finding this error! Thanks for your patience!
I have now updated the development version of Beobench with a fix for this issue. You can install the latest development version using the command:
pip install git+https://github.com/rdnfn/beobench.git@dev/general
Could you try installing this and let me know whether it resolves the error on your machines? If so, I will publish a new version (v0.5.3) with this fix. Thanks so much for your help making Beobench better!
Background:
I used a form of conditional statement inside the Dockerfile, but this appears to have broken when removing the use of buildx in v0.5.2. Thus, I moved this logic directly into Python in the dev version.
I have also "yanked" v0.5.2 on pypi (that is, marked it as faulty). With this, new users should no longer run into this issue, as v0.5.1 is considered the latest version again.
Note: The fix took me so long because I ran into another unrelated bug in the GitHub CI related to the use of Beobench inside docker-in-docker containers. I think this is unlikely, but if you're using this kind of setup (dind), have a look at #85.
EDIT: There was a problem with the internal Beobench version checking due to marking the pypi package as yanked, apologies for that! This is now fixed.
Hi @rdnfn and thank you for working on this issue. I have tried the command above but it didn't work. What did work was installing beobench in version 0.5.1.
Hi @rdnfn, thanks for the quick fix. This version works for me. However, are you planning to add more content to the advanced usage page? I have some questions about advanced usage, for example:
If I use pdb to debug, it shows an error similar to the one in the original bug report.
Please let me know whether I should open a new issue or whether we can discuss this here, thanks.
Thanks for testing this @HYDesmondLiu! Glad to hear this works now.
About your other points:
However, are you planning to add more content to advanced usage page?
Absolutely, this is work in progress!
- I tried to invoke my own algorithm in the agent script, however, it cannot recognize modules in the same directory.
This is unfortunately not supported yet, but I am hoping to add this feature soon. #70 tracks the mounting of the directory of the agent script (e.g. access to files in the same folder as the agent script and its subfolders). #71 tracks the installation of agent script specific external pypi dependencies (e.g. pip install packagename).
- Also, the first yaml configuration in the page with Energym does not work. If I run the yaml, it shows error messages as this:
Thanks for flagging this. This error is likely because of an env.config setting. Would you be able to open another issue for this problem with a copy of the exact YAML file that you used?
With that I will close this specific issue now. The changes will be included in the next release (v0.5.3). Feel free to open new issues for any other problems or questions you run into!
@rdnfn Thanks for the detailed responses. I am glad these points will be addressed in an upcoming revision. Looking forward to it.
First, thank you for sharing the code. I was trying to run the examples provided in the readme section:
beobench run --config config.yaml
using the provided config.yaml and agent.py. However, I got this error:
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', PermissionError(13, 'Permission denied'))
Please let me know if any further information is needed to debug.
Environments: Python 3.8.5 Ubuntu 18.04.3 LTS
Detailed message: