Reinforcement Learning for Elden Ring on Windows 11.
EldenRL is built on OpenAI's gym toolkit for reinforcement learning environments. It enables training and running of reinforcement learning models on Elden Ring bosses and PvP duels. EldenRL uses the game as its environment by capturing the screen and controlling the game.
You need to own and install Elden Ring and set it to windowed mode at 1920x1080 in the top left corner of your screen. EldenRL requires the player to load into the game before any training is run. In addition, the following key bindings are required: w, a, s, d = movement | shift = sprint / dodge | c = light attack | v = heavy attack | x = off hand (magic) | q = weapon art | e = item | f = interact | esc = menu.
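For context, EldenRL controls the game purely through simulated keyboard input on those bindings. As a minimal sketch (the actual input library the project uses is not specified here, so pydirectinput below is an assumption):

```python
# Illustrative only: the project may use a different input library.
import time
import pydirectinput  # pip install pydirectinput; works with DirectX games such as Elden Ring

def light_attack():
    # 'c' is the light attack binding listed above
    pydirectinput.press('c')

def sprint_forward(duration=1.0):
    # hold 'w' + 'shift' to sprint, matching the required bindings
    pydirectinput.keyDown('w')
    pydirectinput.keyDown('shift')
    time.sleep(duration)
    pydirectinput.keyUp('shift')
    pydirectinput.keyUp('w')
```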
After the game is installed and set up correctly, you can install the code requirements. Most of them are simple pip installs, but there are a few special installations to be aware of: Stable-Baselines3 requires Python 3.9.13 and PyTorch to run. Pytesseract needs to be downloaded and installed together with the Tesseract OCR engine, and the Tesseract path needs to be set in main.py so text can be read from images.
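For example, after installing Tesseract OCR, the path is set roughly like this (the install path below is the common Windows default and may differ on your machine):

```python
import pytesseract

# Point pytesseract at your local Tesseract install (adjust the path if needed)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
```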
This project is built on Windows 11 but should run on older Windows versions too.
Apart from the software requirements, you will need the hardware to run the training. The project has been tested in CPU mode, with the game running normally on the GPU and the training running on the CPU. The tested performance for this setup is ~2.4 fps on an i9-13900K CPU for the training (with the game running normally).
If you don't want to dive deep into the code, running this project should be fairly straightforward. Download the repository, install the requirements and navigate to the main.py file. Set the config variables to the values you want to run with, then simply run the code. The program will output its state to the console and inform you about what is going on. Starting a new training session will look something like this:
When the code prints Waiting for loading screen... you will need to start matchmaking in PvP mode, or kill the character in PvE mode, to trigger a loading screen.
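How the loading screen is detected is handled inside the environment code. Purely as an illustrative sketch (an assumption, not necessarily how EldenEnv.py does it), such a check could look for a mostly black frame:

```python
import cv2
import numpy as np

def looks_like_loading_screen(frame_bgr, dark_threshold=30, dark_ratio=0.98):
    # A loading screen is mostly black: check the fraction of near-black pixels.
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    return np.mean(gray < dark_threshold) > dark_ratio
```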
After the loading screen, the code will call WalkToBoss.py and bring the agent to its initial position, where the agent can then take over.
Once the agent takes over the console will look something like this:
Now the training session should run without any further input from the user in an endless loop.
(Loading screen - Reset - Training - Death - Loading screen - ...)
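Conceptually this maps onto the standard gym cycle. A simplified sketch of that loop (the import path and policy string are assumptions; during training Stable-Baselines3's model.learn() drives this internally, so this is not the actual train.py code):

```python
from stable_baselines3 import PPO
from EldenEnv import EldenEnv   # assumed import; see EldenEnv.py

env = EldenEnv()
model = PPO("CnnPolicy", env)   # placeholder algorithm/policy choice

obs = env.reset()                                   # loading screen -> WalkToBoss.py -> first frame
while True:
    action, _ = model.predict(obs)                  # the agent picks an action
    obs, reward, done, info = env.step(action)      # key presses, screen capture, reward
    if done:                                        # death / end of duel -> next loading screen
        obs = env.reset()
```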
To create a new model, set CREATE_NEW_MODEL in main.py to True, or set it to False after a model has been saved at least once. During training the model is saved automatically to ./models after every 500 steps of the agent making decisions. This took about 20 deaths for a newly created model training in PvE. Saving will look like this:
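As a rough illustration of what this amounts to with Stable-Baselines3 (the algorithm, policy string, and file names below are assumptions for the sketch; main.py and train.py define the real values):

```python
from stable_baselines3 import PPO
from stable_baselines3.common.callbacks import CheckpointCallback
from EldenEnv import EldenEnv   # assumed import; see EldenEnv.py

CREATE_NEW_MODEL = True         # set to False once a model has been saved at least once
env = EldenEnv()

if CREATE_NEW_MODEL:
    model = PPO("CnnPolicy", env, tensorboard_log="./logs", verbose=1)
else:
    model = PPO.load("./models/PPO_model", env=env)   # hypothetical file name

# Save automatically every 500 agent steps, matching the behaviour described above.
checkpoint = CheckpointCallback(save_freq=500, save_path="./models")
model.learn(total_timesteps=1_000_000, callback=checkpoint)
```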
Note that a newly created model will perform random actions and will only update its behaviour when it is saved.
Logs for TensorBoard can be found in ./logs.
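They can be viewed with the standard TensorBoard command, tensorboard --logdir ./logs.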
In this section we'll go over the code structure and functionality of the project.
main.py
This is the main file that can be run to start the codebase. Most user settings can be found here to make it as simple as possible.
train.py
This is the centerpiece of the whole project. This is where the decision making and training actually happen. It should feel very familiar if you've worked on a reinforcement learning project before; we closely followed OpenAI's gym structure.
EldenEnv.py
This is our Environment. Here we interact with the game (capturing the screen, performing actions and passing the observation/reward back to train.py).
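A heavily simplified skeleton of such an environment (method names follow the classic gym API mentioned above; the bodies are placeholders, not the actual EldenEnv.py code):

```python
import gym
import numpy as np
from gym import spaces

class EldenEnvSketch(gym.Env):
    """Illustrative skeleton only; see EldenEnv.py for the real implementation."""

    def __init__(self):
        super().__init__()
        self.action_space = spaces.Discrete(10)                 # placeholder: one entry per key binding
        self.observation_space = spaces.Box(low=0, high=255,
                                            shape=(90, 160, 3), dtype=np.uint8)  # placeholder size

    def reset(self):
        # Wait for the loading screen, run the reset logic (WalkToBoss.py), return the first frame.
        return self._capture_frame()

    def step(self, action):
        # Press the key(s) for this action, grab a new frame, compute the reward.
        self._perform_action(action)
        obs = self._capture_frame()
        reward, done = self._compute_reward(obs)
        return obs, reward, done, {}

    def _capture_frame(self):
        return np.zeros(self.observation_space.shape, dtype=np.uint8)  # placeholder: real code captures the game window

    def _perform_action(self, action):
        pass  # placeholder: real code sends key presses to the game

    def _compute_reward(self, obs):
        return 0.0, False  # placeholder: real code delegates to EldenReward.py
```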
EldenReward.py
This is where we calculate the reward based on the observation. The total reward for every step is passed back to EldenEnv.py.
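Purely as an illustration of the idea (the actual reward terms and weights live in EldenReward.py and are listed in the Rewards section below):

```python
def step_reward(prev_player_hp, player_hp, prev_boss_hp, boss_hp):
    # Illustrative only: punish taking damage, reward dealing damage.
    reward = 0.0
    reward -= (prev_player_hp - player_hp)
    reward += (prev_boss_hp - boss_hp)
    return reward
```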
WalkToBoss.py
This is our reset functionality. In PvE it is called to walk the character from the bonfire to the boss; in PvP it handles matchmaking and player lock-on.
All of our observations are derived from screen capturing the game with CV2. This means the agent can only use information from the captured frames; there is no reading of game state or memory.
The agent has the following information in the observation space:
For successful screen capture the game needs to be in windowed mode at 1920x1080 in the top left corner of your screen. We use the default position the game takes when switched to windowed mode, which leaves a small gap to the left border of your monitor. You can set DEBUG_MODE to True to help you line up the correct position.
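A minimal sketch of what that capture could look like (mss and the pixel offsets below are assumptions used for illustration; the project's own code handles the exact window region):

```python
import cv2
import numpy as np
from mss import mss

# Placeholder region: adjust top/left for the window gap and title bar mentioned above.
MONITOR = {"top": 0, "left": 0, "width": 1920, "height": 1080}

def grab_observation(resize_to=(160, 90)):
    with mss() as sct:
        frame = np.array(sct.grab(MONITOR))[:, :, :3]   # drop the alpha channel (BGRA -> BGR)
    return cv2.resize(frame, resize_to)
```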
The rewards are also all derived from the same screen capture used in the observation. The agent is rewarded/punished for the following things:
PvE:
PvP:
Have a look at the EldenRL showcase to better understand this project.
EldenRL PvE showcase: https://www.youtube.com/watch?v=NzTwDO4ehPY
EldenRL PvP showcase: https://www.youtube.com/watch?v=2Uh1T8FE0y8
This project is fully open source. It was only possible with the help of multiple contributors laying the groundwork. Feel free to use this code in any way you want. If you want to contribute, feel free to fork the project and let us know on the Discord server.
The next steps would probably be to train a model for longer than we have and to measure some results. Adding new rewards or improving the observations could be a logical next step. It is also possible to swap out the reinforcement learning model in Stable-Baselines3 and compare the results between algorithms.
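With Stable-Baselines3 that swap is essentially a one-line change. For example (the import path and policy string below are assumptions):

```python
from stable_baselines3 import A2C
from EldenEnv import EldenEnv   # assumed import; see EldenEnv.py

env = EldenEnv()
# Same environment, different algorithm (swap A2C for PPO, etc., to compare results).
model = A2C("CnnPolicy", env, tensorboard_log="./logs")
model.learn(total_timesteps=100_000)
```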
Huge shoutout to Jameszampa and Lockwo for doing a lot of the groundwork. We couldn't have done it without them.
If you are interested in this bot, you may also like to check out their projects.