Closed: DanielS684 closed this issue 4 years ago
Thank you for your kind words and your suggestions!
These are fantastic ideas, and it would indeed be very interesting to have a never-ending world as you suggest. We do not plan on releasing a new live server and/or competition at the moment, but we'll definitely keep your suggestions in mind.
To come back to the continual learning idea: I think of continual learning as the opportunity for an agent to learn while on the task, rather than only doing inference to solve said task. It would make sense to simply leave more time for every agent to complete a task so that it can spend time learning on top of inferring; the idea would be, for example, for an agent to see the same task 5 times.
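To make that concrete, here is a minimal sketch of what such an evaluation loop could look like, with the agent allowed to update between repeated attempts at the same task. The `env`, `agent`, and `task_config` names are hypothetical placeholders, not the project's actual API:

```python
# Minimal sketch of "learn while on the task": the agent gets N attempts at
# the same task and may update its policy between attempts.
# `env` and `agent` are hypothetical placeholders, not the real project API.

N_ATTEMPTS = 5

def evaluate_with_learning(env, agent, task_config, n_attempts=N_ATTEMPTS):
    scores = []
    for attempt in range(n_attempts):
        obs = env.reset(task_config)          # the same task on every attempt
        done, episode_reward, trajectory = False, 0.0, []
        while not done:
            action = agent.act(obs)
            next_obs, reward, done, _ = env.step(action)
            trajectory.append((obs, action, reward, next_obs, done))
            episode_reward += reward
            obs = next_obs
        agent.update(trajectory)              # learning happens on-task
        scores.append(episode_reward)
    return scores                             # e.g. compare attempt 5 vs attempt 1
```

Scoring could then reward improvement across attempts rather than only the first-attempt result.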
Now that the competition is over, we really hope the community (you 🙃) will take on this project and generate great ideas such as yours. I can only encourage you to fork the repo, make it your own, and try implementing some of these ideas! We aim to run a new competition in the future that will be a step further than the one we had last year; in the meantime, let us know if you make progress and/or have questions.
I am really interested in the work that is happening here and would first like to give kudos for creating this benchmark. I just wanted to list some ideas that I think could make this project cooler.
1) The first idea is to turn this into a sort of live event that people can send their agents to. The way I imagine it working is a live server to which people could send 1-2 agents, which would then navigate to different parts of a map with different tasks that give them some form of reward. I believe this could help build an interactive community, because people could watch their creations compete against other creations for the same food and continually become smarter, especially if they use a continual learning algorithm as I mention in the second point.
1.5) (You could also do this with one agent and self-play, but it would be more interesting to see algorithms compete with each other in real time instead of just sitting on a static benchmark. Plus, this could provide new ways to measure how good an agent is: how fast it adapts to opponents' new strategies, how long it stayed number 1, etc.)
2) Another idea came from one of your lectures on improvements you would like to make to the system, where I heard that you would like to implement some form of continual learning. I thought you could turn that into a sort of day/night cycle: during the night the agent trains itself on the general things it has learnt throughout the day, while during the day it uses that general knowledge to inform its current actions while still being able to adapt (see the day/night sketch after this list).
3) Finally, I think you could put higher (but fewer) rewards in the more mentally taxing areas, while putting lower (but more) rewards, spread over a wider range so one agent can't just collect everything before anyone else gets there, in the less mentally taxing areas (see the reward-placement sketch after this list). This could incentivize the agents to learn harder and harder cognitive tasks, because once the easy plans have been exhausted, the only ways to get more reward would be to try something harder, trick/delay opponents, or cooperate with other agents.
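To illustrate the day/night idea in point 2, here is a minimal sketch: during the "day" the agent acts and collects experience (with light online adaptation), and during the "night" it consolidates that experience offline. The `agent` and `env` interfaces are hypothetical, just to show the structure:

```python
# Hypothetical sketch of a day/night continual-learning cycle.
# Day: act in the environment and collect experience, adapting lightly online.
# Night: consolidate on the whole day's experience offline, then start a new day.

def day_night_cycle(env, agent, n_days=10, steps_per_day=1000, night_epochs=5):
    for day in range(n_days):
        day_buffer = []

        # Day phase: use current knowledge, keep adapting in small steps.
        obs = env.reset()
        for _ in range(steps_per_day):
            action = agent.act(obs)
            next_obs, reward, done, _ = env.step(action)
            day_buffer.append((obs, action, reward, next_obs, done))
            agent.adapt_online(day_buffer[-1])   # fast, lightweight updates
            obs = env.reset() if done else next_obs

        # Night phase: slower, offline consolidation on the day's data.
        for _ in range(night_epochs):
            agent.consolidate(day_buffer)
```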
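And for point 3, a small sketch of how the reward placement could be parameterised: fewer, higher-value rewards clustered in difficult areas, and many lower-value rewards spread widely in easy areas. The numbers and the `place_rewards` helper are made up for illustration:

```python
import random

# Hypothetical reward-placement scheme: harder areas get fewer but more
# valuable rewards, easier areas get many small rewards spread widely.
AREA_CONFIG = {
    # difficulty: (number_of_rewards, value_per_reward, spread_radius)
    "easy":   (30, 1.0, 50.0),
    "medium": (10, 5.0, 20.0),
    "hard":   (3, 25.0, 5.0),
}

def place_rewards(area_centers, config=AREA_CONFIG):
    """Return a list of (x, y, value) reward spawns for each area."""
    spawns = []
    for difficulty, (cx, cy) in area_centers.items():
        count, value, radius = config[difficulty]
        for _ in range(count):
            x = cx + random.uniform(-radius, radius)
            y = cy + random.uniform(-radius, radius)
            spawns.append((x, y, value))
    return spawns

# Example: three area centers on a 2D map.
spawns = place_rewards({"easy": (0, 0), "medium": (100, 0), "hard": (200, 0)})
```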
So yeah, these are my 2 cents on the project, and I am looking forward to future developments. Please leave a comment with any ideas that should be added, any improvements on these ideas, or anything else.