SarvagyaVaish / FlappyBirdRL

Flappy Bird hack using Reinforcement Learning
http://SarvagyaVaish.github.io/FlappyBirdRL

Comments #1

Open SarvagyaVaish opened 10 years ago

SarvagyaVaish commented 10 years ago

Leave your comments here...

xissy commented 10 years ago

Wow, this is amazing. I'm inspired by your practical ML approach.

iandanforth commented 10 years ago

You should get a tapsterbot! https://github.com/hugs/tapsterbot

Aaron1011 commented 10 years ago

This is incredible!

joeyslater commented 10 years ago

That's what's up man.

dend commented 10 years ago

Awesome job, Survy!

halfdan commented 10 years ago

Nice job - please add a proper reference to the source of the pseudocode though. It's clearly taken from a publication.

Giszmo commented 10 years ago

I've never done image analysis, but I assume it would be trivial to do with a camera for your Android bot. You said the image (screenshot) takes 2 seconds to get to the PC? A camera should be much, much faster. The image analysis would basically just need to scan the right side of the screen for green/not-green/green. The timing is constant.

ztl2004 commented 10 years ago

Dude, this is fantastic, and it's what I've been thinking about for a long time. I noticed that you want to do this on mobile. I've studied iOS private APIs and have done screen capture and touch simulation. Do you think there's a possibility that we could work it out together?

bolte-17 commented 10 years ago

Any thought to adding the bird's current velocity (or, as a proxy, the time since the last tap) to the state space? That seems to be the only missing parameter.

ztl2004 commented 10 years ago

But I think it's hard to get.


cbbayburt commented 10 years ago

Actually, simulating the game's dynamics might lead to a simpler and more precise solution. The game doesn't really involve sophisticated decision steps that require ML. Since it is really a pure physics problem, a simpler solution only depends on some simple observations:

[image: flappy]

So the algorithm would be:

hb: The bird's jump height for a single tap, in other words the amplitude of the bird's harmonic motion in level flight (it is constant and can be measured in pixels).
hBird: Height of the middle point of the bird's harmonic motion.
hObstacle: Height of the middle point of the gap between the pipes.
ptap: Waiting period before the next tap.
dh: The height difference between the bird and the obstacle path.

for each immediate uncleared obstacle:
  while(obstacle_not_cleared)
    dh <- hObstacle - hBird
    ptap <- 600 - (600 * dh / hb)
    if ptap < 0 then ptap <- 0  //Gonna fall, tap immediately
    sleep(ptap)
    tap()

This algorithm can make the flappy bird fly forever. For Android, instead of requesting .png screenshots, which really takes about 1-2 seconds, you can analyze specific pixels in the raw frame buffer (some Unix device file like /dev/graphics/fb0), which gives you enough speed to run the algorithm. But for that, you obviously need a rooted device.
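
A minimal Node.js sketch of that pixel-scanning idea (purely illustrative, not code from this repo; the screen size, RGBA8888 layout, and green threshold are assumptions that vary by device):

// Read a raw framebuffer dump pulled from a rooted device, e.g. via
//   adb pull /dev/graphics/fb0 fb0.raw
// and scan one column near the right edge for the green / not-green / green pattern.
const fs = require('fs');

const WIDTH = 1080;          // assumed screen width in pixels
const HEIGHT = 1920;         // assumed screen height in pixels
const BYTES_PER_PIXEL = 4;   // assumed RGBA8888 layout

function pixelAt(buf, x, y) {
  const o = (y * WIDTH + x) * BYTES_PER_PIXEL;
  return { r: buf[o], g: buf[o + 1], b: buf[o + 2] };
}

function isPipeGreen(p) {
  return p.g > 150 && p.r < 120 && p.b < 120;  // crude, assumed color threshold
}

// Returns [gapTop, gapBottom] row indices of the opening between the pipes.
function findGap(buf, x) {
  let sawPipe = false;
  let gapTop = -1;
  for (let y = 0; y < HEIGHT; y++) {
    const green = isPipeGreen(pixelAt(buf, x, y));
    if (green && !sawPipe) sawPipe = true;               // entered the top pipe
    if (!green && sawPipe && gapTop === -1) gapTop = y;  // left the top pipe: gap starts
    if (green && gapTop !== -1) return [gapTop, y];      // hit the bottom pipe: gap ends
  }
  return null;  // no green / not-green / green pattern in this column
}

const fb = fs.readFileSync('fb0.raw');
console.log(findGap(fb, WIDTH - 50));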

SarvagyaVaish commented 10 years ago

Analyzing the "specific pixels in the raw frame buffer" is worth a shot! Thanks! And I agree the solution would be nicer if I simulated the game dynamics, but I wanted to approach the problem using machine learning. Thanks for the solution though.

savraj commented 10 years ago

I'd love a deeper walkthrough of this -- maybe a YouTube video.

metaylor commented 10 years ago

This is very cool. Good idea to pick a popular game and show that ML can solve it! I'm going to bring up your project as a discussion topic in the graduate reinforcement learning class I'm currently teaching.

http://www.eecs.wsu.edu/~taylorm/14_580/index.html

SarvagyaVaish commented 10 years ago

That is awesome!! I am honored. Thanks :) May I ask how you found the link?

metaylor commented 10 years ago

My brother pointed me to it. I'm not sure how he found out about it though.

Best, Matt

ztl2004 commented 10 years ago

Maybe Reddit.


billhao commented 10 years ago

This is very cool!

ataugeron commented 10 years ago

"Get this to work on a mobile phone!! If anyone has any ideas, please let me know in the comments :)"

Did you try using monkeyrunner (Python, Android) or UIAutomation (JavaScript, iOS)?

SarvagyaVaish commented 10 years ago

monkeyrunner took about 1-2 seconds to get a screenshot, so it's not responsive enough. I haven't tried UIAutomation, but do you know if the response time is any better?

cxt120 commented 10 years ago

Does the training only work on a specific map?

SarvagyaVaish commented 10 years ago

There is no "map". There is randomness as far as the pipe height is concerned, but the game is basically just one never ending randomized "map" of pipes coming towards you.

cooperjay commented 10 years ago

I just found another working method over here: http://flappybirdhack.hol.es/

thebino commented 10 years ago

How do you want to grab /dev/graphics/fb0 and use the resulting image for the calculation? Do you want to write something like a TestCase with event injection on the WindowManager?

Eniac-Xie commented 10 years ago

Is Q[s,a] just a large array? Or a function, like a BP neural network?

SarvagyaVaish commented 10 years ago

Yeah. Q is a multi-dimensional array representing the entire state space.
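
For anyone curious, a minimal sketch of such a tabular Q with the standard Q-learning update (the learning rate, discount factor, and 4-pixel bucketing below are illustrative assumptions, not the repo's actual values):

// Q maps a discretized state to one value per action: [do nothing, flap].
const ALPHA = 0.7;   // learning rate (assumed value)
const GAMMA = 1.0;   // discount factor (assumed value)
const Q = {};

function stateKey(dx, dy) {
  // bucket the horizontal and vertical distances to the next pipe opening
  return Math.floor(dx / 4) + '_' + Math.floor(dy / 4);
}

function qValues(s) {
  if (!Q[s]) Q[s] = [0, 0];
  return Q[s];
}

function update(s, a, reward, sNext) {
  const next = qValues(sNext);
  const target = reward + GAMMA * Math.max(next[0], next[1]);
  qValues(s)[a] += ALPHA * (target - qValues(s)[a]);
}

function bestAction(s) {
  const q = qValues(s);
  return q[1] > q[0] ? 1 : 0;  // 1 = flap, 0 = do nothing
}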

Eniac-Xie commented 10 years ago

I'm a little curious. I think the bird's speed should also be considered. I mean, birds at the same position but with different speeds will lead to different results, won't they?

SarvagyaVaish commented 10 years ago

Based on the game dynamics, the bird always gets the same upward velocity irrespective of its velocity at the time of input. So, weirdly enough, two birds at the same position with different speeds will end up at the same position when the user tells them to jump.
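
A tiny sketch of that flap behavior (the constants are made up for illustration; the point is that a tap sets the velocity rather than adding to it):

const FLAP_VELOCITY = -9;  // upward velocity assigned on every tap (assumed value, px per frame)
const GRAVITY = 0.5;       // downward acceleration per frame (assumed value)

function step(bird, tapped) {
  // A tap overwrites the velocity, so two birds at the same height follow
  // identical trajectories after tapping, whatever their speeds were before.
  bird.vy = tapped ? FLAP_VELOCITY : bird.vy + GRAVITY;
  bird.y += bird.vy;
}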

Eniac-Xie commented 10 years ago

Thank you!


Eniac-Xie commented 10 years ago

I tried it myself but found that Q cannot converge in a short time. Maybe my Q is too large (160 x 401 x 2, though it seems that 160 x 401 x 2 is not that large). How large is your Q?

SarvagyaVaish commented 10 years ago

It takes about 6-8 hours at regular game speed for flappy to learn a good model.

andreydung commented 10 years ago

How do you run the code? Is it simply running index.html?

SarvagyaVaish commented 10 years ago

Yeah. Just start up a local server (WAMP, XAMPP) and open index.html.

junzhez commented 10 years ago

Just a quick question. Can the vertical distance to pipe bottom be negative?

SarvagyaVaish commented 10 years ago

Yes, if the bird is below the pipe :)

junzhez commented 10 years ago

Thanks for your reply. May I ask about the dimensions of your state space? I am trying to reproduce your work with another copy of Flappy Bird. It seems that my state space is way too large.

SarvagyaVaish commented 10 years ago

I don't remember exactly, but it was huge! It takes a while to train. Check out http://sarvagyavaish.github.io/FlappyBirdRL/ for more details.

tropicdome commented 10 years ago

Nice work, I love the application of RL to something fun like this, kudos :)

I have tried out your implementation, but with different resolutions, since this can greatly decrease the number of states. Using a resolution of 10 instead of 4 lowered the state space from 12150 states to 1944. Here is my data after running this.

One question: does it, or should it, take the distance to the ground into account? When you get a pipe that is really close to the ground, the bird sometimes wants to go below and then jump, which it obviously can't do, but it doesn't seem to learn from this.

SarvagyaVaish commented 10 years ago

Thanks for crunching the numbers! It's cool to see that the state space affects the learning times so drastically. About the distance from the ground: it's true that the model doesn't learn that it should jump when close to the ground. I didn't want to add another dimension to my state space, and that's primarily why I don't take it into account. For better results, you could probably add a general (non-learned) rule that says the bird must jump when close to the ground. Another idea would be to add that third dimension of distance to the ground, but with only two states in it: less than xx units from the ground, and more than xx units from the ground. That way you would only be doubling the state space, but the system could learn the rule anyway :)
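
A quick sketch of that two-valued extra dimension (GROUND_THRESHOLD and the 4-pixel bucketing are assumed values for illustration):

const GROUND_THRESHOLD = 80;  // px; below this the bird counts as "close to the ground" (assumed)

function stateKey(dx, dy, distToGround) {
  const nearGround = distToGround < GROUND_THRESHOLD ? 1 : 0;
  // The extra dimension only takes two values, so the state space merely doubles.
  return Math.floor(dx / 4) + '_' + Math.floor(dy / 4) + '_' + nearGround;
}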

SteveRik commented 9 years ago

Very nice blog. Thanks for sharing! Is there any possibility that the vertical distance to the pipe bottom could be negative? Please advise. Thanks! https://intellipaat.com/

SarvagyaVaish commented 9 years ago

Yes. It is possible and the model accounts for that :)


xoancosmed commented 9 years ago

Is it open source?

SarvagyaVaish commented 9 years ago

Yes.


AIForex commented 8 years ago

I'm working on a similar program that would involve the Forex market.

Actions per bar would be as follows: 1) buy open, exit close; 2) sell open, exit close; 3) do nothing. Soon to be published on www.marketcheck.co.uk.

Peter peterkhenry@gmail.com

Aytros commented 8 years ago

This is great! I recently graduated with a degree in Comp. Sci. In my last semester I took Intro to AI, and our final project was to implement this on our own; we were provided with a working Python Flappy Bird. My agent was not very efficient, but it did learn a little, so I did well. Now that I have graduated, I would like to improve my agent for my own sake. Would you be able to look over my algorithm and give some feedback on how I might be able to improve?

paulocastroo commented 6 years ago

6-7 hours is not good at all; I got this Flappy Bird bot training in 3 minutes with a random forest. I'll see if I can find some room for improvement in your Q-learning code.

SarvagyaVaish commented 6 years ago

@paulocastroo That's because I was running Flappy Bird in real time using the game engine. If you could speed up the simulation, training would end up being significantly faster. Curious to learn how you used a random forest to train. Let me know :) Thanks!

tropicdome commented 6 years ago

For classic Q-learning, @SarvagyaVaish's implementation is already quite good. It doesn't have to take 6-7 hours: besides the real-time aspect that @SarvagyaVaish mentioned, you could/should optimize your state space representation. For example, change the resolution to e.g. 20 to reduce the state space significantly (which is reasonable) and it will train in under 15 minutes running in real time; with a resolution of 30 it trained for me in 2.5 minutes.
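
A back-of-the-envelope sketch of why the resolution matters so much (the pixel ranges are assumptions, not measurements from the repo):

// Number of discretized (dx, dy) states, times two actions.
function numStates(xRange, yRange, resolution) {
  return Math.ceil(xRange / resolution) * Math.ceil(yRange / resolution) * 2;
}

console.log(numStates(450, 560, 4));   // fine grid: tens of thousands of states
console.log(numStates(450, 560, 30));  // coarse grid: a few hundred states, far faster to train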

paulocastroo commented 6 years ago

@SarvagyaVaish Oh sorry, I wasn't paying attention to the real-time aspect. I made some changes to the states: I tried to compress them as much as possible, and it ended up as the difference/distance between the height of the bird and the pipe hole, making the overall matrix much smaller. Here's a demo: https://planktonfun.github.io/q-learning-js/step-6.html