getnamo / TensorFlow-Unreal-Examples

Drag and drop Unreal Engine TensorFlow examples repository.
MIT License

Pong project #18

Open AnaRhisT94 opened 5 years ago

AnaRhisT94 commented 5 years ago

Have you managed to create a pong project yet?

getnamo commented 5 years ago

The blueprint wrappers and setup are available in this branch: https://github.com/getnamo/tensorflow-ue4-examples/tree/qlearn.

You should be able to play the game, speed up time, switch to the if-statement AI, switch to the Q-learning AI, and more. I never got around to fully training the network, however. Play around with it and see if you can get the AI to a good state.

AnaRhisT94 commented 5 years ago

Yeah, I've currently got the pong game working in a Spyder environment. Hopefully I can trigger the game's keys from a Python script, with the associated picture sent as input to the Python code, and vice versa. I thought about grabbing the screen frames with some win32api tools (some grab_screen function).

Anyway, I'm going to try all the tools soon and will post an update when I've finished with the pong. Thanks!

getnamo commented 5 years ago

In the example above, instead of learning on the game's pixels, it plays the game in UE4 and sends the ball and player locations as inputs. This should drastically reduce the network size needed (more Q-learning than deep Q).

If you want to train on pixels while still using UE4 to simulate the environment, consider using scene capture (https://docs.unrealengine.com/en-us/Resources/ContentExamples/Reflections/1_7) to render to a texture at the desired size and then sending that through as input.

AnaRhisT94 commented 5 years ago

Yeah, well, I've got a task to make a different game, with enemies spawning at random locations and moving towards you; the idea is to move the pole up (rotate it up) so it won't bump into an enemy. So sending the location of the ball/enemy would probably be insufficient here, though I'm not quite sure. I'll use the Scene Capture 2D. Hopefully I'll manage to get something working eventually.

AnaRhisT94 commented 5 years ago

getnamo, I'm trying to get a crucial thing to work and I'm not sure how to do it: how can I send an image and a variable's value to Python at the same time?

I want to feed them to a neural network.

getnamo commented 5 years ago

https://github.com/getnamo/tensorflow-ue4#any-ustruct-example is the key part. You want to encode your image as an array of floats, combine that array with any other variables you want in Python into a struct, encode the struct as JSON, and send that to the Python layer.

There are helper functions available that convert textures to float arrays: https://github.com/getnamo/tensorflow-ue4/blob/master/Source/TensorFlow/Public/TensorFlowBlueprintLibrary.h#L20. These are blueprint-accessible from anywhere.

The greyscale version is used for MNIST recognition inside the TensorflowComponent (it's a blueprint component, so you can inspect it).

For reference, here's the Send Json Input Texture function:

[screenshot: Send Json Input Texture blueprint graph]

ImageStruct is a blueprint-defined struct; you can use it if it fits your needs or make your own.

[screenshot: ImageStruct definition]

Then, in a simple example, only the 'pixels' property is used, e.g. for mapping the input in MNIST: https://github.com/getnamo/tensorflow-ue4-examples/blob/master/Content/Scripts/mnistTutorial.py#L26
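For a rough picture of what arrives on the Python side, here's a minimal sketch of a handler; the 'reward' field and the function name are made-up illustrations (mnistTutorial.py itself only reads 'pixels'):

import json
import numpy as np

def on_json_input(json_string):
    data = json.loads(json_string)
    pixels = np.asarray(data['pixels'], dtype=np.float32)  # image encoded as a float array
    reward = float(data.get('reward', 0.0))                # any extra struct variable
    # feed (pixels, reward) into your network from here
    return pixels, reward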

AnaRhisT94 commented 5 years ago

When loading it with UE4: [screenshot of the error]

Can't open the ue4-examples... [screenshot of the error]

getnamo commented 5 years ago

Read the directions in the release thoroughly, e.g. https://github.com/getnamo/tensorflow-ue4-examples/releases/tag/0.4.1; you need to copy the matching plugin into your project root (which is https://github.com/getnamo/tensorflow-ue4/releases/tag/0.10.1).

I've added a section under troubleshooting in the readme to help clarify this for future users: https://github.com/getnamo/tensorflow-ue4-examples/blob/master/README.md#startup-error

AnaRhisT94 commented 5 years ago

Thanks for the quick answer! Now a new problem again... it probably happens because I didn't install cuDNN. I'll try the CPU version first and update soon! [screenshot of the error]

getnamo commented 5 years ago

Please read the instructions (https://github.com/getnamo/tensorflow-ue4#installation--setup); you need to wait until the dependencies have installed.

AnaRhisT94 commented 5 years ago

Yeah, I used the CPU version now and it works; it trains and then predicts successfully. Thanks!

I'm actually able to do almost everything needed to complete the task; just one thing is left to accomplish. I'll use the Python winapi to send key inputs to the game, plus some code to grab the screen. The problem I'm facing is how to send a variable's value into a Python variable. This is the most crucial part, because I need to send rewards for this to work.

And I'm still not sure how to do the following: when there's, for example, a collision/overlap between the player and an enemy, send a reward of -1; when there's a collision with the floor, send a reward of +1; and so on.

StockySpark commented 5 years ago

Hello. Thank you for the awesome TF plugin for UE4. I'm trying to use it to provide plausible route planning for pedestrian agents. I'm still searching for the right model, but RL looks promising. How would you train the network in this PongAI example?

getnamo commented 5 years ago

Keep in mind that I'm not a machine learning expert, and you may need to look up some more recent resources for best practices.

A typical reinforcement learning method is deep Q-learning (DQN). The pong example uses it here (NB: for this particular problem it's overkill, but it's meant to serve as an example): https://github.com/getnamo/tensorflow-ue4-examples/blob/qlearn/Content/Scripts/PongAI.py. There are also random-input and basic if-statement AI scripts in the same folder to train against and to validate performance.

Inside PongAI.py you will notice the runJsonInput function, which is called from blueprint with the Unreal game state as input. For the pong game the state is the ball's X and Y position and the paddle position (height): https://github.com/getnamo/tensorflow-ue4-examples/blob/f32b9e27589f6cbdeccafb3cf890fb3d5bb7511e/Content/Scripts/PongAI.py#L72. This is stacked for the last 200 frames so that the AI can learn some temporal features.
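As an illustration of the stacking idea (the names and the padding choice here are mine, not exactly what PongAI.py does):

from collections import deque
import numpy as np

STACK_SIZE = 200  # keep the last 200 frames, as described above

frames = deque(maxlen=STACK_SIZE)

def observe(ball_x, ball_y, paddle_y):
    frames.append((ball_x, ball_y, paddle_y))
    while len(frames) < STACK_SIZE:      # pad with the oldest frame until the buffer fills
        frames.appendleft(frames[0])
    return np.asarray(frames, dtype=np.float32).flatten()  # shape: (STACK_SIZE * 3,)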

The current reward, along with the last action executed, is dequeued so that we can correlate the reward with the observation (game state). This is sent through a training step (https://github.com/getnamo/tensorflow-ue4-examples/blob/f32b9e27589f6cbdeccafb3cf890fb3d5bb7511e/Content/Scripts/PongAI.py#L88), which should adjust the model weights a little based on performance. Doing this many times will, over time, reinforce better actions and sequences that maximize reward.
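The correlation trick can be sketched like this (hypothetical names and a fixed-delay assumption; the real queue handling lives in PongAI.py):

from collections import deque

REWARD_DELAY = 4  # assumption: the reward shows up a few ticks after the action

pending = deque()

def on_tick(observation, action, reward, model):
    # queue what we just did; train on the pair whose delayed reward has now arrived
    pending.append((observation, action))
    if len(pending) > REWARD_DELAY:
        old_obs, old_action = pending.popleft()
        model.train_step(old_obs, old_action, reward)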

We then take the newest observations and request the AI's next action: https://github.com/getnamo/tensorflow-ue4-examples/blob/f32b9e27589f6cbdeccafb3cf890fb3d5bb7511e/Content/Scripts/PongAI.py#L94. The selected action is fed back to blueprint as a float value inside a JSON object (https://github.com/getnamo/tensorflow-ue4-examples/blob/f32b9e27589f6cbdeccafb3cf890fb3d5bb7511e/Content/Scripts/PongAI.py#L105), where the game executes it on the next game tick.

That's largely it. Run this loop: tick the game, send the state, run a training step on the last state and reward, get the next action, and send that action back to blueprint to act out; repeat. After a long time it should train, provided the reward is well chosen and we've picked a good enough observation structure and other DQN parameters. Note that traditional DQN inputs are pixels, like simplified Atari game screens, and the network has to learn by itself what those pixels mean. In this example the state is much, much smaller, so a DQN is overkill and a regular Q-learning setup would likely work.
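Put together, one tick of that loop might look like this sketch (TinyAgent and the field names are placeholders, not the plugin's API):

import random

class TinyAgent:
    """Stand-in for the DQN agent in PongAI.py."""
    def __init__(self, num_actions=3, epsilon=0.1):
        self.num_actions = num_actions
        self.epsilon = epsilon   # exploration rate
        self.last_obs = None

    def train_on_last(self, obs, reward):
        if self.last_obs is not None:
            pass  # real code: one gradient step on (last_obs, reward, obs)

    def next_action(self, obs):
        self.last_obs = obs
        if random.random() < self.epsilon:
            return random.randrange(self.num_actions)  # explore
        return 0  # real code: argmax over predicted Q-values

def on_game_tick(game_state, reward, agent):
    obs = (game_state['ballX'], game_state['ballY'], game_state['paddleY'])
    agent.train_on_last(obs, reward)                  # train step on last state + reward
    return {'action': float(agent.next_action(obs))}  # JSON back to blueprint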

What remains is to select a good reward system and figure out how to speed up training. It's likely that you can reshape the loop to take in results from many instances of the game running in parallel (e.g. a bunch of pong games in the same level) and update all of their observations, rewards, training steps, and next actions at the same time. This should drastically increase the speed of training.
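A sketch of that batching idea, assuming a model whose predict/train calls accept a batch dimension (all names here are hypothetical):

import numpy as np

def step_all_games(model, states, rewards, last_actions):
    batch = np.stack(states)                        # (num_games, state_dim)
    model.train_step(batch, last_actions, rewards)  # one update covering every game
    q_values = model.predict(batch)                 # (num_games, num_actions)
    return q_values.argmax(axis=1)                  # next action per game instance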

Hopefully that gives you an idea of how this is set up. I'd point you to other general resources such as https://medium.freecodecamp.org/an-introduction-to-deep-q-learning-lets-play-doom-54d02d8017d8 and recommend adapting the network to your problem by following appropriate modern guides to get good results. I never spent the time to get this example fully trained, so YMMV.

StockySpark commented 5 years ago

Thanks again for the answer. I managed to hook the DQN up to my project and implemented the sensor and reward, so now I need to start training it. The save/load model functionality would be very useful, so I tried to use mnistSaveLoad.py as an example. You can see the full .py script here: https://yadi.sk/d/On6_nRs6WerrtA. Unfortunately, I get errors in UE4 when I call this custom function:

def saveModel(self, empty):
    # save the current session's weights to self.model_path
    with self.sess.as_default():
        with self.graph.as_default():
            save_path = self.saver.save(self.sess, self.model_path)
            print("Agent's model was saved in file: %s" % save_path)

It seems that I don't understand how to initialize the value W (and the other values in the DQN's TensorFlow CNN). [screenshot of the uninitialized-value error] Can you please help me with the DQN save/load feature?

getnamo commented 5 years ago

Saving and loading are largely vanilla TensorFlow functions; please see guides such as https://www.tensorflow.org/tutorials/keras/save_and_restore_models. See also this related Stack Overflow question: https://stackoverflow.com/questions/46514138/tensorflow-attempting-to-use-uninitialized-value-w-in-a-class
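A minimal sketch of the pattern, assuming TF1-style graph/session code like PongAI.py: the key point for the "uninitialized value W" error is that the session must run an initializer (or a restore) before any save or inference.

import tensorflow as tf

class Agent:
    def __init__(self, model_path):
        self.model_path = model_path
        self.graph = tf.Graph()
        with self.graph.as_default():
            self.W = tf.get_variable('W', shape=[4, 2])  # illustrative variable
            self.saver = tf.train.Saver()        # create after all variables exist
            init_op = tf.global_variables_initializer()
        self.sess = tf.Session(graph=self.graph)
        self.sess.run(init_op)                   # initialize before saving/running

    def save_model(self):
        with self.graph.as_default():
            path = self.saver.save(self.sess, self.model_path)
            print("Model saved to %s" % path)

    def load_model(self):
        with self.graph.as_default():
            self.saver.restore(self.sess, self.model_path)  # restore replaces the init values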