-
This code implements a tic-tac-toe game in which two agents play against each other. One of the agents follows a machine-learning approach called Q-learning to improve its moves over tim…
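A minimal sketch of the tabular Q-learning update such an agent might use (the state encoding and helper names below are illustrative assumptions, not the project's actual code):

```python
import random
from collections import defaultdict

# Q-table mapping (board_state, action) -> estimated value.
# board_state is assumed to be a tuple of 9 cells ('X', 'O', or ' ');
# action is the index (0-8) of the cell to play.
Q = defaultdict(float)

ALPHA = 0.1    # learning rate
GAMMA = 0.9    # discount factor
EPSILON = 0.1  # exploration rate

def choose_action(state, available_moves):
    """Epsilon-greedy selection over the empty cells."""
    if random.random() < EPSILON:
        return random.choice(available_moves)
    return max(available_moves, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state, next_moves):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max((Q[(next_state, a)] for a in next_moves), default=0.0)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```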
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Feature Description
The project aims to develop a reinforcement learning (RL) agent to optimize waste collecti…
-
### Expected behavior
When the script is executed, the SX operation should be applied to the qubit at index 1, as specified in the circuit definition. The QASM string generated from the circuit shoul…
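A minimal reproduction of the described setup might look like the following. This is a sketch assuming a recent Qiskit version where `qiskit.qasm2.dumps` is available (older versions exposed the equivalent via `QuantumCircuit.qasm()`); the exact circuit from the report is not shown here.

```python
from qiskit import QuantumCircuit, qasm2

# Build a two-qubit circuit and apply SX to the qubit at index 1,
# as described in the expected behavior above.
qc = QuantumCircuit(2)
qc.sx(1)

# The generated OpenQASM 2.0 string should contain an `sx q[1];` instruction.
print(qasm2.dumps(qc))
```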
-
https://geektutu.com/post/tensorflow2-gym-q-learning.html
TensorFlow 2.0 introductory series, part 7: using Q-Learning to play the OpenAI Gym game MountainCar-v0.
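The article's own code is not reproduced here; the following is a rough sketch of tabular Q-learning with a discretized observation space, which is one common way to apply Q-learning to MountainCar-v0 (bin counts, hyperparameters, and the pre-0.26 `gym` step API are assumptions).

```python
import numpy as np
import gym  # older gym API: reset() returns obs, step() returns a 4-tuple

env = gym.make("MountainCar-v0")

# Discretize the continuous (position, velocity) observation into bins
# so a tabular Q-learning agent can be used.
N_BINS = (18, 14)
low, high = env.observation_space.low, env.observation_space.high
bin_width = (high - low) / N_BINS

def discretize(obs):
    idx = ((obs - low) / bin_width).astype(int)
    return tuple(np.minimum(idx, np.array(N_BINS) - 1))

q_table = np.zeros(N_BINS + (env.action_space.n,))
alpha, gamma, epsilon = 0.1, 0.99, 0.1

for episode in range(2000):
    state = discretize(env.reset())
    done = False
    while not done:
        if np.random.random() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q_table[state]))
        obs, reward, done, _ = env.step(action)
        next_state = discretize(obs)
        # Q-learning update toward the greedy bootstrap target.
        target = reward + gamma * np.max(q_table[next_state])
        q_table[state + (action,)] += alpha * (target - q_table[state + (action,)])
        state = next_state
```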
-
I see that you are using a 0 vector for the rewards, and only updating the value that corresponds to the action here:
https://github.com/AxiomaticUncertainty/Deep-Q-Learning-for-Tic-Tac-Toe/blob/c5c0…
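The pattern being questioned appears to be constructing the training target as a zero vector and writing the bootstrapped value only at the taken action's index; a common alternative is to copy the network's own predictions so the untouched actions are not pulled toward zero. A rough sketch of both, with all names here being illustrative rather than the linked repository's code:

```python
import numpy as np

def target_zero_vector(q_pred, action, reward, q_next, gamma=0.9):
    """The pattern referred to above: start from a zero vector and set only the
    taken action's entry. Fitting this also drags the other actions' Q-values toward 0."""
    target = np.zeros_like(q_pred)
    target[action] = reward + gamma * np.max(q_next)
    return target

def target_copy_predictions(q_pred, action, reward, q_next, gamma=0.9):
    """A common alternative: copy the network's current predictions so only the
    taken action's entry produces a learning signal."""
    target = q_pred.copy()
    target[action] = reward + gamma * np.max(q_next)
    return target
```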
-
Hello,
I recently read your article "Microservice Deployment in Edge Computing based on Deep Q Learning" and reviewed your open-source code. I would like to know how you integrated your code into a Kub…
-
@enricoande
[1] https://github.com/enricoande/reinforcement_learning_examples/blob/95627db2a323535153e711a23f5519ecf7409f38/invertedpendulum/Sarsa/episodeFA.m#L35
It appears that here `phi` cor…
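For context, the linked file implements Sarsa with function approximation, where `phi` presumably denotes a feature vector. The standard Sarsa update with linear function approximation is, roughly (a general sketch in Python, not necessarily the exact form used in episodeFA.m):

```python
import numpy as np

def sarsa_fa_update(theta, phi, reward, phi_next, alpha=0.1, gamma=0.99):
    """One Sarsa update with linear function approximation.

    theta    : weight vector
    phi      : feature vector of the current state-action pair
    phi_next : feature vector of the next state-action pair
    """
    q_sa = theta @ phi
    q_next = theta @ phi_next
    td_error = reward + gamma * q_next - q_sa
    return theta + alpha * td_error * phi
```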
-
### Search before asking
- [X] I searched in the [issues](https://github.com/apache/pulsar/issues) and found nothing similar.
### Read release policy
- [X] I understand that unsupported versions d…
-
Hi Lucas,
I am implementing different algorithms on the different nets provided in the library, but I want to simulate the network with fixed timing and compare the reward function output for differe…