-
There are several ways agents might want to use as much time as possible to improve their performance (training neural networks/other machine learning, Monte Carlo tree search, ...). I read that a rea…
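
A minimal sketch of the "use all the time you are given" idea, assuming a simple anytime loop: the agent keeps refining noisy estimates until a deadline passes, then returns its best guess so far. The names `anytime_best_move`, `evaluate`, and `budget_s` are hypothetical and do not come from any particular framework.

```python
import random
import time


def anytime_best_move(moves, evaluate, budget_s):
    """Sample evaluations of candidate moves until the time budget runs out.

    moves:    list of candidate moves (hypothetical, any hashable values)
    evaluate: function returning a (possibly noisy) score for a move
    budget_s: wall-clock budget in seconds
    """
    deadline = time.monotonic() + budget_s
    totals = {m: 0.0 for m in moves}
    counts = {m: 0 for m in moves}

    # Evaluate every move once so each one has at least one sample.
    for m in moves:
        totals[m] += evaluate(m)
        counts[m] += 1

    # Spend the remaining time collecting more samples.
    while time.monotonic() < deadline:
        m = random.choice(moves)
        totals[m] += evaluate(m)
        counts[m] += 1

    # Return the move with the highest average score so far.
    return max(moves, key=lambda m: totals[m] / counts[m])
```

A larger budget only adds samples, so the answer can be requested at any time; Monte Carlo tree search follows the same pattern, with a smarter rule for choosing which move to sample next.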
-
Trying to debug larger width environments (7 currently).
Things to try:
1. Different metric (average Q-value from the 2013 paper https://arxiv.org/pdf/1312.5602.pdf).
```
5.1 Training and Sta…
```
-
-
input :
Let's imagine a self-hosted, self-modifying, self-improving, expanding system based on GNU/Linux, Erlang, and Elixir that uses langchainex to improve itself and Coq to prove the improvements are c…
-
Assuming the current state is s0, what should I do when I only want to obtain the later states s1 and s2 without actually moving the agent into those states? I think this operation is similar to the…
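
One common way to look ahead without committing, sketched here under the assumption that the environment object can be deep-copied: step a copy and leave the original untouched. `CounterEnv` and `peek` are hypothetical names used only for illustration; a real simulator may instead need its own save/restore mechanism.

```python
import copy


class CounterEnv:
    """Hypothetical toy environment: the state is an integer position."""

    def __init__(self):
        self.state = 0  # s0

    def step(self, action):
        # Move by `action` and return the new state.
        self.state += action
        return self.state


def peek(env, actions):
    """Return the states reached by `actions` without changing `env`."""
    sim = copy.deepcopy(env)  # all stepping happens on the copy
    return [sim.step(a) for a in actions]


env = CounterEnv()
future = peek(env, [1, 1])  # states s1, s2 as seen from s0
```

After the call, `env.state` is still the original s0, so the real agent has not moved.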
-
-
### Team member:
Relja Radeka SV40/2020, group 4
### Assistant
Marko Njegomir
### Problem being solved
The goal is to build an AI for the board game Go, using the Monte Carlo algorithm, that will play…
-
I have constructed a deck full of dinosaurs with Enrage abilities (you get some benefit when the dinosaur takes damage) and some spells/abilities that do a small amount of damage (usually 1) to either a…
-
Eventually, the display portion will show the same level of recommendation (i.e. the same colour) in every legal move spot on the board.
In this diagram, you can see that all the explored moves so far hav…
-
## Team members
| Full name | Student ID | Group |
| --------------- | -------------- | ------- |
| Nedeljko Vignjević | SW-59-2018 | 4 |
| Dalibor Malić | SW-50-2018 | 4 |
## Asiste…