-
- **Model-Free vs Model-Based RL**
> 1) A **model-based** algorithm is one that uses _the transition function_ (and _the reward function_) to estimate the optimal policy.
> The age…
-
Can someone check whether the [shaders_mesh_instancing](https://www.raylib.com/examples/shaders/loader.html?name=shaders_mesh_instancing) example works from Zig? I have the following code based on the example abo…
-
When loading Models with a Mesh, the ownership is not clear and the mesh(es) get double-freed.
Logs
```bash
INFO: TEXTURE: [ID 2] Texture loaded successfully (128x128 | GRAY_ALPHA | 1 mipmaps…
-
I would like to ask for your advice on the following two questions.
1. DPO training does not seem to support DeepSpeed ZeRO. After manually integrating `DPOAlignerArguments` with the `FinetunerArguments…
-
I'm encountering a CUDA out of memory error while running reinforcement learning (RL) using the mol2mol_similarity model on a SageMaker **ml.g5.xlarge** instance (**24GB GPU memory**). The RL process …
-
Package architecture:
- controllers:
> class control: PID, pure pursuit, bang-bang, open-loop(velocity profile), ...
> optimal control: lqr, ddp, mpc, ...
> collision avoidance: RVO, O…
-
I have seen that https://gradio.app/ is used in the UIs for Hugging Face. @wang-boyu have you looked into it, since it is listed in one of the possible frameworks to use in the GSoC wiki?
See also ht…
-
I was trying to reproduce the experiment and test the various LLMs. I tried both ChatGPT-4 and ChatGPT-3.5, and set the game difficulty to 2. I ran the game 3 to 4 times and the LLM lost every time. …
-
I used the following two commands to identify broken links. `markdown-link-check` is https://github.com/tcort/markdown-link-check
``` bash
find ./Practical_RL/ -type f -name '*.ipynb' -exec jupyt…
-
Hi,
First of all, thanks for your nice work.
I was trying to run your go engine on my server, which has about 120 GiB memory.
It went all right until I tried to train with the provided dataset.
…