-
**Describe the bug**
An error occurs when running multi-node LoRA fine-tuning:
```
failed (exitcode: -11) local_rank: 5 (pid: 11514) of binary: /home/jovyan/data-ws-enr/zconda/envs/swift_ft/bin/python
Traceback (most recent call last):
File…
```
-
Got an InvalidArgumentError after 26 minutes of training. I upgraded to the most recent TensorFlow as suggested and ran `pip install -U 'gym[all]' tqdm scipy`. I ran this on a Titan X and Ubuntu 16.1…
-
I want to train the robot in very few steps and very quickly in terms of wall time, but I haven't completed a training run on the robot yet. I should do that first as a sanity check, to make sure there is n…
-
```
AGENT NAME: A3C
1.1: A3C
TITLE CartPole
layer info [20, 10, [2, 1]]
layer info [20, 10, [2, 1]]
{'learning_rate': 0.005, 'linear_hidden_units': [20, 10], 'final_layer_activation': ['SOFTMAX', …
```
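If it helps to decode the dump: here is a minimal sketch of how a config like this could map onto a network, assuming a plain PyTorch MLP with hidden layers `[20, 10]` and a softmax head. `build_mlp` is illustrative, not the library's actual builder.

```python
import torch
import torch.nn as nn

def build_mlp(input_dim, hidden_units, output_dim):
    """Illustrative builder: ReLU hidden layers followed by a softmax
    head, mirroring the [20, 10] + SOFTMAX config in the dump above."""
    layers = []
    last = input_dim
    for units in hidden_units:
        layers += [nn.Linear(last, units), nn.ReLU()]
        last = units
    layers += [nn.Linear(last, output_dim), nn.Softmax(dim=-1)]
    return nn.Sequential(*layers)

# CartPole: 4 observation features in, 2 discrete actions out.
policy = build_mlp(input_dim=4, hidden_units=[20, 10], output_dim=2)
probs = policy(torch.randn(1, 4))  # action probabilities summing to 1
```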
-
I have been playing around with the DCBTrainer and found some potential inconsistencies.
1) **StatlogData** example found [here](https://genrl.readthedocs.io/en/latest/usage/tutorials/bandit/contex…
-
Hi,
Taking the "battlefield" for example, how can I control the red team only?
Then how can I change the number of the agents?
Thanks!
-
Hey Jiri,
I wonder if you could give some guidance on how to use keras-rl to create your own "gym" environment.
For example, I see that your board_gym.py is based on core.py, but what for…
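In case it helps, a minimal sketch of a custom Gym-style environment that keras-rl agents can consume, assuming the legacy `gym.Env` interface (reset returns an observation; step returns a 4-tuple) that keras-rl expects. `BoardEnv` and its board size are hypothetical, not taken from board_gym.py.

```python
import numpy as np
import gym
from gym import spaces

class BoardEnv(gym.Env):
    """Hypothetical 3x3 board environment exposing the standard
    gym.Env interface that keras-rl agents expect."""

    def __init__(self):
        self.action_space = spaces.Discrete(9)   # one move per cell
        self.observation_space = spaces.Box(
            low=-1, high=1, shape=(9,), dtype=np.float32)
        self.board = np.zeros(9, dtype=np.float32)

    def reset(self):
        self.board[:] = 0
        return self.board.copy()

    def step(self, action):
        done = False
        if self.board[action] == 0:        # legal move: mark the cell
            self.board[action] = 1.0
            done = bool(self.board.all())  # episode ends when board is full
            reward = 1.0 if done else 0.0
        else:                              # illegal move: small penalty
            reward = -1.0
        return self.board.copy(), reward, done, {}
```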
-
I currently have the problem that, a lot of times, the results Optuna optimization produces are not really optimal, due to the stochastic nature of RL training. For example, training 3 agents with…
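One common mitigation, sketched below, is to average the objective over several seeds per trial so a single lucky or unlucky run can't dominate. The `train_agent` routine here is a hypothetical stand-in for a real training run, not Optuna's API.

```python
import math
import random
import statistics
import optuna

def train_agent(lr: float, seed: int) -> float:
    """Stand-in for a real RL training run: returns a noisy score
    that peaks near lr = 1e-3 (purely illustrative)."""
    rng = random.Random(seed)
    return -abs(math.log10(lr) + 3.0) + rng.gauss(0.0, 0.5)

def objective(trial: optuna.Trial) -> float:
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-2, log=True)
    # Average over several seeds to smooth out RL training noise.
    scores = [train_agent(lr, seed=s) for s in range(3)]
    return statistics.mean(scores)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
```

This trades more compute per trial for a less noisy objective, which usually matters more in RL than running extra trials.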
-
I wanted to mess around with imitation learning on a simple lane-following expert. Based on the README, I thought this would be easy to test out, but I had to edit several parts of the code, like to d…
-
## Hypothesis
The authors of the recently published AlphaZero research stated that this technique could easily be generalised to other problems without significant human effort, and that it approached better th…