-
Hello, I am trying to pre-train the actor model, but around the 815th or 816th example the training stops and shows a very long error message. I had already trained the reward model, so I have been using…
-
Hi,
I followed the installation instructions for Windows, and it runs fine without "--load_8bit=True".
I am now trying to get it to run with "--load_8bit=True" by following the extra instructions:
pip uninstall bits…
-
**Is your feature request related to a problem? Please describe.**
Defining a reward function may be complex or simply impossible in some cases (e.g., an agent doing a backflip or walking naturally) or, i…
-
- Behavioral Strategy Determines Frontal or Posterior Location of Short-Term Memory in Neocortex
- https://pubmed.ncbi.nlm.nih.gov/30100254/
- Mixture of Learning Strategies Underlies Rodent Beha…
-
[Going to try this again](https://github.com/Microsoft/msbuild/issues/16), but hopefully with a better-defined ask. The request is not to support a particular format, but to improve the MSBuild syste…
-
Trying to run stack-llama [rl_training script](https://github.com/lvwerra/trl/blob/main/examples/stack_llama/scripts/rl_training.py) with the following [reward model](https://huggingface.co/kashif/lla…
-
### System Info
- `transformers` version: 4.21.2
- Platform: Linux-5.10.135-122.509.amzn2.x86_64-x86_64-with-glibc2.2.5
- Python version: 3.8.5
- Huggingface_hub version: 0.10.0
- PyTorch versi…
-
```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 22.20 GiB total capacity; 20.67 GiB already allocated; 4.12 MiB free; 21.14 GiB reserved in total by PyTorch) I…
-
### 1. What is your project? (max 100 words)
(Our project was called FlowModel during the Chainlink Spring 2022 Hackathon and has since been renamed BlockModel.)
BlockModel is an R&D infrastructure …
-
### 🐛 Describe the bug
Relevant logs:
WARNING:torch.distributed.run:
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid you…