reward-models Search Results

1000+ results for reward-models

QwenLM/Qwen2 #685

Potential use cases for Qwen-0.5B

What are some of the intended use cases for the 0.5B model? There are not many other similarly sized models, nor is there much hype around them. Though the general audience seems to love th…

Tejaswgupta updated 3 weeks ago
2
RLHFlow/Online-RLHF #4

Fail to load weight from pair-preference-model-LLaMA3-8B

Hi, congratulations on the great work and thanks for open-sourcing it! I am running step 3.2 with pair-preference-model-LLaMA3-8B. However, I encountered the warning "Some weights of LlamaForSequenceCl…

matouk98 updated 2 weeks ago
2
ExplainableML/ReNO #4

SD-2.1

Excellent job. If you migrate it to 50-step SD-2-1, does it still work well?

zhou431496 updated 2 weeks ago
3
huggingface/trl #1783

Clarification on reward/value heads in PPOV2

First, thank you for your efforts in helping to bring accurate and performant RLHF techniques to the open-source community. I'm raising this issue hoping to get some clarification on a couple implem…

SalmanMohammadi updated 2 weeks ago
3
OpenRLHF/OpenRLHF #295

QLORA model loading error

Hi team, I am getting the following error while enabling 4-bit and LoRA: ``` File "/root/miniconda3/envs/open/lib/python3.11/site-packages/deepspeed/runtime/engine.py", line 262, in __init__ self._c…

karthik-nexusflow updated 1 month ago
5
bagisto/bagisto-reward-points #1

conflict between cart model in bagisto-reward-points and car…

Declaration of Webkul\Rewards\Models\Cart::items() must be compatible with Webkul\Checkout\Models\Cart::items(): Illuminate\Database\Eloquent\Relations\HasMany in /var/www/vhosts/hqol.store/httpdocs/v…

Hossam1Hamed updated 5 months ago
3
DongChen06/MARL_CAVs #41

Regarding the running errors of run_madqn

Thank you for your open-source materials. I have also tried to successfully run the run_mappo and run_maacktr models, but encountered an error while running the run_madqn model: **self. memory. push (…

danke93 updated 1 month ago
2
OpenLMLab/MOSS-RLHF #24

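Question about merging the Chinese reward-model parameters

Thanks to the authors for generously open-sourcing this. The official README says the Chinese reward-model is based on open-chinese-llama-7b, but the later step-by-step instructions say: python merge_weight_zh.py recover --path_raw decapoda-research/llama-7b-hf --path_diff ./models/moss-rlhf-reward-model-7B-z…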

hannlp updated 4 months ago
4
nicklashansen/tdmpc2 #23

RuntimeError When Loading State_Dict for Single-task Models

I am trying to run the model that was downloaded from [huggingface](https://huggingface.co/nicklashansen/tdmpc2/tree/main/dmcontrol) using the following command: ``` python evaluate.py task=humanoid…

Zzl35 updated 3 months ago
5
OptimalScale/LMFlow #861

[BUG] The text cannot be generated successfully during the R…

**Describe the bug** When I use the fine-tuned LLAMA3 model to run the `examples/raft_align.py` script, I encountered the following error: ``` Traceback (most recent call last): File "/home/work…

biaoliu-kiritsugu updated 3 weeks ago
1
