abaheti95 / LoL-RL

Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients

No module named 'utils.data_utils' #2

Open popoala opened 5 months ago

popoala commented 5 months ago

While running:

python lolrl_qlora_llama_hh.py --sampling_strategy good_priority

the log shows the following error:

[2024-03-19 18:59:01,658] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
  File "path/to/LoL-RL/lolrl_qlora_llama_hh.py", line 28, in <module>
    from utils.rl_utils import ValueHeadMLP, ValueHeadAttention, numba_choice
  File "path/to/LoL-RL/utils/rl_utils.py", line 917, in <module>
    from utils.data_utils import get_all_thresholds_for_metric
ModuleNotFoundError: No module named 'utils.data_utils'

The project doesn't seem to have a function named get_all_thresholds_for_metric. Do you know how to fix it?

abaheti95 commented 5 months ago

Hey @popoala, thank you for bringing this issue to my attention. That part of the code is from another project and is not required for running LoL-RL. I have updated the repo. Please let me know if you run into any more issues with the code, and I will fix them as soon as possible.
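
For anyone hitting the same error on an older checkout, one possible workaround is to guard the unused import so that a missing module does not abort the script. The sketch below is only an illustration and is not the change that was made in the repo: the helper name optional_import is hypothetical, and only the module and function names are taken from the traceback above.

```python
import importlib


def optional_import(module_name, attr_name):
    """Return `module_name.attr_name` if it can be imported, otherwise None.

    Hypothetical helper for illustration; it is not part of LoL-RL.
    """
    try:
        module = importlib.import_module(module_name)
    except ModuleNotFoundError:
        # The module (or one of its parent packages) is not available.
        return None
    # The module exists; the attribute may still be missing.
    return getattr(module, attr_name, None)


# Names taken from the traceback in this issue. The result is None when
# utils/data_utils.py is absent, so the rest of the file can still load.
get_all_thresholds_for_metric = optional_import(
    "utils.data_utils", "get_all_thresholds_for_metric"
)
```

Any code that really needs the function would still have to check for None before calling it. Since the import is not required for running LoL-RL, pulling the updated repo is the simpler fix.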