Open xinghuang2050 opened 5 months ago
Hi,
This file should appear in your local folder (under where you started the script) if the generation pipeline has run successfully. Please check for any errors in the generation process.
Yes. It is generated by vllm and PairRM and is automatically included in our pipeline.
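One way to confirm that the generation step actually produced the dataset is to check for it on disk before training starts. The following is a minimal sketch, assuming the pipeline writes the dataset as a directory named after the dataset under the folder the script was started from; `resolve_local_dataset` is a hypothetical helper and is not part of the SPPO code. When the name passed to `load_dataset` does not match a local path, the `datasets` library tries the Hub instead, which is consistent with the ConnectionError in the traceback.

```python
from pathlib import Path


def resolve_local_dataset(name: str, base_dir: str = ".") -> Path:
    """Return the local dataset directory, or fail early with a clear message.

    Hypothetical helper: assumes the generation pipeline saved the dataset
    as a directory called `name` directly under `base_dir`.
    """
    candidate = Path(base_dir) / name
    if not candidate.is_dir():
        # Failing here is easier to diagnose than a later Hub lookup error.
        raise FileNotFoundError(
            f"Expected a local dataset directory at '{candidate}'. "
            "Check the generation and ranking steps for errors."
        )
    return candidate
```

With the `datasets` library installed, the returned path could then be passed to the loader, for example `load_dataset(str(resolve_local_dataset("synthetic_data_llama-3-8b-instruct-sppo-iter3_score")), split="train")`, so that a missing folder is reported before any Hub lookup is attempted.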
Great work!
I commented out all the push_to_hub calls in the code. Is the synthetic_data_llama-3-8b-instruct-sppo-iter3_score dataset generated by PairRM?
rank4: Traceback (most recent call last):
rank4:   File "/training-data/huangxing/software/SPPO/sppo/run_dpo.py", line 249, in
rank4:   File "/training-data/huangxing/software/SPPO/sppo/run_dpo.py", line 43, in main
rank4:     main_inner(model_args, data_args, training_args)
rank4:   File "/training-data/huangxing/software/SPPO/sppo/run_dpo.py", line 78, in main_inner
rank4:     raw_datasets = get_datasets(data_args, splits=["train"])
rank4:   File "/training-data/huangxing/software/SPPO/sppo/alignment/data.py", line 164, in get_datasets
rank4:     raw_datasets = mix_datasets(dataset_mixer, splits=splits, shuffle=shuffle)
rank4:   File "/training-data/huangxing/software/SPPO/sppo/alignment/data.py", line 189, in mix_datasets
rank4:     dataset = load_dataset(ds, split=split)
rank4:   File "/training-data/software/miniconda3/envs/mcts/lib/python3.11/site-packages/datasets/load.py", line 2129, in load_dataset
rank4:     builder_instance = load_dataset_builder(
rank4:   File "/training-data/software/miniconda3/envs/mcts/lib/python3.11/site-packages/datasets/load.py", line 1815, in load_dataset_builder
rank4:     dataset_module = dataset_module_factory(
rank4:   File "/training-data/software/miniconda3/envs/mcts/lib/python3.11/site-packages/datasets/load.py", line 1512, in dataset_module_factory
rank4:     raise e1 from None
rank4:   File "/training-data/software/miniconda3/envs/mcts/lib/python3.11/site-packages/datasets/load.py", line 1468, in dataset_module_factory
rank4:     raise ConnectionError(f"Couldn't reach '{path}' on the Hub ({type(e).__name__})")
rank4: ConnectionError: Couldn't reach 'synthetic_data_llama-3-8b-instruct-sppo-iter3_score' on the Hub (ConnectionError)