zhang-haojie closed this issue 3 months ago
Great question! This is just a difference between the pre-trained model (which is trained on a mix of different datasets) and finetuned models (which are each trained on a single dataset).
For models trained on a mix of datasets (e.g. in pretraining), make_interleaved_dataset returns the statistics for each dataset separately in a dict (so you must index like model.dataset_statistics[dataset_name]['action']), but make_single_dataset (used in finetuning) directly returns the dataset statistics (so you access them with model.dataset_statistics['action']).
If you finetune a model and inspect its dataset_statistics.json, you'll see the appropriate structure. Sorry for the confusion!
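In case it helps anyone hitting the same thing, here is a minimal sketch of a lookup that works with either layout. It uses plain dicts rather than a real checkpoint, and the helper name get_action_stats, the dataset name "bridge_dataset", and the mean/std values are all placeholders I made up for illustration, not part of the library.

```python
def get_action_stats(dataset_statistics, dataset_name=None):
    """Return the 'action' statistics from a flat or per-dataset dict.

    Hypothetical helper for illustration only; not a library function.
    """
    # Flat layout (single dataset, e.g. after finetuning):
    # 'action' sits at the top level.
    if "action" in dataset_statistics:
        return dataset_statistics["action"]
    # Nested layout (mix of datasets, e.g. pretraining):
    # the caller has to say which dataset they want.
    if dataset_name is None:
        raise KeyError(
            "statistics are keyed by dataset; pass one of "
            f"{sorted(dataset_statistics)}"
        )
    return dataset_statistics[dataset_name]["action"]


# Placeholder statistics mimicking the two structures described above.
finetuned_stats = {"action": {"mean": [0.0], "std": [1.0]}}
pretrained_stats = {
    "bridge_dataset": {"action": {"mean": [0.5], "std": [2.0]}},
}

flat = get_action_stats(finetuned_stats)
nested = get_action_stats(pretrained_stats, "bridge_dataset")
```

Calling the helper on the nested layout without a dataset name raises a KeyError that lists the available dataset names, which makes the mismatch easier to spot than a bare KeyError: 'action'.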
Thank you for your timely answer to my question. It is already a very good code design!
When I run 03_eval_finetuned.py, line 78 indexes model.dataset_statistics as a dictionary with ['action'], i.e. model.dataset_statistics["action"]. But when I check dataset_statistics.json in the checkpoints, action is an attribute of a specific dataset. Is there an error in the code here?