@SunYuMs Hello. RecBole provides the parameters save_dataset (bool) and save_dataloaders (bool), which control whether the filtered dataset and the split dataloaders are saved, respectively. However, this saving feature cannot export the split dataset as .inter files; it writes everything out as a single .pth file.
save_dataset (bool): Whether to save the filtered dataset. If True, the filtered dataset is saved; otherwise it is not. Defaults to False.
dataset_save_path (str): The path of the saved dataset. The tool will attempt to load the dataset from this path. If it is None, the tool will try to load the dataset from {checkpoint_dir}/{dataset}-{dataset_class_name}.pth. If the config of the saved dataset does not match the current config, the tool will create the dataset from scratch. Defaults to None.
save_dataloaders (bool): Whether to save the split dataloaders. If True, the split dataloaders are saved; otherwise they are not. Defaults to False.
dataloaders_save_path (str): The path of the saved dataloaders. The tool will attempt to load the dataloaders from this path. If it is None, the tool will try to load the dataloaders from {checkpoint_dir}/{dataset}-for-{model}-dataloader.pth. If the config of the saved dataloaders does not match the current config, the tool will create the dataloaders from scratch. Defaults to None.
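To make the path fallback concrete, here is a minimal sketch of the rule as the parameter descriptions state it. It is not RecBole's own code: the helper names are hypothetical, and the values "saved", "ml-100k", "Dataset", and "BPR" are only example inputs.

```python
# Illustrative only: these helpers mirror the documented path templates,
# they are not functions from the RecBole library.

def default_dataset_path(checkpoint_dir, dataset, dataset_class_name):
    """Fallback used when dataset_save_path is None (per the parameter docs)."""
    return f"{checkpoint_dir}/{dataset}-{dataset_class_name}.pth"


def default_dataloaders_path(checkpoint_dir, dataset, model):
    """Fallback used when dataloaders_save_path is None (per the parameter docs)."""
    return f"{checkpoint_dir}/{dataset}-for-{model}-dataloader.pth"


def resolve_path(explicit_path, default_path):
    """Prefer the user-supplied path; otherwise use the documented default."""
    return default_path if explicit_path is None else explicit_path


# The four parameters discussed in this reply, with saving switched on.
config_dict = {
    "save_dataset": True,
    "dataset_save_path": None,
    "save_dataloaders": True,
    "dataloaders_save_path": None,
}

dataset_path = resolve_path(
    config_dict["dataset_save_path"],
    default_dataset_path("saved", "ml-100k", "Dataset"),
)
dataloaders_path = resolve_path(
    config_dict["dataloaders_save_path"],
    default_dataloaders_path("saved", "ml-100k", "BPR"),
)
print(dataset_path)      # saved/ml-100k-Dataset.pth
print(dataloaders_path)  # saved/ml-100k-for-BPR-dataloader.pth
```

Either way, both files are .pth files holding preprocessed objects, not .inter files.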
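During data processing, the function _remap_ID_all maps the external tokens in the original dataset to internal IDs. In other words, the dataset produced by the built-in saving feature is the preprocessed result and differs from the original .inter file, and the contents of the three dataloaders train_data, valid_data, and test_data are not very intuitive to inspect either. RecBole's current interface does not support exporting the dataset as three .inter files, so users need to add their own code to do this. Thank you for your interest in the RecBole project!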
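As a rough starting point for such user-side code, the sketch below maps internal IDs back to external tokens and writes one tab-separated .inter file per split. It deliberately does not import RecBole: the function names, the {name}.{split}.inter file naming, and the toy ID-to-token lists are all assumptions for illustration. It also assumes you have already extracted (user, item) internal-ID pairs from each split and have the ID-to-token tables at hand, with index 0 reserved for a padding token.

```python
import csv
import os


def write_inter(path, rows, id2user, id2item):
    """Write (user_id, item_id) internal-ID pairs as a tab-separated .inter file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        # Header in the "field:type" style used by .inter atomic files.
        writer.writerow(["user_id:token", "item_id:token"])
        for uid, iid in rows:
            # Map internal IDs back to the original external tokens.
            writer.writerow([id2user[uid], id2item[iid]])


def export_splits(out_dir, name, splits, id2user, id2item):
    """Write one file per split and return a {split: path} mapping."""
    os.makedirs(out_dir, exist_ok=True)
    paths = {}
    for split, rows in splits.items():
        path = os.path.join(out_dir, f"{name}.{split}.inter")
        write_inter(path, rows, id2user, id2item)
        paths[split] = path
    return paths


if __name__ == "__main__":
    # Toy lookup tables (assumed layout: index 0 is a padding token).
    id2user = ["[PAD]", "u_a", "u_b"]
    id2item = ["[PAD]", "i_x", "i_y"]
    splits = {
        "train": [(1, 1), (1, 2)],
        "valid": [(2, 1)],
        "test": [(2, 2)],
    }
    print(export_splits("exported", "toy", splits, id2user, id2item))
```

Additional columns such as ratings or timestamps would need their own header entries and would be written the same way.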