ZhuiyiTechnology / roformer-sim

An upgraded version of SimBERT (SimBERTv2)!
Apache License 2.0
436 stars, 73 forks

How do I use my own dataset for data augmentation? #36

Open Josoope opened 10 months ago

Josoope commented 10 months ago

data_path = './glue/data/'
datasets_1 = []
for task_name in ['ATEC', 'BQ', 'LCQMC', 'PAWSX', 'STS-B', 'SOHU21-SSB']:
    for f in ['train', 'valid']:
        threshold = 2.5 if task_name == 'STS-B' else 0.5
        filename = '%s%s/%s.%s.data' % (data_path, task_name, task_name, f)
        datasets_1 += load_data_1(filename, threshold)

I don't quite understand how this code would read my own dataset. Also, does the data have to be in the format (text 1, text 2, label)?
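
For reference, here is a minimal sketch of what loading a custom dataset through the same loop might look like. It rests on assumptions not confirmed in this thread: that each line of a data file is tab-separated (text 1, text 2, label), and that a pair counts as positive when its label exceeds the threshold. `load_pairs`, `load_all`, and the task name `MYDATA` are hypothetical names used for illustration; they are not the repository's `load_data_1`. Only the file layout `<data_path>/<task>/<task>.<split>.data` is taken from the snippet above.

```python
import os


def load_pairs(filename, threshold=0.5):
    """Hypothetical loader (assumed format: text1<TAB>text2<TAB>label per line).

    Returns a list of (text1, text2, 0 or 1), where the label is 1 when the
    numeric score in the file is greater than `threshold`.
    """
    pairs = []
    with open(filename, encoding='utf-8') as f:
        for line in f:
            fields = line.rstrip('\n').split('\t')
            if len(fields) != 3:
                continue  # skip blank or malformed lines
            text1, text2, score = fields
            pairs.append((text1, text2, int(float(score) > threshold)))
    return pairs


def load_all(data_path, task_names, splits=('train', 'valid'), threshold=0.5):
    """Collect pairs from <data_path>/<task>/<task>.<split>.data for each task."""
    dataset = []
    for task_name in task_names:
        for split in splits:
            filename = os.path.join(
                data_path, task_name, '%s.%s.data' % (task_name, split)
            )
            dataset += load_pairs(filename, threshold)
    return dataset
```

Under these assumptions, placing files at `./glue/data/MYDATA/MYDATA.train.data` and `./glue/data/MYDATA/MYDATA.valid.data` and calling `load_all('./glue/data/', ['MYDATA'])` would return the combined list of labeled pairs.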