Open MozerWang opened 1 year ago
You don't need to create two models at the same time. You can create one model and then load different parameters into it twice.
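OK, the above was just a simple example. The problem I actually ran into is that an error is raised when the model is trained iteratively inside a loop. Below is pseudocode that illustrates how the model is run: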
a = rocketqa.load(teacher model)
for iter_number in range(0, 3):
    newdataset = inference(a, unlabelset)
    model.config = model.train(a, newdataset)
    a = rocketqa.load(model.config)
    evaluate(a)
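The teacher model is loaded outside the loop, and then the iteration begins (get pseudo-labels -> train a new model -> load the new model -> evaluate -> get pseudo-labels). The first pass through the loop (the first round of training) raises no error, but the second pass fails with: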
File "/u01/miniconda3/envs/bankqa/lib/python3.8/site-packages/paddle/fluid/layers/io.py", line 440, in _py_reader
feed_queue = core.init_lod_tensor_blocking_queue(var, capacity, False)
RuntimeError: (AlreadyExists) LoDTensorBlockingQueueHolder::InitOnce() can only be called once
[Hint: Expected queue_ == nullptr, but received queue_ != nullptr.] (at /paddle/paddle/fluid/operators/reader/lod_tensor_blocking_queue.h:207)
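The hint reads like an init-once guard: a holder accepts one queue and refuses a second. The toy Python sketch below mimics that behavior purely for illustration. `QueueHolder`, `init_once`, and `make_reader` are hypothetical names, not Paddle's API, and it assumes (unconfirmed here) that the second load reuses the same reader variable name in process-wide state.

```python
class QueueHolder:
    """Toy stand-in for an init-once holder (hypothetical, not Paddle code)."""

    def __init__(self):
        self._queue = None

    def init_once(self, capacity):
        # Mirrors the hint: the queue must still be unset when this is called.
        if self._queue is not None:
            raise RuntimeError("init_once() can only be called once")
        self._queue = [None] * capacity
        return self._queue


# A registry keyed by variable name plays the role of process-wide state.
holders = {}


def make_reader(name, capacity):
    # Reuse the holder registered under this name, or register a new one.
    holder = holders.setdefault(name, QueueHolder())
    return holder.init_once(capacity)
```

In this toy model, calling `make_reader` a second time with the same name raises `RuntimeError`, while a call with a new name succeeds, which would be consistent with the first iteration working and the second failing.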
Below is my real code; the idea is the same as in the pseudocode above:
import logging
import os

# Load the models
dual_encoder = load_retriever_model(de_model, device_id, batch_size)
cross_encoder = load_retriever_model(ce_model, device_id, batch_size)
# Evaluate the teacher model
logging.info(f"Evaluating base zero-shot model:{de_model} performance on test set")
prediction = get_zero_shot_predictions(dual_encoder, cross_encoder, data_file=data_file,
                                       index_file='testindex', topk=20, input_data=test_data)
evaluation = evaluate_retriever_performance(prediction)
for iter_number in range(1, num_iterations+1):
    # Run inference on the unlabeled data
    logging.info(f"Inferring with {de_model} on unlabeled elements:{unlabel_data_file}")
    prediction = get_zero_shot_predictions(dual_encoder, cross_encoder, data_file=unlabel_data_file,
                                           index_file=f'{iter_number}_unlabelindex', topk=100, input_data=train_data)
    logging.info("Done inferring zero-shot model on unlabeled elements")
    # Collect the pseudo-labels
    self_training_set = get_selftraining_dataset(prediction, unlabel_data_file, data_path, iter_number)
    logging.info(f"Done collecting pseudo-labeled elements for self-training iteration {iter_number}. "
                 f"The pseudo-labeled texts are saved in {self_training_set}")
    # Train on the pseudo-labels
    # We use the updated pseudo-labeled set from this iteration to fine-tune the *base* entailment model
    logging.info(f"Fine-tuning model:{de_model} on pseudo-labeled texts")
    finetuned_model_path = finetune_entailment_model(dual_encoder, self_training_set, iter_number,
                                                     learning_rate=1e-5, save_steps=5000, num_epochs=20)
    logging.info(f"Done fine-tuning. Model for self-training iteration {iter_number} "
                 f"saved to {finetuned_model_path}.")
    # Load and evaluate the newly trained model
    de_model = os.path.join(finetuned_model_path, "config.json")
    dual_encoder = load_retriever_model(de_model, device_id, batch_size)
    logging.info(f'iteration {iter_number}: evaluating model {de_model} performance on test set')
    test_preds = get_zero_shot_predictions(dual_encoder, cross_encoder, data_file=data_file,
                                           index_file=f'{iter_number}_testindex', topk=20, input_data=test_data)
    evaluation = evaluate_retriever_performance(test_preds)
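One possible workaround, assuming the error comes from state that persists inside the Python process, is to run each iteration in a fresh interpreter so that nothing carries over from the previous one. The sketch below uses only the standard library. `run_isolated` is a hypothetical helper, and the `print` snippet is a placeholder for a script that would call the loading and training helpers above. Whether this actually avoids the Paddle error has not been verified.

```python
import subprocess
import sys


def run_isolated(code, timeout=None):
    """Run a Python snippet in a fresh interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child process fails
        timeout=timeout,
    )
    return result.stdout


# Each iteration gets its own process, so process-wide state starts empty.
outputs = [run_isolated(f"print('iteration', {i})") for i in range(1, 4)]
```

The trade-off is that each child process has to reload the model from disk, so the path to the fine-tuned `config.json` would need to be passed between iterations, for example on the command line or through a small file.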
Describe the Bug
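Problem: I am using rocketqa for self-training, so the same model has to be loaded multiple times during the iterations. I first run inference to obtain pseudo-labels, then fine-tune on those pseudo-labels. With the wrapper logic I wrote, this process loads the same model twice, but the Paddle framework probably does not support this, so an error is raised. The code is as follows: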
The error is as follows:
Additional Supplementary Information
No response