PaddlePaddle / RocketQA

🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.
Apache License 2.0

How can I load the same model repeatedly? Or how can I release the model and then load it again? #95

Closed MozerWang closed 1 year ago

MozerWang commented 1 year ago

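RocketQA is very easy to use, thanks to the team for their hard work! I am using RocketQA for self-training, so the same model has to be loaded multiple times during the iterations: first I run inference to obtain pseudo-labels, then I fine-tune with those pseudo-labels. With the way my wrapper is written, this process loads the same model twice, but the Paddle framework does not seem to support that, so it raises an error: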

Traceback (most recent call last):
  File "/u01/bankQA/self_training/test_rkqa.py", line 297, in <module>
    cross_encoder = rocketqa.load_model(**ce_conf)
  File "/u01/miniconda3/envs/bankqa/lib/python3.8/site-packages/rocketqa/rocketqa.py", line 122, in load_model
    encoder = CrossEncoder(**encoder_conf)
  File "/u01/miniconda3/envs/bankqa/lib/python3.8/site-packages/rocketqa/encoder/cross_encoder.py", line 90, in __init__
    self.test_pyreader, self.graph_vars = create_predict_model(
  File "/u01/miniconda3/envs/bankqa/lib/python3.8/site-packages/rocketqa/model/cross_encoder_predict.py", line 39, in create_predict_model
    pyreader = fluid.layers.py_reader(
  File "/u01/miniconda3/envs/bankqa/lib/python3.8/site-packages/paddle/fluid/layers/io.py", line 723, in py_reader
    return _py_reader(
  File "/u01/miniconda3/envs/bankqa/lib/python3.8/site-packages/paddle/fluid/layers/io.py", line 440, in _py_reader
    feed_queue = core.init_lod_tensor_blocking_queue(var, capacity, False)
RuntimeError: (AlreadyExists) LoDTensorBlockingQueueHolder::InitOnce() can only be called once
  [Hint: Expected queue_ == nullptr, but received queue_ != nullptr.] (at /paddle/paddle/fluid/operators/reader/lod_tensor_blocking_queue.h:207)

Here is a simple code example. When it is run, the last statement raises the error.

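import rocketqa

ce_model = "zh_dureader_ce_v2"
ce_conf = {
    "model": ce_model,
    "use_cuda": True,
    "device_id": 0,
    "batch_size": 32,
}
cross_encoder = rocketqa.load_model(**ce_conf)
# Loading the same model a second time in the same process raises the RuntimeError above.
cross_encoder = rocketqa.load_model(**ce_conf)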

How can I load the same model repeatedly? How can I release the model and then load it again?

procedure2012 commented 1 year ago

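RocketQA currently does not account for the same model being loaded repeatedly. Loading it again fails because some components end up with duplicate names. If you want to load it repeatedly, you will probably need to modify the internal code by hand.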
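If the wrapper only needs the same configuration twice, one way to sidestep the second load is to cache the first instance and reuse it. The following is a minimal sketch, not part of RocketQA: `load_once` and `_MODEL_CACHE` are hypothetical names, and the loader is passed in as an argument so the snippet makes no assumptions about RocketQA internals.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical cache of already-loaded models, keyed by their configuration.
_MODEL_CACHE: Dict[Tuple, Any] = {}


def load_once(loader: Callable[..., Any], **conf: Any) -> Any:
    """Return the model for `conf`, calling `loader` only on first use.

    Assumes every value in `conf` is hashable (str, bool, int, ...).
    """
    # Sort the items so that keyword order does not change the cache key.
    key = tuple(sorted(conf.items()))
    if key not in _MODEL_CACHE:
        _MODEL_CACHE[key] = loader(**conf)
    return _MODEL_CACHE[key]
```

With the configuration from the question, the call would look like `load_once(rocketqa.load_model, **ce_conf)`. This only avoids a duplicate load of an identical configuration. Loading a different checkpoint in the same process still calls the loader again and may hit the same error.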

MozerWang commented 1 year ago

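Thanks for the reply. I looked at the fluid code and it would be fairly cumbersome to change, so I ended up moving the loop into a shell script, which achieves this by launching the Python program multiple times.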
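The same "one process per step" idea can also be driven from Python instead of a shell script. This is a minimal sketch under assumptions: `run_rounds` is a hypothetical helper, and the script names in the usage comment (`infer.py`, `finetune.py`) are placeholders for whatever scripts perform inference and fine-tuning. Each step runs in a fresh interpreter, so each process loads the model only once.

```python
import subprocess
import sys
from typing import Sequence


def run_rounds(step_args: Sequence[Sequence[str]], rounds: int) -> int:
    """Run every step in its own Python process, `rounds` times over.

    Each entry of `step_args` is the argument list passed to a new
    interpreter. Returns the number of processes that were launched.
    Raises subprocess.CalledProcessError if any step exits non-zero.
    """
    launched = 0
    for _ in range(rounds):
        for args in step_args:
            # A fresh process starts with no previously loaded model.
            subprocess.run([sys.executable, *args], check=True)
            launched += 1
    return launched


# Hypothetical usage: three self-training rounds of inference followed
# by fine-tuning, with placeholder script names.
# run_rounds([["infer.py"], ["finetune.py"]], rounds=3)
```

Because `check=True` is set, a failing step stops the loop immediately rather than letting later rounds continue on stale pseudo-labels.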

Duanexiao commented 1 year ago

Why was this issue closed? This should be treated as a feature request.