PaddlePaddle / PaddleHub

Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving) [Security hardening in progress; interaction is paused, please be patient]
https://www.paddlepaddle.org.cn/hub
Apache License 2.0
12.72k stars 2.08k forks

Results differ on every run of the program #1138

Closed changchend closed 3 years ago

changchend commented 3 years ago

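I am using PaddleHub 1.8.1 and had previously saved a checkpoint to ckpt_ernie_pointwise_matching.

text_pairs = [["这家餐厅很好吃", "这部电影真的很差劲"]]

print(pointwise_matching_task.predict(
    data=text_pairs,
    max_seq_len=128,
    label_list=dataset.get_labels(),
    return_result=True,
    load_best_model=True,
    accelerate_mode=False))

Each time I run the whole program (not just this prediction call), the prediction result changes and so do the probabilities.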

changchend commented 3 years ago

(['0'], [[0.5246255993843079, 0.47537437081336975]])
(['0'], [[0.6127048134803772, 0.387295126914978]])
(['1'], [[0.26533350348472595, 0.7346665263175964]])

changchend commented 3 years ago

[2020-12-23 16:08:12,333] [ INFO] - Installing chinese-bert-wwm-ext module
[2020-12-23 16:08:12,450] [ INFO] - Module chinese-bert-wwm-ext already installed in /root/.paddlehub/modules/chinese_bert_wwm_ext
W1223 16:08:17.359491 18447 device_context.cc:252] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 10.2, Runtime API Version: 10.0
W1223 16:08:17.365764 18447 device_context.cc:260] device: 0, cuDNN Version: 7.6.
[2020-12-23 16:08:49,074] [ INFO] - Checkpoint dir: ckpt_ernie_pointwise_matching
[2020-12-23 16:08:49,571] [ INFO] - PaddleHub predict start
[2020-12-23 16:08:49,571] [ INFO] - Load the best model from ckpt_ernie_pointwise_matching/best_model
/home/software/Anaconda/anaconda3/envs/qy/lib/python3.7/site-packages/paddle/fluid/executor.py:1093: UserWarning: There are no operators in the program to be executed. If you pass Program manually, please use fluid.program_guard to ensure the current Program is being used.
  warnings.warn(error_info)
[2020-12-23 16:08:51,065] [ INFO] - Try loading checkpoint from ckpt_ernie_pointwise_matching/ckpt.meta
[2020-12-23 16:08:51,065] [ INFO] - PaddleHub model checkpoint not found, start from scratch...
[2020-12-23 16:08:51,533] [ INFO] - PaddleHub predict finished.
(['1'], [[0.26533350348472595, 0.7346665263175964]])

changchend commented 3 years ago

import paddlehub as hub
from paddlehub.dataset.base_nlp_dataset import TextMatchingDataset


class COVID19Competition(TextMatchingDataset):
    def __init__(self, tokenizer=None, max_seq_len=None):
        base_path = 'COVID19_sim_competition'
        super(COVID19Competition, self).__init__(
            is_pair_wise=False,  # text matching type: whether it is pairwise
            base_path=base_path,
            train_file="/data/qy/hub/COVID19_sim_competition/train.txt",  # file path relative to base_path
            dev_file="/data/qy/hub/COVID19_sim_competition/dev.txt",  # file path relative to base_path
            train_file_with_header=True,
            dev_file_with_header=True,
            label_list=["0", "1"],
            tokenizer=tokenizer,
            max_seq_len=max_seq_len)


module = hub.Module(name="chinese-bert-wwm-ext")

# The pointwise task needs: query, title_left (2 slots)
inputs, outputs, program = module.context(
    trainable=True, max_seq_len=128, num_slots=2)

tokenizer = hub.BertTokenizer(
    vocab_file=module.get_vocab_path(), tokenize_chinese_chars=True)

dataset = COVID19Competition(tokenizer=tokenizer, max_seq_len=128)

strategy = hub.L2SPFinetuneStrategy(
    learning_rate=5e-5,
    optimizer_name="adam",
    regularization_coeff=1e-3)

config = hub.RunConfig(
    log_interval=1000,
    eval_interval=3000,
    use_cuda=True,
    num_epoch=1,
    batch_size=32,
    checkpoint_dir='ckpt_ernie_pointwise_matching',
    strategy=strategy)

# Build the transfer network using ERNIE's token-level output
query = outputs["sequence_output"]
title = outputs['sequence_output_2']

# Create the pointwise text matching task
pointwise_matching_task = hub.PointwiseTextMatchingTask(
    dataset=dataset,
    query_feature=query,
    title_feature=title,
    tokenizer=tokenizer,
    config=config,
    metrics_choices=['f1'])

# Sample prediction data
text_pairs = [["这家餐厅很好吃", "这部电影真的很差劲"]]

print(pointwise_matching_task.predict(
    data=text_pairs,
    max_seq_len=128,
    label_list=dataset.get_labels(),
    return_result=True,
    load_best_model=True,
    accelerate_mode=False))

KPatr1ck commented 3 years ago

[2020-12-23 16:08:51,065] [ INFO] - Try loading checkpoint from ckpt_ernie_pointwise_matching/ckpt.meta
[2020-12-23 16:08:51,065] [ INFO] - PaddleHub model checkpoint not found, start from scratch...

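These log lines show that the model was not loaded successfully, so in the model you are running the downstream network parameters are randomly initialized. I suggest checking whether the checkpoint path is filled in correctly.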
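A quick way to act on this suggestion is to check, before calling predict, where the checkpoint directory actually resolves to and whether the two entries named in the log exist there. The sketch below uses only the Python standard library; the helper name describe_checkpoint_dir is hypothetical and not part of PaddleHub. The entry names ckpt.meta and best_model are taken from the log above, and which of them PaddleHub strictly requires is an assumption to verify against your installed version.

```python
import os


def describe_checkpoint_dir(checkpoint_dir):
    """Report where a checkpoint directory resolves to and what it contains.

    Hypothetical helper (not a PaddleHub API). The entry names "ckpt.meta"
    and "best_model" come from the log lines in this thread.
    """
    # A relative path is resolved against the current working directory,
    # so the result depends on where the script is launched from.
    resolved = os.path.abspath(checkpoint_dir)
    return {
        "resolved_dir": resolved,
        "dir_exists": os.path.isdir(resolved),
        "has_ckpt_meta": os.path.isfile(os.path.join(resolved, "ckpt.meta")),
        "has_best_model": os.path.exists(os.path.join(resolved, "best_model")),
    }


if __name__ == "__main__":
    # Same relative directory name as checkpoint_dir in the RunConfig above.
    report = describe_checkpoint_dir("ckpt_ernie_pointwise_matching")
    for key, value in report.items():
        print(f"{key}: {value}")
```

If resolved_dir is not the directory where training wrote its output, for example because the script is now started from a different working directory, passing an absolute path as checkpoint_dir is one way to rule that cause out.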

changchend commented 3 years ago

OK, thank you.