Open dogydev opened 5 years ago
I have successfully passed the data through the model, though I got an output of tensors from the inference in trainer.py. How should I decode these tensors into a string that contains the answer? Thanks.
Could you please tell me how to pass data through the model? I trained the model with the files "cmrc2018_train.json" and "cmrc2018_dev.json" and saved it in a folder, but I don't know how to pass my own article and question into the model and get the correct answer. (The question is given by the user and the article comes from the internet, and I want to get a more concrete answer from the article.)
To pass the question and context into the model, use the inference function in bert_coqa.py.

Code:

from sogou_mrc.dataset.coqa import CoQAReader, CoQAEvaluator
from sogou_mrc.libraries.BertWrapper import BertDataHelper
from sogou_mrc.data.batch_generator import BatchGenerator
from sogou_mrc.data.vocabulary import Vocabulary
from sogou_mrc.model.bert_coqa import BertCoQA
import logging
import sys

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')

coqa_reader = CoQAReader(-1)
data_folder = 'data/'
eval_filename = 'input.json'
vocab = Vocabulary(do_lowercase=True)

# Read the CoQA-format input file (see Input.json below)
eval_data = coqa_reader.read(data_folder + eval_filename, 'dev')
print(eval_data)

vocab.build_vocab(eval_data)

evaluator = CoQAEvaluator(data_folder + eval_filename)
bert_dir = 'uncased'
bert_data_helper = BertDataHelper(bert_dir)

# Convert the examples into BERT inputs
eval_data = bert_data_helper.convert(eval_data, data='coqa')

eval_batch_generator = BatchGenerator(vocab, eval_data, training=False, batch_size=1,
                                      additional_fields=['input_ids', 'segment_ids', 'input_mask', 'start_position',
                                                         'end_position', 'question_mask', 'rationale_mask', 'yes_mask',
                                                         'extractive_mask', 'no_mask', 'unk_mask', 'qid'])

model = BertCoQA(bert_dir=bert_dir, answer_verification=True)

# The raw data and vocabulary are no longer needed once the batch generator exists
del eval_data, vocab
out = model.evaluate(eval_batch_generator, evaluator)
print(out)
Input.json:

{
  "version": "1.0",
  "data": [
    {
      "source": "mctest",
      "id": "3dr23u6we5exclen4th8uq9rb42tel",
      "filename": "mc160.test.41",
      "story": "Once upon a time, in a barn near a farm house, there lived a little white kitten named Cotton. Cotton lived high up in a nice warm place above the barn where all of the farmer's horses slept. But Cotton wasn't alone in her little home above the barn, oh no. She shared her hay bed with her mommy and 5 other sisters. All of her sisters were cute and fluffy, like Cotton. But she was the only white one in the bunch. The rest of her sisters were all orange with beautiful white tiger stripes like Cotton's mommy. Being different made Cotton quite sad. She often wished she looked like the rest of her family. So one day, when Cotton found a can of the old farmer's orange paint, she used it to paint herself like them. When her mommy and sisters found her they started laughing. \n\n\"What are you doing, Cotton?!\" \n\n\"I only wanted to be more like you\". \n\nCotton's mommy rubbed her face on Cotton's and said \"Oh Cotton, but your fur is so pretty and special, like you. We would never want you to be any other way\". And with that, Cotton's mommy picked her up and dropped her into a big bucket of water. When Cotton came out she was herself again. Her sisters licked her face until Cotton's fur was all all dry. \n\n\"Don't ever do that again, Cotton!\" they all cried. \"Next time you might mess up that pretty white fur of yours and we wouldn't want that!\" \n\nThen Cotton thought, \"I change my mind. I like being special\".",
      "questions": [
        { "input_text": "What is the polynomial theorem?", "turn_id": 1 }
      ],
      "answers": [
        { "span_start": 59, "span_end": 93, "span_text": "a little white kitten named Cotton", "input_text": "white", "turn_id": 1 }
      ]
    }
  ]
}
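For a user-supplied article and question, a file with the same layout as Input.json can be generated rather than written by hand. The following is only a sketch: `build_coqa_input` is a hypothetical helper, not part of SMRCToolkit, and the `source`, `id`, and `filename` values are made up for illustration. It mirrors the sample above by including one answer entry per question, filled with empty placeholders; whether the toolkit's reader and evaluator accept empty placeholders is an untested assumption.

```python
import json
from typing import Dict, List


def build_coqa_input(story: str, questions: List[str],
                     story_id: str = "user-story-0") -> Dict:
    """Build a dict with the same fields as the Input.json sample (hypothetical helper)."""
    turns = []
    answers = []
    for turn_id, question in enumerate(questions, start=1):
        turns.append({"input_text": question, "turn_id": turn_id})
        # Placeholder answer: the true answer is unknown at inference time.
        answers.append({
            "span_start": 0,
            "span_end": 0,
            "span_text": "",
            "input_text": "",
            "turn_id": turn_id,
        })
    return {
        "version": "1.0",
        "data": [{
            "source": "user",          # made-up value
            "id": story_id,            # made-up value
            "filename": "user_input",  # made-up value
            "story": story,
            "questions": turns,
            "answers": answers,
        }],
    }


payload = build_coqa_input(
    "Once upon a time, there lived a little white kitten named Cotton.",
    ["What was the kitten called?"],
)
# json.dumps handles quote escaping inside the story text.
text = json.dumps(payload, ensure_ascii=False, indent=2)
```

Writing `text` to `data/input.json` would then give the script above a file to read.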
The answer comes out as tensors and I am asking how to decode them.
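One possible way to decode, sketched under a strong assumption: that the tensors are per-token start and end scores over the BERT input, as is common for extractive question answering. The thread does not confirm what the toolkit actually returns, and the yes/no/unknown masks in the batch fields suggest there may be additional outputs that this sketch ignores. `best_span` and `decode_answer` are hypothetical helpers, not SMRCToolkit functions, and the scores and tokens below are invented for the example. Tensors would first need converting to plain Python lists.

```python
from typing import List, Sequence, Tuple


def best_span(start_scores: Sequence[float], end_scores: Sequence[float],
              max_answer_len: int = 30) -> Tuple[int, int]:
    """Return the (start, end) token indices with the highest combined score."""
    best = (0, 0)
    best_score = float("-inf")
    for start, start_score in enumerate(start_scores):
        # Only consider ends at or after the start, within the length limit.
        last = min(start + max_answer_len, len(end_scores))
        for end in range(start, last):
            score = start_score + end_scores[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


def decode_answer(tokens: List[str], start_scores: Sequence[float],
                  end_scores: Sequence[float], max_answer_len: int = 30) -> str:
    """Turn the best-scoring token span back into a string."""
    start, end = best_span(start_scores, end_scores, max_answer_len)
    words: List[str] = []
    for piece in tokens[start:end + 1]:
        if piece.startswith("##") and words:
            # WordPiece continuation: glue it onto the previous piece.
            words[-1] += piece[2:]
        else:
            words.append(piece[2:] if piece.startswith("##") else piece)
    return " ".join(words)


# Invented example: WordPiece tokens of a question plus a short passage.
tokens = ["[CLS]", "who", "?", "[SEP]", "a", "little", "white",
          "kit", "##ten", "named", "cotton", "[SEP]"]
start_scores = [0.1, 0.0, 0.0, 0.0, 5.0, 0.2, 0.3, 0.1, 0.0, 0.1, 0.4, 0.0]
end_scores = [0.0, 0.0, 0.0, 0.0, 0.1, 0.2, 0.3, 0.1, 0.2, 0.1, 6.0, 0.0]
answer = decode_answer(tokens, start_scores, end_scores)
```

The result is a lowercased, whitespace-joined string because it is rebuilt from tokens; recovering the original casing and spacing would need a mapping from token positions back to character offsets in the story.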
Thank you very much.
------------------ Original message ------------------ From: "dogydev"notifications@github.com; Sent: Friday, July 12, 2019, 7:27 PM; To: "sogou/SMRCToolkit"SMRCToolkit@noreply.github.com; Cc: "thoucsin"1598244350@qq.com; "Comment"comment@noreply.github.com; Subject: Re: [sogou/SMRCToolkit] Input text and Question and get Answer (#27)