LYH-YF / MWPToolkit

MWPToolkit is an open-source framework for math word problem (MWP) solvers.
MIT License

How to use a model trained on the math-23k dataset to make Chinese predictions? #30

Closed: Godlikemandyy closed this issue 1 year ago

Godlikemandyy commented 1 year ago

I trained the model with the following command:

python run_mwptoolkit.py --model=MWPBert --dataset=math23k --equation_fix=prefix --task_type=single_equation --pretrained_model=./pretrain/chinese-bert-wwm-ext --test_step=5 --gpu_id=0 --train_batch_size=32 --epoch_nums=85 --learning_rate=3e-4 --encoding_learning_rate=3e-5 --vocab_level=char

How do I apply the trained model to make predictions on new questions?

Looking forward to your reply!

LYH-YF commented 1 year ago

We have not implemented this directly, but you can refer to the following code. If you run into errors, feel free to discuss them with me. @Godlikemandyy

# where trained parameters saved
trained_model_dir = "./xxxxxx/xxxx"

# init module
# different models need to initialize different datasets and dataloaders. MWPBert is a pretrain based model so using PretrainDataset and PretrainDataLoader
config = Config.load_from_pretrained(trained_model_dir)
dataset = PretrainDataset.load_from_pretrained(trained_model_dir)
dataloader = PretrainDataLoader(config, dataset)
model = MWPBert(config, dataset).to(config["device"])

# load parameter of model
model_file = os.path.join(trained_model_dir, 'model.pth')
check_pnt = torch.load(model_file, map_location=config["map_location"])
model.load_state_dict(check_pnt["model"])

# question
question = "一件上衣的售价是480元,比原价降低了20%,降价了多少元?"

# prediction  : maybe you can predict more samples at once according to 'test_batch_size'
batch = dataloader.build_batch_for_predict([{"question": question}])
token_logits, symbol_outputs, model_all_outputs = model.predict(batch)
prediction = symbol_outputs[0]

# equation
symbols = dataloader.convert_idx_2_symbol(prediction)
print(symbols)

Godlikemandyy commented 1 year ago

[Image: screenshot of the error]

I ran the code you provided and got the error shown above. Could you also list the modules that need to be imported to run the code? Thanks!

LYH-YF commented 1 year ago

Adding the line dataset.dataset_load() will fix the error above. The question also needs to be preprocessed, so I added several lines for that. A word segmentation tool is required; here I use jieba. @Godlikemandyy

import os

import jieba
import torch

from mwptoolkit.config import Config
from mwptoolkit.data.dataloader import PretrainDataLoader
from mwptoolkit.data.dataset import PretrainDataset
from mwptoolkit.model.Seq2Tree.mwpbert import MWPBert
from mwptoolkit.utils.preprocess_tool.number_transfer import number_transfer_single

# directory where the trained parameters are saved
trained_model_dir = "./trained_model/MWPBert-math23k"

# initialize modules
# Different models need different datasets and dataloaders. MWPBert is based on a pretrained model, so it uses PretrainDataset and PretrainDataLoader.
config = Config.load_from_pretrained(trained_model_dir)
dataset = PretrainDataset.load_from_pretrained(trained_model_dir)
dataset.dataset_load()
dataloader = PretrainDataLoader(config, dataset)
model = MWPBert(config, dataset).to(config["device"])

# load parameter of model
model_file = os.path.join(trained_model_dir, 'model.pth')
check_pnt = torch.load(model_file, map_location=config["map_location"])
model.load_state_dict(check_pnt["model"])

# question
question = "一件上衣的售价是480元,比原价降低了20%,降价了多少元?"
# preprocess
sequence = ' '.join(jieba.cut(question))
processed_data = number_transfer_single(
    data={"question": sequence, "equation": ""},
    mask_type=config['mask_symbol'], linear=True, vocab_level=config['vocab_level']
)

# prediction: it may be possible to predict more samples at once, depending on 'test_batch_size'
batch = dataloader.build_batch_for_predict([processed_data])
token_logits, symbol_outputs, model_all_outputs = model.predict(batch)
prediction = symbol_outputs[0]

# equation
symbols = dataloader.convert_idx_2_symbol(prediction)
print(symbols)

Godlikemandyy commented 1 year ago

I tried the code and it worked. Thanks for your guidance. I would also like to get the postfix expression of the equation; the output is currently a prefix expression. Which parameter should I modify to do that?

LYH-YF commented 1 year ago

You can use this function: @Godlikemandyy

from mwptoolkit.utils.preprocess_tool.equation_operator import from_prefix_to_postfix
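
For readers who want to see what such a conversion involves, here is a minimal standalone sketch. It is not the implementation of from_prefix_to_postfix, whose exact signature is not shown in this thread. The function name prefix_to_postfix, the operator set, and the NUM_0/NUM_1 placeholder tokens are assumptions made for illustration only.

```python
# Hypothetical operator set; the symbols a trained model emits may differ.
OPERATORS = {"+", "-", "*", "/", "^"}


def prefix_to_postfix(tokens):
    """Convert a prefix (Polish) token list into postfix (reverse Polish) order."""
    stack = []
    # Scan right to left: operands are pushed, and each binary operator
    # combines the two most recent sub-expressions.
    for token in reversed(tokens):
        if token in OPERATORS:
            if len(stack) < 2:
                raise ValueError("malformed prefix expression")
            left = stack.pop()
            right = stack.pop()
            stack.append(left + right + [token])
        else:
            stack.append([token])
    if len(stack) != 1:
        raise ValueError("malformed prefix expression")
    return stack[0]


# Illustrative token list (an assumption, not actual model output):
# prefix form of NUM_0 / (1 - NUM_1) - NUM_0
prefix = ["-", "/", "NUM_0", "-", "1", "NUM_1", "NUM_0"]
print(prefix_to_postfix(prefix))
# → ['NUM_0', '1', 'NUM_1', '-', '/', 'NUM_0', '-']
```

The sketch assumes every operator is binary; unary operators would need separate handling.
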
Godlikemandyy commented 1 year ago

@LYH-YF It's perfect and efficient! Thank you very much! I have another question. Suppose I have question text that contains a mathematical formula, for example: 应纳房产税=50×12%×8/12+600×(1-20%)×1.2%×4/12=5.92(万元)。 (In English: property tax payable = 50×12%×8/12+600×(1-20%)×1.2%×4/12 = 5.92, in units of 10,000 yuan.) I want to extract the formula, calculate its result, and compare that with the result stated in the text (5.92). Can MWPBert or MWPToolkit do this? If so, how, and which dataset should the model be trained on? I am looking forward to your reply. Thank you very much!

LYH-YF commented 1 year ago

@Godlikemandyy Regarding "My purpose is to take out the mathematical formula inside and calculate the result, then compare it with the result in the formula": MWPBert and MWPToolkit may not support this, but I think it is just a programming problem.

1) Take out: match everything between the two equals signs ("=").
2) Calculate: if the formula is a string, replace symbols such as "×" and "%" with their equivalents in the programming language, e.g. "*" and "/100"; calling eval() is then a simple way to calculate the result.
3) Compare: it is better not to use a == b to compare two floating-point numbers. Use abs(a - b) < c instead, where c is the allowable calculation error.
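
The three steps above can be sketched in Python roughly as follows. This is only a minimal illustration and not part of MWPToolkit: the function name check_formula, the default tolerance of 0.01, and the assumption that the text has the shape name=expression=result are hypothetical choices made for this example.

```python
import re


def check_formula(text, tolerance=0.01):
    """Return (computed, stated, matches) for text shaped like 'name=expression=result'."""
    # 1) Take out: the expression sits between the first two equals signs.
    parts = text.split("=")
    if len(parts) < 3:
        raise ValueError("expected text of the form 'name=expression=result'")
    expression, result_part = parts[1], parts[2]

    # 2) Calculate: map the symbols to Python operators.
    normalized = (
        expression.replace("×", "*")
        .replace("÷", "/")
        .replace("(", "(")
        .replace(")", ")")
    )
    # Rewrite each percentage such as 12% as (12/100).
    normalized = re.sub(r"(\d+(?:\.\d+)?)%", r"(\1/100)", normalized)
    # Only allow plain arithmetic characters before calling eval().
    if not re.fullmatch(r"[0-9.+\-*/() ]+", normalized):
        raise ValueError("unsupported characters in expression: " + normalized)
    computed = eval(normalized, {"__builtins__": {}}, {})

    # The stated result is the leading number after the second equals sign.
    match = re.match(r"\s*(-?\d+(?:\.\d+)?)", result_part)
    if match is None:
        raise ValueError("no numeric result found after the second '='")
    stated = float(match.group(1))

    # 3) Compare with a tolerance instead of ==.
    return computed, stated, abs(computed - stated) < tolerance


text = "应纳房产税=50×12%×8/12+600×(1-20%)×1.2%×4/12=5.92(万元)。"
computed, stated, matches = check_formula(text)
print(round(computed, 4), stated, matches)
```

The character whitelist is there because eval() executes arbitrary code, so the input should be restricted before evaluation. The standard library function math.isclose is an alternative to the manual abs(a - b) < c comparison.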

Godlikemandyy commented 1 year ago

@LYH-YF I see. Thank you very much for your patience!