Closed datalee closed 3 years ago
Hey sorry for the late reply, I actually had to dig into the code a bit. So there are two ways to load models.
The easiest is to load a QAInferencer with your model name supplied; it then downloads and converts everything you need. You can do that with:
infer = QAInferencer.load(model_name_or_path="wptoux/albert-chinese-large-qa", task_type="question_answering", gpu=True)
Then you can save and load that Inferencer with .save("foo/bar") and .load("foo/bar")
If your transformers model is saved locally (e.g. via git lfs cloning) and you do not have internet access on that machine, you have to construct the FARM Reader model manually. Here is the code for it:
from pathlib import Path

from farm.modeling.adaptive_model import AdaptiveModel
from farm.utils import initialize_device_settings
from farm.infer import Inferencer, QAInferencer
from farm.data_handler.processor import SquadProcessor
from farm.modeling.tokenization import Tokenizer

device, n_gpu = initialize_device_settings(use_cuda=True)
model_name = "local_models/xlm-roberta-base-squad2"

# Load a Tokenizer
tokenizer = Tokenizer.load(pretrained_model_name_or_path=model_name)

# Stick it into a Processor
label_list = ["start_token", "end_token"]
processor = SquadProcessor(
    tokenizer=tokenizer,
    max_seq_len=256,
    label_list=label_list,
    data_dir=Path("../data/squad20"),
)

# Convert the local transformers model to FARM style
model = AdaptiveModel.convert_from_transformers(
    model_name, device=device, task_type="question_answering"
)

# Put everything into an Inferencer - at this point you have a Reader model
infer = Inferencer(model=model, processor=processor, task_type="question_answering")

# Test the Reader model
QA_input = [
    {
        "questions": ["Who counted the game among the best ever made?"],
        "text": "Twilight Princess was released to universal critical acclaim and commercial success. It received perfect scores from major publications such as 1UP.com, Computer and Video Games, Electronic Gaming Monthly, Game Informer, GamesRadar, and GameSpy. On the review aggregators GameRankings and Metacritic, Twilight Princess has average scores of 95% and 95 for the Wii version and scores of 95% and 96 for the GameCube version. GameTrailers in their review called it one of the greatest games ever created.",
    }
]
print(infer.inference_from_dicts(dicts=QA_input)[0])
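For reference, pulling the top answer out of the returned dict can look like the sketch below. The field names ("predictions", "answers", "answer") reflect FARM's usual QA result shape but may differ between versions, and the sample values here are made up, not real model output:

```python
# Hypothetical result dict, shaped like FARM's QA output (field names are an
# assumption based on the usual FARM QA result format; values are invented).
result = {
    "predictions": [
        {
            "question": "Who counted the game among the best ever made?",
            "answers": [{"answer": "GameTrailers", "score": 12.3}],
        }
    ]
}

# The answers list is ranked, so index 0 is the highest-scoring span.
top_answer = result["predictions"][0]["answers"][0]["answer"]
print(top_answer)  # GameTrailers
```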
Btw what are you using this model for? Are you using it in haystack?
@Timoeller Using the first way, I get this error:
OSError: Can't load config for 'wptoux/albert-chinese-large-qa'. Make sure that:
- 'wptoux/albert-chinese-large-qa' is a correct model identifier listed on 'https://huggingface.co/models'
- or 'wptoux/albert-chinese-large-qa' is the correct path to a directory containing a config.json file
@Timoeller Using the second way, I also get an error:
01/28/2021 10:39:11 - INFO - farm.modeling.tokenization - Loading tokenizer of type 'AlbertTokenizer'
Traceback (most recent call last):
  File "../cov_demo.py", line 13, in <module>
    pretrained_model_name_or_path=model_name
  File ".\Anaconda3\lib\site-packages\farm\modeling\tokenization.py", line 83, in load
    ret = AlbertTokenizer.from_pretrained(pretrained_model_name_or_path, keep_accents=True, **kwargs)
  File ".\Anaconda3\lib\site-packages\transformers\tokenization_utils_base.py", line 1428, in from_pretrained
    return cls._from_pretrained(*inputs, **kwargs)
  File ".\Anaconda3\lib\site-packages\transformers\tokenization_utils_base.py", line 1575, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File ".\Anaconda3\lib\site-packages\transformers\tokenization_albert.py", line 155, in __init__
    self.sp_model.Load(vocab_file)
  File ".\AppData\Roaming\Python\Python36\site-packages\sentencepiece.py", line 118, in Load
    return _sentencepiece.SentencePieceProcessor_Load(self, filename)
TypeError: not a string
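The "TypeError: not a string" most likely means sentencepiece was handed None instead of a file path: AlbertTokenizer looks for a sentencepiece model file, which this repo (shipping a BERT-style vocab.txt instead) does not contain. A FARM-free sketch of that failure mode, using a stand-in function rather than the real sentencepiece API:

```python
# Stand-in for sentencepiece's Load(): it rejects anything that is not a
# string path with "TypeError: not a string". (Illustrative only — not the
# real sentencepiece implementation.)
def sp_load(vocab_file):
    if not isinstance(vocab_file, str):
        raise TypeError("not a string")
    return f"loaded {vocab_file}"

# When the repo has no spiece.model, the resolved vocab_file is None:
try:
    sp_load(None)
except TypeError as err:
    print(err)  # not a string
```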
My transformers model directory looks like this:
Hey I can replicate your problems with wptoux/albert-chinese-large-qa
As they state in the model card you have to use the BertTokenizer.
Important: use BertTokenizer
So you can load the model like:
infer = QAInferencer.load(model_name_or_path="wptoux/albert-chinese-large-qa", tokenizer_class="BertTokenizer", task_type="question_answering", gpu=True)
It does not work; you can try it yourself.
It works for me, that is why I posted it.
Which FARM version are you using? Do you have all requirements installed? What is your error?
farm 0.5.0, farm-haystack 0.6.0
Please update to the latest versions (haystack 0.7.0) and try again.
If you encounter a problem in haystack, please also raise the issue there; it is easier to track progress, help you, and let others find the solution to your problem as well.
Hmm, in China you must add a mirror:
AutoModel.from_pretrained('bert-base-uncased', mirror='tuna')
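For context, the mirror argument only swaps the base URL that model files are downloaded from. A rough sketch of that lookup; the URLs follow transformers' preset mirror table of that era and should be treated as assumptions:

```python
# Known mirror shortcuts (assumed values, modeled on transformers' preset
# mirror table for users in China).
PRESET_MIRRORS = {
    "tuna": "https://mirrors.tuna.tsinghua.edu.cn/hugging-face-models",
    "bfsu": "https://mirrors.bfsu.edu.cn/hugging-face-models",
}

def resolve_base_url(mirror=None, default="https://huggingface.co"):
    """Pick the download base URL: a known shortcut, a custom URL, or the default."""
    if mirror is None:
        return default
    # Unknown strings are passed through so a full custom URL also works.
    return PRESET_MIRRORS.get(mirror, mirror)

print(resolve_base_url("tuna"))  # https://mirrors.tuna.tsinghua.edu.cn/hugging-face-models
```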
Nice, thanks for the update.
So the issue is fixed? Closing now. Feel free to re open, or open an issue in haystack when it is related to the Reader Models there.
The model is this one: https://huggingface.co/wptoux/albert-chinese-large-qa
Edit Timo: So the problem was accessing the HF model hub from China. There you must add a mirror, as datalee pointed out.