Closed JiyangZhang closed 2 years ago
Hi, the work is still in progress. Hopefully, we will be finished by mid-November. We will make an announcement in this repository. Stay tuned!
Hi,
I also encountered this problem and would like to know whether the pre-trained model is ready for use through the Transformers library.
Thanks a lot!
PLBart is available starting from transformers v4.17.0; see the release notes. We will make a formal announcement soon.
```python
from transformers import PLBartTokenizer, PLBartForConditionalGeneration


def load_model_and_tokenizer(model_name_or_path):
    tokenizer = PLBartTokenizer.from_pretrained(model_name_or_path, language_codes="base")
    model = PLBartForConditionalGeneration.from_pretrained(model_name_or_path)
    return model, tokenizer


def translate(
    model_name_or_path,
    input_sequences,
    src_lang=None,
    tgt_lang=None,
    max_generation_length=128,
    num_beams=10,
    num_return_sequences=1,
):
    model, tokenizer = load_model_and_tokenizer(model_name_or_path)
    if src_lang:
        tokenizer.src_lang = src_lang
    decoder_start_token_id = None
    if tgt_lang:
        # Start decoding with the target language code (e.g. "en_XX")
        decoder_start_token_id = tokenizer.lang_code_to_id[tgt_lang]
    inputs = tokenizer(input_sequences, return_tensors="pt", padding=True)
    outputs = model.generate(
        **inputs,
        decoder_start_token_id=decoder_start_token_id,
        max_length=max_generation_length,
        num_beams=num_beams,
        num_return_sequences=num_return_sequences,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)


src_lang, tgt_lang = "java", "en_XX"
model_name_or_path = "uclanlp/plbart-java-en_XX"
inputs = "static int sumDigits ( int no ) { return no == 0 ? 0 : no % 10 + sumDigits ( no / 10 ) ; }"
outputs = translate(
    model_name_or_path, [inputs], src_lang, tgt_lang
)
print("\n".join(outputs))  # Returns the sum of digits of the given number.
```
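Since the PLBart classes only exist from transformers v4.17.0 onward, it can help to guard against older installs before loading the model. A minimal sketch (the `plbart_supported` helper is my own, not part of transformers, and it assumes a plain "X.Y.Z"-style version string):

```python
def plbart_supported(installed_version: str) -> bool:
    """Return True if a transformers version string is at least 4.17.0,
    the first release that includes PLBart."""
    # Naive numeric comparison; pre-release suffixes such as "rc1"
    # are not handled by this sketch.
    parts = tuple(int(p) for p in installed_version.split(".")[:3])
    parts += (0,) * (3 - len(parts))  # pad short versions like "4.17"
    return parts >= (4, 17, 0)
```

For example, checking `transformers.__version__` with this helper before calling `from_pretrained` gives a clearer failure message than the `KeyError: 'plbart'` an older install raises.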
Hi,
I noticed that you uploaded the model to the Hugging Face Transformers library, which is super exciting! However, when I tried to use it through transformers following the provided guidance, I got the following error:
```
>>> model = AutoModelForSeq2SeqLM.from_pretrained("uclanlp/plbart-base")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/anaconda3/envs/mdt/lib/python3.7/site-packages/transformers/models/auto/auto_factory.py", line 397, in from_pretrained
    pretrained_model_name_or_path, return_unused_kwargs=True, **kwargs
  File "/home/anaconda3/envs/mdt/lib/python3.7/site-packages/transformers/models/auto/configuration_auto.py", line 529, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/home/anaconda3/envs/mdt/lib/python3.7/site-packages/transformers/models/auto/configuration_auto.py", line 278, in __getitem__
    raise KeyError(key)
KeyError: 'plbart'
```
This is the code I used:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("uclanlp/plbart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("uclanlp/plbart-base")
```
Here is my configuration:
- python: 3.7.11
- pytorch: 1.9.1
- transformers: 4.11.3
Thanks.