BandaiNamcoResearchInc / DistilBERT-base-jp

MIT License

About model_max_length #1

Closed · akirasosa closed this issue 4 years ago

akirasosa commented 4 years ago

Hi,

It looks like model_max_length returns VERY_LARGE_INTEGER (1000000000000000019884624838656). Is this the expected result?

from transformers import DistilBertModel, DistilBertTokenizer

model = DistilBertModel.from_pretrained('bandainamco-mirai/distilbert-base-japanese')
tokenizer = DistilBertTokenizer.from_pretrained('bandainamco-mirai/distilbert-base-japanese')
tokenizer.model_max_length
# => 1000000000000000019884624838656

I am using transformers 2.11.0.

Thanks for the nice model,
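For context, the huge value is transformers' VERY_LARGE_INTEGER sentinel, int(1e30), which the library falls back to when a checkpoint's tokenizer config declares no maximum length. A minimal workaround sketch, assuming DistilBERT's usual 512-token position-embedding limit applies to this checkpoint (an assumption, not confirmed by the maintainers):

from transformers import DistilBertTokenizer

VERY_LARGE_INTEGER = int(1e30)  # transformers' sentinel for "no limit configured"

tokenizer = DistilBertTokenizer.from_pretrained('bandainamco-mirai/distilbert-base-japanese')

# The hosted tokenizer config carries no max-length entry, so the
# library reports the sentinel instead of a usable limit.
if tokenizer.model_max_length >= VERY_LARGE_INTEGER:
    tokenizer.model_max_length = 512  # assumed limit; verify against the model config

With the clamp in place, calls that rely on model_max_length truncate at a realistic length instead of effectively never.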

z-lai commented 4 years ago

Hi akirasosa, Thank you very much for the notice. There was a significant change to the tokenizer policy starting with transformers 2.9.0, which may lead to unexpected behavior with the model above.

We are going to update the documentation for transformers versions after 2.9.0 in the near future. For now, you can try the previous version, transformers==2.8.0. Sorry for the inconvenience.

lai
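If pinning to 2.8.0 is not convenient, a version-agnostic sketch is to pass max_length explicitly at encode time, so nothing depends on model_max_length (the 512 limit is again an assumption about this checkpoint):

from transformers import DistilBertTokenizer

tokenizer = DistilBertTokenizer.from_pretrained('bandainamco-mirai/distilbert-base-japanese')

# Supplying max_length directly sidesteps the sentinel value entirely;
# in transformers 2.x this also triggers truncation to that length.
ids = tokenizer.encode('これはテストです。', max_length=512)

Either way, double-check the real limit against the checkpoint's configuration before relying on 512.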

akirasosa commented 4 years ago

Hi @z-lai, thanks for the quick response. It's no problem, I will try it.