MAIF / melusine

📧 Melusine: Use python to automatize your email processing workflow
https://maif.github.io/melusine

Bug NeuralModel ? #148

Closed aagostinelli86 closed 10 months ago

aagostinelli86 commented 1 year ago

Discussed in https://github.com/MAIF/melusine/discussions/147

Originally posted by **aagostinelli86** January 3, 2023

Hello @TFA-MAIF,

I have an issue with tutorial notebook no. 13, cell 33 (Camembert model):

"OSError: Model name 'jplu/tf-camembert-base' was not found in tokenizers model name list (camembert-base). We assumed 'jplu/tf-camembert-base' was a path, a model identifier, or url to a directory containing vocabulary files named ['sentencepiece.bpe.model'] but couldn't find such vocabulary files at this path or url."

I think it is due to the proxy, so I downloaded the pretrained model locally (config.json + tf_model.h5) and passed its path to your NeuralModel wrapper. The issue persists, with a different message:

"OSError: Model name 'C:\Users\vgkj536\git_projects\melusine\tutorial\jplu\tf-camembert-base' was not found in tokenizers model name list (camembert-base). We assumed 'C:\Users\vgkj536\git_projects\melusine\tutorial\jplu\tf-camembert-base' was a path, a model identifier, or url to a directory containing vocabulary files named ['sentencepiece.bpe.model'] but couldn't find such vocabulary files at this path or url."

By contrast, if I call TFCamembertModel.from_pretrained() directly with this local path, it works well. Is this due to a bug?

![image](https://user-images.githubusercontent.com/16646417/210375984-3b2b11af-bd62-4e2e-b2a7-4ae3b02babda.png)

TFA-MAIF commented 1 year ago

Hello,

Calling jplu's model directly (it is hosted on Hugging Face) is not possible if you are behind a proxy. For it to work, you need to have your BERT model stored locally and point to that local path. I will let my colleagues in charge of support (e.g. @hugo-quantmetry) assist you in digging into your bug.
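
When working from a local copy, it can help to confirm that the folder holds every file the loaders look for. The error quoted in this thread is raised while loading the tokenizer and names `sentencepiece.bpe.model`, so one possible explanation (not confirmed in this thread) is that the local folder has the config and weights but no vocabulary file. The sketch below only checks for file presence. The helper name `missing_model_files` is hypothetical, and `tf_model.h5` is assumed to be the name of the TensorFlow weights file.

```python
from pathlib import Path

# File names drawn from the thread: the error message asks for
# 'sentencepiece.bpe.model', and the post mentions config.json plus the
# TensorFlow weights. 'tf_model.h5' is an assumed name for the weights file.
REQUIRED_FILES = ("config.json", "tf_model.h5", "sentencepiece.bpe.model")


def missing_model_files(model_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from model_dir.

    Hypothetical helper for illustration; it is not part of melusine
    or transformers.
    """
    directory = Path(model_dir)
    return [name for name in required if not (directory / name).is_file()]
```

Calling `missing_model_files(...)` with the local model folder returns an empty list when all three files are present. Any name it returns is a file to download into that folder before passing the path to the wrapper.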

Best regards, Tiphaine

HugoPerrier commented 10 months ago

New melusine v3.0.0 is out. Closing legacy issues