Open Emulator000 opened 1 year ago
Also, we should upload the Rust ot
model into Huggingface's space in order to work!
Thank you @Emulator000 for identifying the issue with T5 models and submitting this PR. I have opened a small PR to address the issue without commenting out the input prompt filtering that is needed for decoder models. The suggested changes should work for both encoder-decoders (such as T5) and decoders (such as DialoGPT) - please have a look and merge if this is fine - this should update this PR at the same time.
It would be good to have the rust-version of the GODEL models pushed to the Hugging Face model hub before this gets merged since they are registered in the library. Can you please submit a PR to add the rust weights to the upstream repositories?
When this is completed it would also be great to have a test with the smallest GODEL model, please consider including one as part of the T5 test suite.
Thanks!
please have a look and merge if this is fine - this should update this PR at the same time.
Definitely good and it works, I just tested it!
It would be good to have the rust-version of the GODEL models pushed to the Hugging Face model hub before this gets merged since they are registered in the library. Can you please submit a PR to add the rust weights to the upstream repositories?
I've already converted it from Microsoft's original weights but how can I push into their upstream repository? Where can I open the PR?
When this is completed it would also be great to have a test with the smallest GODEL model, please consider including one as part of the T5 test suite.
Sure, I'll add tests for this right now 😄
@guillaume-be tests added!
The only weird thing that I see here is the output from GODEL (you can see in the tests), for some strange reason, every generated texts starts with a blank space
and I don't understand why, is something related to the padding?
@guillaume-be should we have to ping Microsoft someway in order to get PRs merged?
@Emulator000 yes it seems notifications sometimes don't get through in the model hub. What has worked well in the past is leaving an issue on the model repo (in this case https://github.com/microsoft/GODEL/issues) and hope to be able to reach the Hugging Face repository organization owners.
Just opened an issue here 💪🏻
This PR adds GODEL support as mentioned in #324 issue.
Notes I just tested with a local download copy of the GODEL base model and it works except if I change this line: https://github.com/guillaume-be/rust-bert/blob/9fd7983878d500cd0d6abc2fc13696aa8618bfc6/src/pipelines/conversation.rs#L918
Unfortunately it panics with:
If instead for testing purposes I put something like:
It works greatly!
Could someone understand why we have this different behavior between GPT2 and T5 models? Is like we don't have the input sequence in the generated sequence so we don't have to remove it with padding.