guillaume-be / rust-bert

Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
https://docs.rs/crate/rust-bert
Apache License 2.0
2.67k stars 216 forks source link

Added GODEL support with T5 model type #376

Open Emulator000 opened 1 year ago

Emulator000 commented 1 year ago

This PR adds GODEL support as mentioned in #324 issue.

Notes I just tested with a local download copy of the GODEL base model and it works except if I change this line: https://github.com/guillaume-be/rust-bert/blob/9fd7983878d500cd0d6abc2fc13696aa8618bfc6/src/pipelines/conversation.rs#L918

Unfortunately it panics with:

thread 'main' panicked at 'range start index 6 out of range for slice of length 5', /home/dario/projects/rust-bert/src/pipelines/conversation.rs:940:43

If instead for testing purposes I put something like:

let generated_response = generated_sequence.as_slice();

It works greatly!

Could someone understand why we have this different behavior between GPT2 and T5 models? Is like we don't have the input sequence in the generated sequence so we don't have to remove it with padding.

Emulator000 commented 1 year ago

Also, we should upload the Rust ot model into Huggingface's space in order to work!

guillaume-be commented 1 year ago

Thank you @Emulator000 for identifying the issue with T5 models and submitting this PR. I have opened a small PR to address the issue without commenting out the input prompt filtering that is needed for decoder models. The suggested changes should work for both encoder-decoders (such as T5) and decoders (such as DialoGPT) - please have a look and merge if this is fine - this should update this PR at the same time.

It would be good to have the rust-version of the GODEL models pushed to the Hugging Face model hub before this gets merged since they are registered in the library. Can you please submit a PR to add the rust weights to the upstream repositories?

When this is completed it would also be great to have a test with the smallest GODEL model, please consider including one as part of the T5 test suite.

Thanks!

Emulator000 commented 1 year ago

please have a look and merge if this is fine - this should update this PR at the same time.

Definitely good and it works, I just tested it!

It would be good to have the rust-version of the GODEL models pushed to the Hugging Face model hub before this gets merged since they are registered in the library. Can you please submit a PR to add the rust weights to the upstream repositories?

I've already converted it from Microsoft's original weights but how can I push into their upstream repository? Where can I open the PR?

When this is completed it would also be great to have a test with the smallest GODEL model, please consider including one as part of the T5 test suite.

Sure, I'll add tests for this right now 😄

Emulator000 commented 1 year ago

Nevermind for the upstream, already opened two PRs for the GODEL's base and large models:

Emulator000 commented 1 year ago

@guillaume-be tests added!

The only weird thing that I see here is the output from GODEL (you can see in the tests), for some strange reason, every generated texts starts with a blank space and I don't understand why, is something related to the padding?

Emulator000 commented 1 year ago

@guillaume-be should we have to ping Microsoft someway in order to get PRs merged?

guillaume-be commented 1 year ago

@Emulator000 yes it seems notifications sometimes don't get through in the model hub. What has worked well in the past is leaving an issue on the model repo (in this case https://github.com/microsoft/GODEL/issues) and hope to be able to reach the Hugging Face repository organization owners.

Emulator000 commented 1 year ago

Just opened an issue here 💪🏻