NVIDIA / FasterTransformer

Transformer-related optimization, including BERT, GPT

How to use stop_words_list, bad_words_list in GPT-2, PyTorch #354

Open AnShengqiang opened 1 year ago

AnShengqiang commented 1 year ago

Hello, thank you for providing such a great tool. 👍

We saw these two parameters (stop_words_list, bad_words_list) on this page and used this code (link) to add them, but they don't take effect.

We need this feature and hope to be able to use it, thanks~

byshiue commented 1 year ago

You can convert the strings of stop/bad words to ids with https://github.com/NVIDIA/FasterTransformer/blob/main/examples/pytorch/gpt/utils/word_list.py

More examples are in fastertransformer_backend (https://github.com/triton-inference-server/fastertransformer_backend/blob/main/docs/gpt_guide.md, https://github.com/triton-inference-server/fastertransformer_backend/blob/main/tools/gpt/end_to_end_test.py)
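For reference, here is a minimal self-contained sketch of the conversion that word_list.py performs, assuming a Hugging Face GPT-2 tokenizer and the two-row ids/offsets layout discussed further down in this thread; `words_to_ft_format` is a hypothetical stand-in, not the helper's actual name or signature:

```python
import numpy as np
from transformers import GPT2Tokenizer

def words_to_ft_format(words, tokenizer):
    """Flatten a list of stop/bad words into a [2, N] array:
    row 0 is every word's token ids concatenated, row 1 is the
    exclusive end offset of each word within row 0, padded with -1."""
    flat_ids, offsets = [], []
    for word in words:
        ids = tokenizer.encode(word)
        flat_ids.extend(ids)
        offsets.append(len(flat_ids))  # one past this word's last token
    offsets.extend([-1] * (len(flat_ids) - len(offsets)))  # pad to equal length
    return np.array([flat_ids, offsets], dtype=np.int32)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
stop_words = words_to_ft_format(["stop here", "halt"], tokenizer)
print(stop_words.shape)  # (2, N)
```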

nrakover commented 1 year ago

@byshiue is there any explicit documentation anywhere for these parameters? While the links you shared are helpful for pattern matching, I haven't found any explanation of the semantics of stop_words_list, nor a description of how to interpret the parameter format. For instance, what does the "offset" represent? Thanks in advance!

samuelbacaner commented 1 year ago

For all those that want to understand the stop_words_list, take a look at this detailed description here.
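To summarize the layout (my reading of that description): each batch entry is a pair of equal-length rows, where row 0 concatenates the token ids of all words and row 1 stores, for each word, the offset just past its last token in row 0, padded with -1. The token ids below are hypothetical:

```python
# Hypothetical tokenization: word_0 -> [5, 6, 7], word_1 -> [8, 9]
stop_words_list = [
    [5, 6, 7, 8, 9],     # row 0: all token ids, concatenated
    [3, 5, -1, -1, -1],  # row 1: exclusive end offset of each word, -1 padded
]
```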

llsj14 commented 1 year ago

Thank you for the detailed guidance. I had the same problem and your solution helped me a lot.

In addition, since stop_words_list and bad_words_list are converted to torch.Tensor in fastertransformer_backend, I also had to convert bad_words_list with torch.IntTensor(...).cuda() before passing it as an argument to ParallelGptOp::forward in FasterTransformer:

```python
torch.IntTensor(to_word_list_format(bad_words_dict)).cuda()
```
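Putting the pieces together, a sketch of the end-to-end flow; the `gpt_op` object and the `bad_words_list` keyword are assumptions, since the exact forward signature depends on the FasterTransformer version and build:

```python
import torch
from utils.word_list import to_word_list_format  # path relative to examples/pytorch/gpt

# One comma-separated string of banned words per batch item,
# following the format used in end_to_end_test.py (an assumption).
bad_words_dict = [["bad word, another bad word"]]

bad_words_list = torch.IntTensor(to_word_list_format(bad_words_dict)).cuda()

# Hypothetical call site: pass the tensor into the GPT op's forward;
# the actual keyword may differ across FasterTransformer versions.
# output_ids = gpt_op.forward(..., bad_words_list=bad_words_list)
```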