NVIDIA / FasterTransformer

Transformer related optimization, including BERT, GPT
Apache License 2.0

How to understand the codebase and apply my Python functions to this code #403

Open duongkstn opened 1 year ago

duongkstn commented 1 year ago

Hi, I am a Python and Huggingface transformers user. My model is based on T5/BART, but with additional generation functions I implemented myself in Python (adding more LogitsProcessor functions to the Huggingface code). How can I use my functions together with FasterTransformer properly? It is really hard for me to read the C++ code and understand the flow. Any advice?
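
For context, a custom LogitsProcessor in Huggingface transformers usually looks like the following minimal sketch; the class name and the banned-token logic are illustrative only, not part of FT or of the questioner's code.

```python
import torch
from transformers import LogitsProcessor, LogitsProcessorList

class BanTokensProcessor(LogitsProcessor):
    """Illustrative processor: forbid a fixed set of token ids by
    setting their scores to -inf before sampling."""

    def __init__(self, banned_token_ids):
        self.banned_token_ids = list(banned_token_ids)

    def __call__(self, input_ids: torch.LongTensor,
                 scores: torch.FloatTensor) -> torch.FloatTensor:
        # scores has shape (batch_size, vocab_size); mask the banned columns.
        scores[:, self.banned_token_ids] = float("-inf")
        return scores

# Passed to generate() via the logits_processor argument, e.g.:
# model.generate(input_ids,
#                logits_processor=LogitsProcessorList([BanTokensProcessor([3, 7])]))
```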

byshiue commented 1 year ago

There are two choices:

  1. Add your LogitsProcessor to the C++ code; this requires writing some CUDA kernels.
  2. Encapsulate the transformer block as a Python op; you can refer to the GptDecoderOp. This requires understanding the implementation details of FT (a sketch of this approach follows below).
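
As an illustration of option 2, here is a minimal sketch of a per-step decoding loop that runs the transformer block through an FT Python op and applies Huggingface-style LogitsProcessors between steps. The `ft_decoder_step` function is hypothetical, standing in for the actual FT op (such as the GptDecoderOp mentioned above); its loading and call signature depend on the FT build and are not shown here.

```python
import torch
from transformers import LogitsProcessorList

def ft_decoder_step(input_ids, step, cache):
    # Hypothetical stand-in for the FT transformer block exposed as a
    # Python op; replace with the actual FT op call for your build.
    raise NotImplementedError("wire up the FT Python op here")

def generate_with_custom_processors(input_ids, processors: LogitsProcessorList,
                                    max_new_tokens: int, eos_token_id: int):
    cache = None
    for step in range(max_new_tokens):
        # 1. Run one decoding step through the FT op to get raw logits
        #    of shape (batch_size, vocab_size).
        logits, cache = ft_decoder_step(input_ids, step, cache)
        # 2. Apply the same Huggingface-style processors in plain Python.
        logits = processors(input_ids, logits)
        # 3. Greedy pick; sampling would slot in the same way.
        next_token = logits.argmax(dim=-1, keepdim=True)
        input_ids = torch.cat([input_ids, next_token], dim=-1)
        if (next_token == eos_token_id).all():
            break
    return input_ids
```

The trade-off is that control returns to Python once per generated token, so you keep your existing LogitsProcessor code unchanged at some cost in kernel-launch overhead compared to a fully fused C++ decoding loop.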
TheExGenesis commented 1 year ago

> There are two choices:
>
>   1. Add your LogitsProcessor to the C++ code; this requires writing some CUDA kernels.
>   2. Encapsulate the transformer block as a Python op; you can refer to the GptDecoderOp. This requires understanding the implementation details of FT.

Would adding a logits processor require writing a CUDA kernel, or could it be plain C++? E.g., if it's just masking some weights?

byshiue commented 1 year ago

All operations in FT run on CUDA.
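
For intuition, the masking described above is a pointwise edit of the (batch_size, vocab_size) logits buffer, which lives on the GPU. In PyTorch that edit is a one-line indexing assignment (which itself launches a CUDA kernel under the hood); inside FT the same operation has to be written explicitly as a device-side kernel, since the logits never leave the GPU. A minimal PyTorch illustration, not FT code:

```python
import torch

# Hypothetical logits and banned ids, both resident on the GPU.
logits = torch.randn(4, 32000, device="cuda")     # (batch_size, vocab_size)
banned = torch.tensor([3, 7, 11], device="cuda")  # token ids to forbid

# This advanced-indexing assignment runs as a CUDA kernel under the hood;
# FT would need an equivalent hand-written kernel over its logits buffer.
logits[:, banned] = float("-inf")
```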