triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

GPT NeoX 20B #4408

rtalaricw closed this issue 1 year ago

rtalaricw commented 2 years ago

Can a converter for GPT NeoX 20B be written to convert from PyTorch to Triton format?

krishung5 commented 2 years ago

@CoderHam Are you familiar enough with this to provide an answer?

dyastremsky commented 1 year ago

Closing due to inactivity. Please let us know if you would like to reopen the issue for follow-up.

CoderHam commented 1 year ago

For posterity: Triton is not a model format. The format used by the PyTorch backend is TorchScript. Please refer to the PyTorch docs or open an issue on their GitHub repo.
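In other words, the "conversion" in question is a standard TorchScript export, which Triton's PyTorch backend then serves as `model.pt`. A minimal sketch of that export, using a tiny stand-in module rather than the real GPT NeoX 20B weights (loading the actual checkpoint is out of scope here and the class name below is hypothetical):

```python
import torch

# Hypothetical tiny module standing in for GPT NeoX 20B; the real model
# would be loaded from its published checkpoint before exporting.
class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 2)

    def forward(self, x):
        return self.linear(x)

model = TinyModel().eval()
example_input = torch.randn(1, 8)

# torch.jit.trace records the forward pass and produces a TorchScript module.
traced = torch.jit.trace(model, example_input)

# Triton's PyTorch backend expects the file at
# <model_repository>/<model_name>/<version>/model.pt
traced.save("model.pt")
```

Alongside `model.pt`, the model directory also needs a `config.pbtxt` describing the input and output tensors, per the Triton model-repository layout.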

leemgs commented 1 year ago

FYI,