Closed: pragnakalpdev6 closed this issue 4 years ago
I am working with the GPT-2 1.5B model. It is taking too much time for inference. How can I decrease the time taken by the model? How can I optimize my model?

Unfortunately, the 1.5B model is just really, really big. You can batch your predictions or run on faster hardware, but there isn't much more you can do. Maybe there is some way to use distillation methods or the like to reduce the model size, but I'm not familiar with any research doing so with the GPT-2 model specifically.