Open justheuristic opened 2 years ago
This is a master discussion for memory-efficient inference; further notes will be added shortly.
Current quest stage: add a dummy cache that is passed to all attention layers.
Ideally, this should be exposed as a `.generate` method on `LeanGPTForPreTraining`: https://github.com/learning-at-home/lean_transformer/blob/main/lean_transformer/models/gpt.py#L184
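To make the quest step concrete, here is a minimal sketch of what a dummy (no-op) attention cache could look like. All names here (`DummyAttentionCache`, `update`, `attention_with_cache`) are hypothetical illustrations, not the actual lean_transformer API; the point is only that a pass-through cache can be threaded into every attention layer without changing outputs, so the real key/value caching logic can be filled in later.

```python
import torch

class DummyAttentionCache:
    """Hypothetical no-op cache: stores nothing, passes keys/values through.

    A real cache would concatenate new keys/values with previously stored
    ones along the sequence dimension; this dummy just returns its inputs,
    so wiring it through the model must not change any outputs.
    """

    def update(self, new_keys: torch.Tensor, new_values: torch.Tensor):
        return new_keys, new_values


def attention_with_cache(query, key, value, cache=None):
    """Toy single-head attention that consults an optional cache."""
    if cache is not None:
        key, value = cache.update(key, value)
    scores = query @ key.transpose(-2, -1) / key.shape[-1] ** 0.5
    weights = torch.softmax(scores, dim=-1)
    return weights @ value


# With the dummy cache, the output matches the cache-free path exactly.
q = torch.randn(1, 4, 8)
k = torch.randn(1, 4, 8)
v = torch.randn(1, 4, 8)
out_plain = attention_with_cache(q, k, v)
out_cached = attention_with_cache(q, k, v, cache=DummyAttentionCache())
assert torch.allclose(out_plain, out_cached)
```

Once the dummy cache is threaded through all layers and a `.generate` loop calls the model token by token, `update` can be swapped for real key/value concatenation without touching the attention code again.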