manisnesan opened this issue 10 months ago
Related: Sebastian Raschka's (rasbt) X post announcing his book *Build a Large Language Model (From Scratch)*
Related: xFormers
xFormers is a PyTorch-based library that offers flexible, optimized building blocks for building state-of-the-art Transformer models[4]. Its components are interoperable and can be combined to build advanced models for a range of applications[2]. xFormers is efficient in both inference and training, offering faster attention blocks with reduced memory consumption[2]. Researchers in domains such as NLP and vision benefit from its customizable building blocks, research-oriented components, and efficient design[1]. The library uses custom CUDA kernels where available and dispatches to other libraries when necessary[1].
In summary, xFormers is a versatile library that empowers researchers to build cutting-edge models efficiently by providing customizable building blocks, research-first components, and an emphasis on speed and memory efficiency[1][2][4].
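As a rough sketch of what these attention blocks compute, here is a minimal NumPy reference implementation of scaled dot-product attention, the core operation that xFormers' `memory_efficient_attention` accelerates (the real API operates on PyTorch tensors; the shapes and names below are illustrative, not xFormers' actual interface):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v):
    """Reference attention: softmax(Q K^T / sqrt(d)) V.

    xFormers computes the same result but avoids materializing the
    full (seq_q x seq_k) score matrix, which is where the memory
    savings come from.
    """
    d = q.shape[-1]
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d)  # (..., seq_q, seq_k)
    return softmax(scores) @ v

# Toy shapes: batch=2, seq_len=4, head_dim=8
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((2, 4, 8)) for _ in range(3))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (2, 4, 8)
```

In xFormers itself the equivalent call is `xformers.ops.memory_efficient_attention(query, key, value)` on CUDA tensors; the reference above is only for understanding what that kernel computes.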
Sources
[1] facebookresearch/xformers: Hackable and optimized Transformers ... https://github.com/facebookresearch/xformers
[2] xFormers - Hugging Face https://huggingface.co/docs/diffusers/en/optimization/xformers
[3] Can someone ELI5 transformers and the "Attention is all we need" paper? https://news.ycombinator.com/item?id=35977891
[4] Welcome to xFormers's documentation! https://facebookresearch.github.io/xformers/
[5] [PDF] USING TRANSFORMERS TO PREDICT CUSTOMER ... https://open.library.ubc.ca/media/stream/pdf/24/1.0422968/4
Summary generated via perplexity.ai
Related: Mat Miller's transformers notebook https://blog.matdmiller.com/posts/2023-06-10_transformers/notebook.html
X post from Jeremy