manisnesan / til

Collection of Today I Learned scripts

Transformers / LLMs from scratch #73

Open manisnesan opened 10 months ago

manisnesan commented 10 months ago

https://blog.matdmiller.com/posts/2023-06-10_transformers/notebook.html

X post from Jeremy
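
The linked notebook builds a transformer from scratch. As a rough companion sketch of the core operation it covers (not code from the notebook; names and shapes here are illustrative), scaled dot-product attention in plain PyTorch:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: (batch, seq_len, d_k); mask: bool (seq_len, seq_len), True = attend."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5        # (batch, seq, seq)
    if mask is not None:
        scores = scores.masked_fill(~mask, float("-inf"))
    weights = F.softmax(scores, dim=-1)                  # each row sums to 1
    return weights @ v                                   # (batch, seq, d_k)

x = torch.randn(2, 5, 64)                                # toy batch
causal = torch.tril(torch.ones(5, 5, dtype=torch.bool))  # decoder-style mask
out = scaled_dot_product_attention(x, x, x, mask=causal)
print(out.shape)  # torch.Size([2, 5, 64])
```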

manisnesan commented 10 months ago

Related: rasbt's (Sebastian Raschka) X post announcing his book Build a Large Language Model (From Scratch)


manisnesan commented 8 months ago

Related: xFormers

xFormers is a PyTorch-based library that offers flexible, optimized building blocks for Transformer models [4]. It provides interoperable components that can be combined to build state-of-the-art architectures [2]. It is efficient in both inference and training, offering faster attention blocks with reduced memory consumption [2]. Researchers in domains such as NLP and vision benefit from its customizable building blocks, research-oriented components, and efficient design [1]. It ships custom CUDA kernels and dispatches to other libraries when appropriate [1].

In summary, xFormers lets researchers build cutting-edge models efficiently through customizable, research-first components with an emphasis on speed and memory efficiency [1][2][4].

Sources:
[1] facebookresearch/xformers: Hackable and optimized Transformers ... - https://github.com/facebookresearch/xformers
[2] xFormers - Hugging Face - https://huggingface.co/docs/diffusers/en/optimization/xformers
[3] Can someone ELI5 transformers and the “Attention is all we need” paper? - https://news.ycombinator.com/item?id=35977891
[4] Welcome to xFormers's documentation! - https://facebookresearch.github.io/xformers/
[5] [PDF] USING TRANSFORMERS TO PREDICT CUSTOMER ... - https://open.library.ubc.ca/media/stream/pdf/24/1.0422968/4

Source: perplexity.ai
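
A minimal usage sketch of the memory-efficient attention building block described above. This follows the xFormers docs, but the (batch, seq, heads, head_dim) tensor layout and the LowerTriangularMask helper reflect current xformers.ops and may differ across versions:

```python
import torch
import xformers.ops as xops

# xFormers expects (batch, seq_len, num_heads, head_dim) tensors; CUDA + fp16 shown here
q = torch.randn(1, 1024, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 1024, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 1024, 8, 64, device="cuda", dtype=torch.float16)

# Dispatches to an optimized kernel; avoids materializing the full attention matrix
out = xops.memory_efficient_attention(q, k, v)

# Causal (decoder-style) variant via an attention-bias object
out_causal = xops.memory_efficient_attention(q, k, v, attn_bias=xops.LowerTriangularMask())
print(out.shape)  # torch.Size([1, 1024, 8, 64])
```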

manisnesan commented 7 months ago

Related: understanding the encoder and decoder components of the Transformer architecture

(image: encoder/decoder diagram)
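
A minimal sketch of the split using PyTorch's stock nn.TransformerEncoder / nn.TransformerDecoder (dimensions are illustrative): the encoder runs bidirectional self-attention over the source, while the decoder combines causal self-attention over the target with cross-attention to the encoder output.

```python
import torch
import torch.nn as nn

d_model, nhead = 64, 4
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), num_layers=2)
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), num_layers=2)

src = torch.randn(1, 10, d_model)   # source sequence embeddings
tgt = torch.randn(1, 7, d_model)    # target-so-far embeddings

# Encoder: bidirectional self-attention over the whole source
memory = encoder(src)

# Decoder: causal self-attention over tgt + cross-attention to the encoder output
causal = torch.triu(torch.ones(7, 7, dtype=torch.bool), diagonal=1)  # True = masked
out = decoder(tgt, memory, tgt_mask=causal)
print(memory.shape, out.shape)  # torch.Size([1, 10, 64]) torch.Size([1, 7, 64])
```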