AkihikoWatanabe / paper_notes

Occasionally updated paper notes
https://AkihikoWatanabe.github.io/paper_notes

Sohu, etched, 2024.06 #1399

AkihikoWatanabe opened this issue 2 days ago

AkihikoWatanabe commented 2 days ago

https://www.etched.com/announcing-etched

AkihikoWatanabe commented 2 days ago

By burning the transformer architecture into our chip, we can’t run most traditional AI models: the DLRMs powering Instagram ads, protein-folding models like AlphaFold 2, or older image models like Stable Diffusion 2. We can’t run CNNs, RNNs, or LSTMs either.

So the chip can't run most models other than transformers, but in exchange it reportedly delivers inference 20x faster than an H100.

With over 500,000 tokens per second in Llama 70B throughput, Sohu lets you build products impossible on GPUs.

No, no, no, 0.5M tokens/sec on Llama 70B is way too fast!!!
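
As a rough sanity check (my own back-of-envelope, not from the announcement): with the standard estimate of ~2 FLOPs per parameter per generated token, a 70B-parameter model costs about 1.4e11 FLOPs per token, so 500,000 tokens/sec implies roughly 70 PFLOP/s of sustained compute. The figures below (70e9 parameters, ~2 PFLOP/s dense FP8 peak for an H100) are my assumptions.

```python
# Back-of-envelope check of the "500,000 tokens/sec on Llama 70B" claim.
# Assumptions (mine, not from the announcement):
#   - ~70e9 parameters
#   - ~2 FLOPs per parameter per generated token (standard decode estimate)
#   - H100 dense FP8 peak of ~2e15 FLOP/s

PARAMS = 70e9
FLOPS_PER_TOKEN = 2 * PARAMS              # ~1.4e11 FLOPs per token
CLAIMED_TOKENS_PER_SEC = 500_000

required_flops = FLOPS_PER_TOKEN * CLAIMED_TOKENS_PER_SEC  # ~7e16 FLOP/s
H100_PEAK_FP8 = 2e15                                       # FLOP/s, dense

print(f"Required compute: {required_flops / 1e15:.0f} PFLOP/s")   # ~70 PFLOP/s
print(f"H100s at 100% utilization: {required_flops / H100_PEAK_FP8:.0f}")  # ~35
```

So even at perfect utilization this would take ~35 H100s of dense FP8 compute, and real decode workloads run far below peak. Presumably the 500k tokens/sec figure is for a multi-chip Sohu server rather than a single chip.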