scene-the-ella opened 1 year ago
Here is more recent news on LLaMA, which Facebook recently shared.
High-resolution image reconstruction with latent diffusion models from human brain activity bioRxiv: https://www.biorxiv.org/content/10.1101/2022.11.18.517004 Website: https://sites.google.com/view/stablediffusion-with-brain/
Sharing a paper that has been a big topic on Twitter since yesterday. The study shows that when an L2-regularized linear model (???) is trained on fMRI brain signals to map them onto Stable Diffusion's image and text latent encodings, images similar to the ones shown to the subject can be reconstructed.
Thousands of images are needed per subject, and each model is expected to work for only one subject, and probably only one scanner. Even so, showing that reconstruction is possible from brain signals without any deep-learning training on them, using only a pretrained model plus a learned linear model, seems likely to be highly impactful. That said, it can only be trusted once reproducibility has been confirmed.
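The linear-mapping step described above can be sketched as a closed-form ridge regression. This is a minimal illustration, not the paper's code: the data here is random stand-in data, and the shapes, `ridge_fit`, and the regularization strength are all assumptions for illustration only.

```python
import numpy as np
from numpy.linalg import solve

# Hypothetical shapes: n trials of fMRI voxel features -> latent targets.
rng = np.random.default_rng(0)
n_trials, n_voxels, latent_dim = 200, 1000, 64
X = rng.standard_normal((n_trials, n_voxels))    # stand-in fMRI features
Z = rng.standard_normal((n_trials, latent_dim))  # stand-in diffusion latents

def ridge_fit(X, Z, alpha=1.0):
    """Closed-form L2-regularized linear map W such that X @ W ~= Z."""
    d = X.shape[1]
    return solve(X.T @ X + alpha * np.eye(d), X.T @ Z)

W = ridge_fit(X, Z, alpha=10.0)
Z_pred = X @ W  # predicted latents, which a diffusion decoder would render
print(Z_pred.shape)  # (200, 64)
```

In the paper's setting the predicted latents would then be fed to Stable Diffusion's decoder; the point of the sketch is only that the brain-to-latent mapping itself is a plain regularized linear fit.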
Dropout Reduces Underfitting ArXiv: https://arxiv.org/abs/2303.01500 GitHub: https://github.com/facebookresearch/dropout
Consistency Models ArXiv: https://arxiv.org/abs/2303.01469
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages ArXiv: https://arxiv.org/abs/2303.01037
Full Stack Optimization of Transformer Inference: a Survey ArXiv: https://arxiv.org/abs/2302.14017
Sharing a well-organized survey paper on the hardware and software optimizations, and the open issues, involved in Transformer model inference.
- Trains interactive text generation using a user simulator
- Emphasizes that interactive methods are important because generation models produce output in one shot and therefore suffer from problems such as hallucination
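The loop the bullets above describe can be sketched as: a generator drafts, a simulated user critiques, and the generator revises. Everything here is hypothetical scaffolding, not the paper's implementation; `generate` and `user_simulator` are stand-ins for a real LLM call and a learned simulator.

```python
def generate(prompt, feedback=None):
    # Stand-in for an LLM call; a real system would query a model here.
    draft = f"Draft for: {prompt}"
    if feedback:
        draft += f" (revised per: {feedback})"
    return draft

def user_simulator(draft):
    # Stand-in critic: returns feedback, or None when satisfied.
    return None if "[source]" in draft else "please cite a source"

def interactive_generate(prompt, max_rounds=3):
    """Generate-critique-revise loop instead of one-shot generation."""
    feedback = None
    for _ in range(max_rounds):
        draft = generate(prompt, feedback)
        feedback = user_simulator(draft)
        if feedback is None:
            break
    return draft
```

The contrast with one-shot generation is the loop: feedback from the (simulated) user is folded back into the next draft, which is the mechanism the paper argues mitigates problems like hallucination.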
News
Conferences
A Response Strategy for the Hyperscale AI Era Sparked by ChatGPT
AI Future Forum Hyperscale AI Webinar Series 2: On the Hyperscale AI Business Ecosystem
POE: Try chatbots built on four large language models - you can try Claude too!
Must read: the 100 most cited AI papers in 2022
2022 1️⃣ AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models -> (From DeepMind, 1372 citations) Using AlphaFold to augment protein structure database coverage.
2️⃣ ColabFold: making protein folding accessible to all -> (From multiple institutions, 1162 citations) An open-source and efficient protein folding model.
3️⃣ Hierarchical Text-Conditional Image Generation with CLIP Latents -> (From OpenAI, 718 citations) DALL·E 2, complex prompted image generation that left most in awe.
4️⃣ A ConvNet for the 2020s -> (From Meta and UC Berkeley, 690 citations) A successful modernization of CNNs at a time of boom for Transformers in Computer Vision.
5️⃣ PaLM: Scaling Language Modeling with Pathways -> (From Google, 452 citations) Google's mammoth 540B Large Language Model, a new MLOps infrastructure, and how it performs.
2021 1️⃣ Highly accurate protein structure prediction with AlphaFold -> (From DeepMind, 8965 citations) AlphaFold, a breakthrough in protein structure prediction using Deep Learning.
2️⃣ Swin Transformer: Hierarchical Vision Transformer using Shifted Windows -> (From Microsoft, 4810 citations) A robust variant of Transformers for Vision.
3️⃣ Learning Transferable Visual Models From Natural Language Supervision -> (From OpenAI, 3204 citations) CLIP, image-text pairs at scale to learn joint image-text representations in a self-supervised fashion.
4️⃣ On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? -> (From U. Washington, Black in AI, The Aether, 1266 citations) Famous position paper very critical of the trend of ever-growing language models, highlighting their limitations and dangers.
5️⃣ Emerging Properties in Self-Supervised Vision Transformers -> (From Meta, 1219 citations) DINO, showing how self-supervision on images led to the emergence of some sort of proto-object segmentation in Transformers.
2020 1️⃣ An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale -> (From Google, 11914 citations) The first work showing how a plain Transformer could do great in Computer Vision.
2️⃣ Language Models are Few-Shot Learners -> (From OpenAI, 8070 citations) GPT-3. This paper does not need further explanation at this stage.
3️⃣ YOLOv4: Optimal Speed and Accuracy of Object Detection -> (From Academia Sinica, Taiwan, 8014 citations) Robust and fast object detection sells like hotcakes.
4️⃣ Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer -> (From Google, 5906 citations) A rigorous study of transfer learning with Transformers, resulting in the famous T5.
5️⃣ Bootstrap your own latent: A new approach to self-supervised Learning -> (From DeepMind and Imperial College, 2873 citations) Showing that negatives are not even necessary for representation learning.
ArXiv