ili0820 commented 2 years ago

News

- A Japanese local government employee who married Hatsune Miku

Arxiv

Solving ImageNet: a Unified Scheme for Training any Backbone to Top Results

Neighborhood Attention Transformer

“Does it come in black?” CLIP-like models are zero-shot recommenders

Etc.

hawe66 commented 2 years ago

Arxiv

DeepMind Neuroscience Series - 1

Social Diversity and Social Preferences in Mixed-Motive Reinforcement Learning

sunbi-s commented 2 years ago

News

naem1023 commented 2 years ago

Transformers4Rec

https://github.com/NVIDIA-Merlin/Transformers4Rec

https://nvidia-merlin.github.io/Transformers4Rec/main/model_definition.html

RQ Transformer

https://github.com/kakaobrain/rq-vae-transformer text-to-image model.

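minDALL-E: OpenAI's DALL-E

RQ-Transformer: an image generation model trained to generate an image by sequentially predicting its representation as a 3D code map. It lowers the computational cost and speeds up generation.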
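To picture the 3D code map, here is a minimal sketch of residual quantization, assuming scalar features and a single codebook shared across depths. The names `nearest`, `residual_quantize`, and `reconstruct`, and the codebook values, are hypothetical and are not taken from the kakaobrain repository. Each position of a 2D feature map gets a stack of `depth` codes, so the result has shape height x width x depth.

```python
def nearest(codebook, value):
    """Index of the codebook entry closest to value."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - value))

def residual_quantize(value, codebook, depth):
    """Quantize value, then repeatedly quantize what is left over (the residual)."""
    codes, residual = [], value
    for _ in range(depth):
        idx = nearest(codebook, residual)
        codes.append(idx)
        residual -= codebook[idx]
    return codes

def reconstruct(codes, codebook):
    """Approximate the original value as the sum of the selected codebook entries."""
    return sum(codebook[i] for i in codes)

# Hypothetical toy data: a 2 x 2 feature map with one scalar feature per position.
codebook = [-1.0, -0.25, 0.0, 0.25, 1.0]
feature_map = [[0.8, -1.3], [0.3, 0.1]]
depth = 3

# 3D code map: height x width x depth.
code_map = [[residual_quantize(v, codebook, depth) for v in row] for row in feature_map]

# Reconstruction error when using only the first code versus all codes.
err_depth1 = abs(feature_map[0][0] - reconstruct(code_map[0][0][:1], codebook))
err_full = abs(feature_map[0][0] - reconstruct(code_map[0][0], codebook))
```

In this sketch, deeper codes refine the approximation, so a small codebook can still represent a feature precisely. The sequential prediction of these codes by the transformer is not shown here.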

Grad Cache

https://github.com/luyug/GradCache

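Gradient accumulation cannot be applied directly to contrastive learning. In an ordinary classification setup, where the loss is a sum of per-example terms computed within each mini-batch, gradient accumulation works: you simply postpone the parameter update for n steps, accumulate the gradients, and apply them in a single update.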
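A minimal sketch of why accumulation is safe when the loss is a sum of per-example terms, assuming a toy 1-D linear model with squared-error loss. The model, data, and the name `grad_sum` are hypothetical and only serve as illustration.

```python
def grad_sum(w, batch):
    """Sum of per-example gradients d/dw (w * x - y)^2 over a batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch)

# Hypothetical toy dataset of (x, y) pairs and an initial weight.
data = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 9.0)]
w = 0.5
lr = 0.01

# Reference: gradient of the mean loss over the full batch.
full_grad = grad_sum(w, data) / len(data)

# Gradient accumulation: visit micro-batches of size 2 and postpone the update.
accum = 0.0
for start in range(0, len(data), 2):
    accum += grad_sum(w, data[start:start + 2])
accum_grad = accum / len(data)

# One update from the accumulated gradient matches the full-batch update.
w_full = w - lr * full_grad
w_accum = w - lr * accum_grad
```

Because each example contributes its own independent term, the order and grouping of the micro-batches do not change the final gradient.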

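Contrastive learning, however, computes its loss across the whole batch, since the other examples in the batch typically serve as negatives. Combining losses that were computed separately on different batches therefore breaks the meaning of the contrastive objective. Contrastive learning needs its own counterpart to gradient accumulation, and Grad Cache is what provides it. https://seopbo.github.io/gradCache/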
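A minimal sketch of the idea, assuming a toy encoder with a single scalar weight and an in-batch softmax contrastive loss. This is not the API of the luyug/GradCache library; every name here (`encode`, `info_nce`, `info_nce_grads`, `grad_cache_step`) is hypothetical. It first shows that averaging chunk-wise losses differs from the full-batch loss, then recovers the full-batch gradient chunk by chunk via cached embedding gradients.

```python
import math

def encode(w, xs):
    """Toy encoder: scalar embedding w * x (hypothetical stand-in for a network)."""
    return [w * x for x in xs]

def info_nce(q, k):
    """In-batch contrastive loss: k[i] is the positive for q[i], other k[j] are negatives."""
    n, total = len(q), 0.0
    for i in range(n):
        logits = [q[i] * k[j] for j in range(n)]
        m = max(logits)
        total += m + math.log(sum(math.exp(l - m) for l in logits)) - logits[i]
    return total / n

def info_nce_grads(q, k):
    """Gradient of info_nce with respect to every embedding in q and k."""
    n = len(q)
    gq, gk = [0.0] * n, [0.0] * n
    for i in range(n):
        logits = [q[i] * k[j] for j in range(n)]
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        for j in range(n):
            coeff = (exps[j] / z - (1.0 if i == j else 0.0)) / n
            gq[i] += coeff * k[j]
            gk[j] += coeff * q[i]
    return gq, gk

def grad_cache_step(w, queries, keys, chunk):
    n = len(queries)
    # Step 1: gradient-free forward pass per chunk, keeping only the embeddings.
    q, k = [], []
    for s in range(0, n, chunk):
        q += encode(w, queries[s:s + chunk])
        k += encode(w, keys[s:s + chunk])
    # Step 2: full-batch loss on the embeddings; cache the gradient per embedding.
    gq, gk = info_nce_grads(q, k)
    # Step 3: revisit each chunk and push the cached gradients through the encoder.
    # For this toy encoder, d(w * x)/dw = x.
    grad_w = 0.0
    for s in range(0, n, chunk):
        for i in range(s, min(s + chunk, n)):
            grad_w += gq[i] * queries[i] + gk[i] * keys[i]
    return grad_w

# Hypothetical toy batch of 4 query/key pairs.
queries = [0.5, -1.0, 1.5, 2.0]
keys = [0.4, -0.8, 1.2, 2.5]
w = 0.7

def full_loss(w_):
    return info_nce(encode(w_, queries), encode(w_, keys))

# Naive accumulation: average the losses of two separate chunks (fewer negatives each).
naive_loss = 0.5 * (info_nce(encode(w, queries[:2]), encode(w, keys[:2]))
                    + info_nce(encode(w, queries[2:]), encode(w, keys[2:])))

# Finite-difference gradient of the true full-batch loss, used as a reference.
eps = 1e-6
numeric_grad = (full_loss(w + eps) - full_loss(w - eps)) / (2 * eps)
cached_grad = grad_cache_step(w, queries, keys, chunk=2)
```

In this sketch `naive_loss` is smaller than the full-batch loss because each chunk sees fewer negatives, while `cached_grad` agrees with the full-batch gradient even though the encoder only ever processes two examples at a time.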

ili0820 commented 2 years ago

Great work, everyone~

ili0820 / Boostcamp-AI-Casual-Talk

[202204020] Boostcamp-AI-Casual-Talk - Session 4 #5
