[20210606] Weekly AI ArXiv Talk

jungwoo-ha commented 3 years ago

AI News
- Interspeech 2021 notifications are out: congratulations to everyone.
- ICASSP 2021 (6.6 ~ 11): https://www.2021.ieeeicassp.org/TechnicalProgram.asp
- Wudao: 1.75T (MoE) multimodal pretrained model by BAAI: https://www.globaltimes.cn/page/202106/1225172.shtml
- KLUE Day
- SWE Social for enabling HyperCLOVA
- AI-powered lethal weapons deployed in the Libyan civil war: https://www.chosun.com/international/international_general/2021/06/05/Z5CY66TKTJFFBCZDB7K7GU4GXQ/
Arxiv
- ByT5: Towards a token-free future with pre-trained byte-to-byte models
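- Feeds the raw byte stream directly as input instead of using a tokenizer (which itself is trained on a bit of data).
- 1 byte = 256 embeddings --> the vocabulary shrinks dramatically, and the freed parameters go to the transformer body.
- Overall accuracy goes up while speed gets very slightly slower (because the sequences get longer?)
- Experiments build on mT5 and cover a variety of datasets.
- https://github.com/google-research/byt5 (includes model checkpoints)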
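
A minimal sketch of the byte-level input idea from the ByT5 notes above. The helper names are hypothetical, and the ID layout (3 reserved IDs for pad/EOS/unk, followed by the 256 byte values) is my assumption about the released vocabulary, not something stated in the notes.

```python
# Hypothetical helpers, not the repository's API.
# Assumed ID layout: pad=0, eos=1, unk=2, then one ID per byte value.
NUM_SPECIAL = 3
EOS_ID = 1

def encode(text: str) -> list[int]:
    """One ID per UTF-8 byte, shifted past the special IDs, plus EOS."""
    return [b + NUM_SPECIAL for b in text.encode("utf-8")] + [EOS_ID]

def decode(ids: list[int]) -> str:
    """Invert encode(), dropping the special IDs."""
    data = bytes(i - NUM_SPECIAL for i in ids if i >= NUM_SPECIAL)
    return data.decode("utf-8", errors="ignore")

ids = encode("만담")
print(len(ids))     # 2 Hangul syllables -> 6 bytes + EOS = 7 IDs
print(decode(ids))
```

No vocabulary file is needed, but non-ASCII text costs several IDs per character, which is one reason the sequences get longer.
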
- You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection
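- An object detector built with only minor changes to ViT (the CLS token in the input is replaced by 100 DET tokens, and a bipartite matching loss is used).
- Performance differs depending on the pretraining scheme, so they reportedly ran a range of comparison experiments..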
- https://github.com/hustvl/YOLOS
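
A rough sketch of the bipartite matching step behind the loss mentioned in the YOLOS notes above. The function name and the toy costs are hypothetical. It brute-forces the assignment for clarity, whereas DETR-style detectors typically solve the same problem with the Hungarian algorithm.

```python
from itertools import permutations

def match(cost):
    """Min-cost one-to-one assignment of ground-truth objects to predictions.

    cost[g][p] is the matching cost between ground truth g and prediction p.
    Assumes at least as many predictions as ground truths.
    Brute force: only sensible for tiny examples.
    """
    n_gt, n_pred = len(cost), len(cost[0])
    best_total, best_pairs = float("inf"), []
    for preds in permutations(range(n_pred), n_gt):
        total = sum(cost[g][p] for g, p in enumerate(preds))
        if total < best_total:
            best_total, best_pairs = total, list(enumerate(preds))
    return best_total, best_pairs

# 2 ground-truth boxes vs. 3 predictions (think: 3 DET tokens); toy costs.
cost = [
    [0.9, 0.1, 0.5],
    [0.2, 0.8, 0.3],
]
total, pairs = match(cost)
print(pairs)   # [(0, 1), (1, 0)]; prediction 2 is left unmatched
```

Predictions left unmatched would be trained toward a "no object" class, and the loss is computed only over the matched pairs.
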
- Noise Doesn't Lie: Towards Universal Detection of Deep Inpainting
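- An inpainting image detection model (IJCAI 2021).
- It can recover the mask of the modified region.
- Generates a dataset with inpainting models and uses it to train the proposed Noise-Image Cross-Fusion Network for mask detection.
- It does seem to detect well, but what happens when the edit is blending-style (as in StarGAN v2) rather than mask-style....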
- BERT meets LIWC: Exploring State-of-the-Art Language Models for Predicting Communication Behavior in Couples' Conflict Interactions
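- Predicting whether communication behavior is positive or negative when couples argue??
- This is a psychology topic, where LIWC (Linguistic Inquiry and Word Count) has reportedly been the main tool so far.
- The subjects are German-speaking Swiss couples...
- Using either TF-IDF or BERT reportedly gives more accurate predictions than the LIWC baseline... (so do we no longer need to say "see you in four weeks"..?)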
- Luna: Linear Unified Nested Attention
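- Yet another way to bring the quadratic cost of MHSA down to linear (from USC + CMU + Facebook AI).
- Splits self-attention into two nested attentions and pulls the positional embeddings out into a separate fixed-length query sequence.
- Speed looks similar to Performer, but it is reportedly more effective on long sequences and uses clearly less memory.
- Accuracy-type metrics are quite competitive.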
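
A toy sketch of the nested-attention idea in the Luna notes above, assuming the structure "pack the n tokens into l fixed slots, then let the tokens attend to those slots". The function names are hypothetical, and the learned projections, multiple heads, and normalization layers of a real model are all omitted.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attend(queries, keys, values):
    """Plain dot-product attention; computes len(queries) * len(keys) scores."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out

def luna_like(x, p):
    """Two nested attentions: l*n scores to pack, n*l scores to unpack."""
    packed = attend(p, x, x)               # fixed-length summary of x
    unpacked = attend(x, packed, packed)   # every token reads the summary
    return unpacked, packed

x = [[0.1 * i, 1.0 - 0.1 * i] for i in range(6)]   # n = 6 tokens, d = 2
p = [[1.0, 0.0], [0.0, 1.0]]                        # l = 2 extra slots
y, new_p = luna_like(x, p)
print(len(y), len(new_p))   # 6 2
```

With l fixed, the number of attention scores is 2·n·l instead of n², which is where the linear scaling in sequence length comes from.
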
- Men Are Elected, Women Are Married: Events Gender Bias on Wikipedia
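- A corpus bias analysis centered on events involving famous people in Wikipedia articles (ACL 2021).
- One of several recent efforts to analyze bias in training corpora; worth a look for anyone researching related topics such as analysis protocols.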
- NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-Based Simulation
- A model-based simulation method for generating task-oriented dialogue data.
- Consists of a Collector (creates data from NL goal instructions and API calls) and a Labeler (annotates with a PLM).
- Shows strong performance on MultiWOZ zero-shot.
- https://github.com/naver-ai/neuralwoz (to be opened this week; code is under review)
- E2E-VLP: End-to-End Vision-Language Pre-training Enhanced by Visual Learning
- E2E visual-linguistic pretraining (prior works are mostly multi-stage) (ACL 2021, from Alibaba).
- Pretraining tasks include MLM, object detection, attribute prediction, caption generation, etc.
- Pretrained on COCO and Visual Genome, with VQA and others as downstream tasks.
- Defending against Backdoor Attacks in Natural Language Generation
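- In NLG, especially dialogue and translation, backdoor attacks are possible by, for example, planting trigger words in the training data.
- Also proposes methods to detect and defend against them.
- They even had the sense to mark the malicious expressions separately in red.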
- LyricJam: A system for generating lyrics for live instrumental music
- A system that spits out lyrics in real time given a live audio stream as input!
- 1) adversarial alignment of latent representations of audio and lyrics 2) to transfer the topology from the music latent space to the lyric latent space.
- Ranking is reportedly done with BERT.

nick-jhlee commented 3 years ago

Transformers are Deep Infinite-Dimensional Non-Mercer Binary Kernel Machines
- UC Berkeley
- (Mathematically) why exactly do transformers work so well??
- dot-product attention <=> kernel learning on a pair of Banach spaces
- The transformer's kernel is actually based on infinite-dimensional features! <-- maybe this is why it works so well...?
- A new representer theorem and a new universal approximation theorem
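
To make the kernel view in the notes above a bit more concrete, here is a rough sketch in my own notation (an assumption, not the paper's exact formulation). Softmax attention for one query can be read as a normalized kernel smoother, and the exponential kernel expands into monomials of every degree, which is one way to see the "infinite-dimensional feature" point.

```latex
% Attention for one query q over keys k_1..k_n and values v_1..v_n
\kappa(q, k) = \exp\!\left(\frac{q^\top k}{\sqrt{d}}\right), \qquad
\mathrm{Attn}(q) = \sum_{j=1}^{n} \frac{\kappa(q, k_j)}{\sum_{m=1}^{n} \kappa(q, k_m)}\, v_j

% Taylor expansion: polynomial terms of every degree p appear
\exp(x^\top y) = \sum_{p=0}^{\infty} \frac{(x^\top y)^p}{p!}
```

Since queries and keys come from different learned projections, the kernel taken as a function of the original inputs need not be symmetric, which is how I read the "non-Mercer" part of the title.
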
Gotta Go Fast When Generating Data with Score-Based Models
- University of Montreal, Simon Fraser University, Radboud University & Donders Institute
- Score-based generative modeling is too slow,,,,
- Why? Because numerical SDE solvers (e.g. Euler-Maruyama) are too slow!
- Proposes a new SDE solver!
- Image quality is the same or better, while being 2~10x faster!
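
For reference, a minimal sketch of the fixed-step Euler-Maruyama baseline named in the notes above, run on a toy 1-D SDE. This is not the paper's solver; the function name and the Ornstein-Uhlenbeck example are my own choices for illustration.

```python
import math
import random

def euler_maruyama(x0, drift, diffusion, t0, t1, n_steps, rng):
    """Integrate dx = drift(x, t) dt + diffusion(t) dW with a fixed step (t1 > t0)."""
    dt = (t1 - t0) / n_steps
    x, t = x0, t0
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + drift(x, t) * dt + diffusion(t) * math.sqrt(dt) * noise
        t += dt
    return x

rng = random.Random(0)
# Toy Ornstein-Uhlenbeck SDE: dx = -x dt + sqrt(2) dW, stationary law N(0, 1).
samples = [
    euler_maruyama(3.0, lambda x, t: -x, lambda t: math.sqrt(2.0), 0.0, 5.0, 500, rng)
    for _ in range(2000)
]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 2), round(var, 2))   # should land near 0 and 1
```

In a score-based model the drift includes a neural network evaluation, so every one of these small fixed steps costs a forward pass; that is the slowness the notes refer to.
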
Are Convolutional Neural Networks or Transformers more like human vision?
- Princeton, Google DeepMind
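- It has been a while since Transformers overtook CNNs, but which one actually works more like humans?
- ViT shows error patterns that are more similar to humans' and exhibits a stronger shape bias.
- (Prior work found that CNNs classify based on texture rather than shape: Baker et al., 2019)
- Conclusion: ViT has higher accuracy than CNNs and is also more human-like.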
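
One common way to quantify "similar error patterns" is a Cohen's-kappa-style error consistency score over per-trial correctness. The sketch below assumes that metric; the notes above do not say exactly which measure the paper reports, and the data here is made up.

```python
def error_consistency(correct_a, correct_b):
    """Cohen's kappa over per-trial correctness (1 = correct, 0 = error).

    Compares observed agreement with the agreement expected by chance
    from the two accuracies alone. Undefined if both are always correct
    or both always wrong (expected agreement of 1).
    """
    n = len(correct_a)
    p_a = sum(correct_a) / n
    p_b = sum(correct_b) / n
    observed = sum(a == b for a, b in zip(correct_a, correct_b)) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)

# Made-up correctness vectors on the same 8 images.
human = [1, 1, 0, 0, 1, 0, 1, 1]
model = [1, 1, 0, 1, 1, 0, 1, 0]
print(round(error_consistency(human, model), 3))   # 0.467
```

A score of 0 means the two make errors on the same images no more often than their accuracies alone would predict; 1 means they fail on exactly the same images.
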
Implicit Representations of Meaning in Neural Language Models
- MIT
- (I haven't read this one yet, so an explanation would be appreciated.... hahaha)

hollobit commented 3 years ago

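Can a patent application list an artificial intelligence (AI) as the inventor? https://www.epnc.co.kr/news/articleView.html?idxno=209585

[AI] Growing 14% a year on average... yet AI companies face a severe hiring shortage https://www.edaily.co.kr/news/read?newsId=01853206629079096&mediaCodeNo=257

2020 Artificial Intelligence Industry Survey Report https://spri.kr/posts/view/23214?code=research

China's first deep learning 'cyber student' is born, a big stride in the AI race with the US https://www.hankookilbo.com/News/Read/A2021060416260001864

Liquid AI brains becoming a reality... the challenge of developing more flexible AI http://www.aitimes.com/news/articleView.html?idxno=138840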

Inside the $5 Million Competition Defining the Future of Artificial Intelligence https://www.wired.com/sponsored/story/inside-the-dollar5-million-competition-defining-the-future-of-artificial-intelligence/

nick-jhlee commented 3 years ago

AI-DLDA 2021 International Summer School on A.I.

Congratulations to Dr. Jung-Woo Ha on being a speaker, haha,,,,

jwlee-ml commented 3 years ago

https://clublink.to/event/mJo8VqGO?ref=fb

We're holding PR-12 in Clubhouse #2. The plan is to go over the main threads of self-supervised learning so far and talk about them freely!! Let's chat on Clubhouse on June 9 at 10 PM 🙂

Clyde21c commented 3 years ago

https://trajectory-transformer.github.io/

'Trajectory Transformer'
UC Berkeley, Sergey Levine
Uses a Transformer to learn the world model in model-based RL --> enables more accurate long-term prediction than conventional single-step prediction
A research group from UC Berkeley, Facebook AI Research, and Google Brain that includes Pieter Abbeel announced a similar approach one day earlier
'Decision Transformer: Reinforcement Learning via Sequence Modeling' https://sites.google.com/berkeley.edu/decision-transformer
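
A small sketch of what "RL as sequence modeling" can look like in practice, assuming the (return-to-go, state, action) token layout as I understand it from Decision Transformer. The function names and the toy trajectory are hypothetical, and the Transformer itself is omitted.

```python
def returns_to_go(rewards):
    """Suffix sums: rtg[t] = rewards[t] + rewards[t+1] + ... (undiscounted)."""
    rtg, running = [], 0.0
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    return rtg[::-1]

def to_sequence(states, actions, rewards):
    """Interleave (return-to-go, state, action) per timestep into one flat sequence."""
    seq = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        seq += [("rtg", g), ("state", s), ("action", a)]
    return seq

# Toy 3-step trajectory.
states = ["s0", "s1", "s2"]
actions = ["left", "right", "right"]
rewards = [1, 0, 2]
print(returns_to_go(rewards))                  # [3.0, 2.0, 2.0]
print(to_sequence(states, actions, rewards)[:3])
```

A sequence model trained on such streams can then be conditioned on a desired return and asked to predict the next action token autoregressively.
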

jungwoo-ha / WeeklyArxivTalk

[20210606] Weekly AI ArXiv Talk #12