[20210711] Weekly AI ArXiv Banter

jungwoo-ha / WeeklyArxivTalk

[Zoom & Facebook Live] Weekly AI Arxiv Season 2

[20210711] Weekly AI ArXiv Banter #16

Closed jungwoo-ha closed 3 years ago

jungwoo-ha commented 3 years ago

AI News
- NeurIPS 2021 review deadline coming soon (let's work hard)
- As a bonus.. rumor has it that NeurIPS 2022 will be held in New Orleans?
- Good luck with the EMNLP 2021 rebuttals
- Joint webinar by KOFST, KSEA, and the AI Future Forum: CV in smart cities, healthcare, NLP, Hyperscale AI, etc. (https://www.youtube.com/watch?v=JdaA9aIEtpU)
AI ArXiv
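- Improved Transformer for High-Resolution GANs
  - Builds the GAN generator (or decoder) on a transformer-based architecture
  - Does not use self-attention (SA) as is; for the sake of quality (local information matters), generation is split into 2 phases
    - phase 1: z --> low resolution uses a Nested transformer,
    - phase 2: low res --> high res replaces SA with heavy use of MLPs in the style of neural implicit representations
  - Both phases use cross-attention between the latent (Q) and the patch inputs (K, V).
  - Compared with StyleGAN2, quality looks decent, efficiency looks decent, and it seems to handle general images such as ImageNet as well?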
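- Evaluating Large Language Models Trained on Code
  - The talk of the town: GitHub Copilot by MS + OpenAI
  - Codex, the 12B GPT-3 model fine-tuned for code generation that powers Copilot!
  - Covers not only human evaluation but also experiments and considerations from a wide range of angles relevant to code generation
  - That said.. there are rumors that the licenses of the source code used for training were not properly checked....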
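- A Survey on Data Augmentation for Text Classification
  - A survey paper on data augmentation research for text classification
  - Feels well organized up through fairly recent work (though it stops short of approaches that lean heavily on GPT-3)
  - Could be useful for professors preparing NLP course materials?
- Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling
  - A study (from Facebook) that lays out a protocol for checking safety-related issues before releasing an E2E conversational AI service
  - Seems to organize fairly well how to check and evaluate various kinds of malicious use in advance
  - A must-read for companies preparing conversational AI as a commercial service, and likely useful for people researching dialogue models too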
- Kosp2e: Korean Speech to English Translation Corpus
  - A Korean-to-English speech-to-text translation dataset built jointly by Seoul National University and Naver Papago
  - Pronounced like KOSPI (KOSPI to 5,000, let's go!)
  - A proper introduction session is reportedly planned separately
  - https://github.com/warnikchow/kosp2e
- Edinburgh Pig Behavior Video Dataset
  - Pig behavior video data from the U of Edinburgh (23 days)
  - BBox, tracking identifiers, 8 pigs, 7200 labeled frames
  - Agriculture and livestock + AI
  - Paper: https://homepages.inf.ed.ac.uk/rbf/PAPERS/pigsbehaviouranalysis_visapp2021.pdf

Kyung-Min commented 3 years ago

Paper

Pre-trained Language Model for Web-scale Retrieval in Baidu Search
Pre-trained Language Model based Ranking in Baidu Search
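- Search models that Baidu presented at this year's KDD; they state the models are currently used in production
- Applied to both the retrieval and ranking stages
- In addition to existing retrieval models such as text matching, a 6-layer Transformer (ERNIE) performs query2doc (a hybrid model) (ERNIE is used for both retrieval and ranking)
- ERNIE is a model Baidu released in 2019; it was promoted at the time as beating Google's BERT on Chinese NLP tasks
- For serving, document embeddings are precomputed and retrieved via ANN; as post-processing, the matching score, CTR, and similar signals are used as features to score the candidates once more (using RankSVM, etc.)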
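The serving flow described above (precomputed document embeddings, nearest-neighbour retrieval, then a second scoring pass over extra features) can be sketched roughly as follows. This is only an illustration: exhaustive cosine search stands in for a real ANN index, a fixed linear scorer stands in for RankSVM, and every name, vector, and feature value is hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, doc_index, k):
    """Stage 1: top-k documents by embedding similarity.

    Exhaustive search is used here in place of an approximate index.
    """
    ranked = sorted(doc_index, key=lambda d: cosine(query_vec, doc_index[d]), reverse=True)
    return ranked[:k]


def rerank(candidates, features, weights):
    """Stage 2: linear re-scoring over per-document features."""
    def score(doc_id):
        return sum(weights[name] * value for name, value in features[doc_id].items())
    return sorted(candidates, key=score, reverse=True)


# Hypothetical precomputed document embeddings and post-processing features.
doc_index = {"d1": [1.0, 0.0], "d2": [0.8, 0.6], "d3": [0.0, 1.0]}
features = {
    "d1": {"matching": 0.9, "ctr": 0.1},
    "d2": {"matching": 0.7, "ctr": 0.8},
    "d3": {"matching": 0.1, "ctr": 0.9},
}
weights = {"matching": 1.0, "ctr": 1.0}   # a learned ranker would supply these

candidates = retrieve([1.0, 0.0], doc_index, k=2)
print(candidates)                             # ['d1', 'd2']
print(rerank(candidates, features, weights))  # ['d2', 'd1']
```

The split keeps the expensive model off the query path for documents: only the query is embedded at request time, and the second stage touches just the k retrieved candidates.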

News

AI song contest
- Team M.O.G.I.I.7.E.D., made up of AI researchers, musicians, and others, won with Listen To Your Body Choir, created using SampleRNN (audio), GPT-2 (lyrics), and Melody-RNN (melody)
- The goal is human-AI co-creation, so human voices are present as well
- There is also a song called Han:한 by a Korean team, made with Melody RNN, MusicVAE, and improvRNN
- Rubato Lab also made a song called Daybreak (Music GPT2 to create bass lines and LSTM VAE to create drum beats)

ghlee3401 commented 3 years ago

Paper

DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling
- Accepted by ACL 2021 main conference
- Problem: previous rap generation considered only rhyming lyrics and ignored rhythm
- Contribution: proposes a Transformer-based rap generation system that can control both rhyme and rhythm
- Method
  1. Rap Dataset Mining: builds a rap dataset that contains lyrics and beats
  2. Rap Generation Model: built on a Transformer-based autoregressive model, it 1) generates each sentence from right to left, because the rhyme word always sits at the end of the sentence, 2) uses beat tokens, and 3) uses additional embeddings on top of the positional embedding
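The right-to-left trick in step 2 can be sketched as a preprocessing step: reversing each line puts the rhyme word first, so an autoregressive model would commit to it before anything else. The helper names and the made-up English lines below are assumptions for illustration; the paper's actual tokenization is not reproduced here.

```python
def to_right_to_left(lines):
    """Reverse the word order of each line so the rhyme word comes first."""
    return [list(reversed(line.split())) for line in lines]


def to_left_to_right(token_lines):
    """Undo the reversal to recover readable lyrics."""
    return [" ".join(reversed(tokens)) for tokens in token_lines]


lyrics = ["keep it moving to the beat", "never gonna take a seat"]  # made-up lines
reversed_lines = to_right_to_left(lyrics)
print([tokens[0] for tokens in reversed_lines])    # ['beat', 'seat']
print(to_left_to_right(reversed_lines) == lyrics)  # True
```

With this ordering, a rhyme constraint only has to be applied to the first generated token of each line instead of being enforced at an unknown final position.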
AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style
- Sample : https://speechresearch.github.io/adaspeech3/
- Accepted by INTERSPEECH 2021
- Problem: previous TTS systems model reading-style speech well but still fall short on spontaneous speech
  1. Spontaneous speech datasets are scarce
  2. Sounds such as um and uh and the varied rhythms of spontaneous speech are hard to model
- Contribution: fine-tunes a TTS model well trained on a reading-style DB so that it takes on a spontaneous style
- Method
  1. Inserts filled pauses (FP), such as uh and um, into the text sequence; a predictor in the TTS model predicts them
  2. To learn varied rhythms, the duration predictor predicts fast, medium, and slow speech
  3. To adapt to another speaker's timbre, fine-tunes the decoder with a small amount of data

nick-jhlee commented 3 years ago

Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks
- University of Maryland (with Prof. Tom Goldstein), with high probability under review at NeurIPS 2021
- Back to the beginnings of AI! How do people think??
  - Represent the given problem in the "mind", then
  - apply iterative transformations until it's solved! (~ RNN)
- Here, the number of iterations is the number of "repetition of the recurrent residual block"!
- After training the RNN on easy problems, letting it think longer on hard problems (i.e. more iterations at test time) makes it behave as if it had learned an algorithm! (~ extrapolation)
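To make the "more iterations at test time" idea concrete, here is a toy sketch in which a single fixed update rule is applied repeatedly, the way a weight-shared recurrent block would be. The rule is hand-written rather than learned, and the 1-D corridor task and all names are hypothetical; the point is only that the same rule solves a harder instance when it is iterated more often.

```python
def spread_once(reached, walls):
    """One pass of the shared rule: reach any open cell next to a reached cell."""
    updated = list(reached)
    for i, is_wall in enumerate(walls):
        if is_wall:
            continue
        left = i > 0 and reached[i - 1]
        right = i + 1 < len(reached) and reached[i + 1]
        if left or right:
            updated[i] = True
    return updated


def think(walls, start, iterations):
    """Apply the same rule `iterations` times, like repeating a recurrent block."""
    reached = [False] * len(walls)
    reached[start] = True
    for _ in range(iterations):
        reached = spread_once(reached, walls)
    return reached


easy = [False] * 3    # short corridor, no walls
hard = [False] * 8    # longer corridor, same rule
print(think(easy, 0, 2)[-1])  # True: 2 iterations reach the end of the easy case
print(think(hard, 0, 2)[-1])  # False: too few iterations for the hard case
print(think(hard, 0, 7)[-1])  # True: more iterations at test time solve it
```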
Rissanen Data Analysis: Examining Dataset Characteristics via Description Length
- NYU, FAIR + Prof. Cho (ICML 2021)
- (This one looked hard, so for now I just added it to the list...)
Comparing Test Sets with Item Response Theory
- Amazon, NYU, Allen Institute for AI + Prof. Cho (ACL 2021)
- There are too many NLP datasets...! ==> experiments show that some of them barely separate large pretrained models from one another...
- Methodology: Item Response Theory (IRT)
  - The probability that a model answers a test example correctly is modeled from the model's latent variable and the example's difficulty, discrimination ability, and guessing parameters
  - Fitted with variational inference
- Results on 29 NLP datasets (classification, multiple-choice QA, span-selection QA)...
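The four quantities listed under the methodology match the standard three-parameter logistic (3PL) item response function, sketched below. Whether the paper uses exactly this parameterization is an assumption, and the parameter values in the example are made up.

```python
import math


def p_correct(theta, beta, alpha, gamma):
    """3PL item response function.

    theta: latent ability of the responder (here: a model)
    beta:  difficulty of the item (here: a test example)
    alpha: discrimination of the item
    gamma: guessing parameter (lower asymptote)
    """
    return gamma + (1.0 - gamma) / (1.0 + math.exp(-alpha * (theta - beta)))


# Hypothetical item: 4-way multiple choice, so a guessing floor of 0.25.
for ability in (-2.0, 0.0, 2.0):
    print(ability, round(p_correct(ability, beta=0.0, alpha=1.5, gamma=0.25), 3))
```

An item with a small `alpha` yields nearly the same probability for weak and strong models, which is the sense in which a test example fails to discriminate between them.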
Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization
- NYU, Salesforce Research, Université de Montréal + Prof. Cho (ICML 2021)
- A theoretical explanation of the early phase of DNN training!
- Many previous works analyze the final phase of training!
- "implicit regularization effects due to using a large learning rate can be explained by its impact on the trace of the FIM (Tr(F)), a quantity that reflects the local curvature, from the beginning of training." (FIM: Fisher Information Matrix)
- This is shown empirically:
  - Explicitly penalizing Tr(F) improves generalization considerably
  - When generalization is poor, Tr(F) is observed to reach large values early in training ==> "catastrophic Fisher explosion"
  - Explicitly penalizing Tr(F) also removes much of the memorization effect, i.e. noisy labels are learned more slowly than clean labels
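As a rough illustration of what "explicitly penalizing Tr(F)" could look like, the sketch below computes the Fisher trace in closed form for binary logistic regression and adds it to the loss. The choice of logistic regression, the `strength` coefficient, and the toy data are assumptions for illustration; the paper works with deep networks and its estimator is not reproduced here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fisher_trace(weights, inputs):
    """Tr(F) for binary logistic regression, in closed form.

    For one input x, the gradient of log p(y|x) w.r.t. the weights is (y - p) * x,
    so its expected squared norm under y ~ p(y|x) is p * (1 - p) * ||x||^2.
    """
    total = 0.0
    for x in inputs:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        total += p * (1.0 - p) * sum(xi * xi for xi in x)
    return total / len(inputs)


def penalized_loss(weights, inputs, labels, strength):
    """Cross-entropy plus strength * Tr(F); `strength` is a hypothetical knob."""
    nll = 0.0
    for x, y in zip(inputs, labels):
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        nll -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return nll / len(inputs) + strength * fisher_trace(weights, inputs)


xs = [[1.0, 2.0], [2.0, -1.0]]   # toy inputs
ys = [1, 0]                      # toy labels
print(fisher_trace([0.0, 0.0], xs))            # 1.25
print(penalized_loss([0.0, 0.0], xs, ys, 0.1))
```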
Improved Regret Bounds of Bilinear Bandits using Action Space Analysis
- KAIST, University of Arizona + Prof. Yun (ICML 2021)
- Considers the bilinear bandit, where an arm is chosen from each of 2 different action spaces!
- Disproves a conjecture on the optimal regret rate! (a better bound)
- Intuition: rewrite the bilinear bandit and interpret it as a linear bandit over the rank-1 matrix manifold!
- Devises an algorithm (eps-FALB) that achieves this bound
- Also devises rO-UCB, which can be used in practice (<- much better performance empirically)
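The "intuition" bullet rests on a simple identity: the bilinear reward x^T Θ z equals the inner product between the rank-1 matrix x z^T and Θ, so choosing a pair of arms is the same as choosing one rank-1 matrix as the arm of a linear bandit. A small numeric check, with a made-up parameter matrix and arms:

```python
def bilinear_reward(x, theta, z):
    """Expected reward x^T Theta z for a left arm x and a right arm z."""
    return sum(x[i] * theta[i][j] * z[j] for i in range(len(x)) for j in range(len(z)))


def outer(x, z):
    """Rank-1 matrix x z^T."""
    return [[xi * zj for zj in z] for xi in x]


def frobenius_inner(a, b):
    """Inner product <A, B> = sum_ij A_ij * B_ij."""
    return sum(a[i][j] * b[i][j] for i in range(len(a)) for j in range(len(a[0])))


theta = [[1.0, 2.0], [0.0, -1.0]]   # hypothetical unknown parameter matrix
x, z = [1.0, 3.0], [2.0, 1.0]       # one arm from each action space
print(bilinear_reward(x, theta, z))         # 1.0
print(frobenius_inner(outer(x, z), theta))  # 1.0
```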

nick-jhlee commented 3 years ago

SSBSE 2021 results are out...! (for SE people)

jnhwkim commented 3 years ago

@nick-jhlee You mean NeurIPS 2021, right?

nick-jhlee commented 3 years ago

> @nick-jhlee You mean NeurIPS 2021, right?

(Oh, you're right,, thank you haha)

hollobit commented 3 years ago

Rep. Jung Pil-mo's 'AI Act bill' would make explanations of AI decisions mandatory for portals and healthcare http://www.aitimes.com/news/articleView.html?idxno=139381

Bill on Fostering Artificial Intelligence and Establishing a Foundation of Trust, etc. - Rep. Jung Pil-mo and others (23 members in total), No. 2111261 (2021. 7. 1.), 388th National Assembly (extraordinary session) https://opinion.lawmaking.go.kr/gcom/nsmLmSts/out/2111261/detailRP
Status of AI-related bills https://opinion.lawmaking.go.kr/gcom/nsmLmSts/out?sortCol=&sortOrder=&sugCd=21&sgtCls=&cptOfiOrgCd=&searchStDtNew=&searchEdDtNew=&rslRsltNmL=&rslRsltNmR=&scCptPpostCmt=&scPpsUsr=&stDtFmt=&edDtFmt=&scBlNm=scBlNm_blNm&scBlNmSct=%EC%9D%B8%EA%B3%B5%EC%A7%80%EB%8A%A5

The kings of AI in one place! Yoshua Bengio, Geoffrey Hinton, and Yann LeCun publish a joint paper http://www.aitimes.com/news/articleView.html?idxno=139428&fbclid=IwAR3NHTmj-1x5wLRxHf9b88kw3g86kXEBDUV-sqjHsf8CsUVFmxwqk9fQERI

What OpenAI and GitHub’s ‘AI pair programmer’ means for the software industry

https://venturebeat.com/2021/07/06/what-openai-and-githubs-ai-pair-programmer-means-for-the-software-industry/

Google’s Supermodel: DeepMind Perceiver is a step on the road to an AI machine that could process anything and everything https://www.zdnet.com/article/googles-supermodel-deepmind-perceiver-is-a-step-on-the-road-to-an-ai-machine-that-could-process-everything/

Perceiver: General Perception with Iterative Attention. https://arxiv.org/abs/2103.03206

Google Open-Sources Token-Free Language Model ByT5

https://www.infoq.com/news/2021/07/google-byt5-nlp/

AI voice actors sound more human than ever—and they’re ready to hire

https://www.technologyreview.com/2021/07/09/1028140/ai-voice-actors-sound-human/

How a New AI Mindset for AutoML Will Make Deep Learning More Accessible

https://insidebigdata.com/2021/07/08/how-a-new-ai-mindset-for-automl-will-make-deep-learning-more-accessible/

nick-jhlee commented 3 years ago

Deep Learning for AI

Turing Lecture by the big three.. (Bengio, LeCun, Hinton)! @ Communications of the ACM
I recommend watching the video! Below is a summary of the video (@hollobit <- this person's explanation will be far more insightful, heh)
Purpose of this paper: right direction for making progress in AI? (LeCun)
- learn more like humans and animals
- learn to reason
- perceive in a robust way
Symbolic AI vs Neural nets (Hinton)
- Symbolic AI: internal representation is symbolic expressions; how to reason with them?
- NN: internal representation is a big vector of neurons; how to learn? how to rewire neurons?
Weakness of SOTA (Bengio)
- generalization in the novel situation...
- (attention...etc.)
Deep Learning is okay for system 1, but not system 2.... (LeCun)
- "machine should learn a model of the world that can predict what's gonna happen in the world as a consequence of its action"
How can a neural network (of fixed architecture) model compositional structure?? (Hinton)
- We need to understand how neural networks do that without dynamically reallocating neurons.
- How people model compositional structure differs from how neural networks do it...!
Are there problems in which deep learning fails? (Bengio)
- Actually, no!! (~~All hail deep learning~~)
- Extend neural network to be more structured
- COMBINE symbolic AI and NN!
- symbolic AI: inductive biases
- NN: large scale, end-to-end
Deep learning will never die (Hinton)
- BUT, understanding how to make NN more effective...
- Radical new ideas needed!
Common sense for machines? (LeCun)
- Connection of models of the world!
- How can machines acquire those models to do reasoning...?
- Can machines have human-level intelligence?

nick-jhlee commented 3 years ago

2021 Korean Artificial Intelligence Association Summer Conference (07.08~07.09)

Sergey Levine & Stephen Boyd!
tutorials, sessions...etc.

veritas9872 commented 3 years ago

This paper is a little old, but I'm sharing it to mark the debut of medical AI papers here.

nnDetection: A Self-configuring Method for Medical Object Detection Arxiv: https://arxiv.org/abs/2106.00817 GitHub: https://github.com/MIC-DKFZ/nnDetection

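This paper is the follow-up to nnUNet, which was published in Nature Methods.

The same holds for natural images, but in medical imaging the datasets are small, characteristics differ greatly across modalities, and in 3D data the physical distance between pixels differs from image to image. As a result, each modality and dataset used to require a lot of custom engineering and know-how tailored to the data. nn (no-new) UNet instead standardizes the model architecture on UNet, fits the training setup to the data parameters (called a fingerprint) so that training goes well, and in practice placed near the top of several challenges with those parameters left as they were. The original nnUNet was for segmentation, but in medical imaging detection is sometimes as important as segmentation or more so (for example, finding nodules in the lung), so this work presents a framework specialized for detection.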

Clyde21c commented 3 years ago

RMA: Rapid Motor Adaptation for Legged Robots
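- Quadruped locomotion RL research from FAIR & BAIR & CMU (RSS 2021)
- For sim-to-real to be usable in practice, fast adaptation in under 1 second is needed
- RL + an adaptation module (playing the role of system identification) are trained in simulation; deployed directly on the real robot without fine-tuning, the policy shows decent performance even in environments it has never seen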
- Env Randomization + context-based Meta RL
- https://fb.watch/v/3k6-z2ppT/
MedGPT: Medical Concept Prediction from Clinical Narratives
- Uses GPT2 to predict disease likelihood from EHR data
- Uses documents from 600,000 patients at King’s College Hospital dating back to 1999
- Analyzes the basis for its predictions