jungwoo-ha commented 2 years ago

AI News
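- Facebook blocks NYU researchers' accounts amid a dispute over the use of platform data for research
- DEVIEW 2021 call for speakers
  - Open until August 25
  - It will be held online again this time (expected in November).
  - Hands-on sessions are very welcome too.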
ArXiv
- Perceiver IO: A General Architecture for Structured Inputs & Outputs
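  - Perceiver (ICML 2021) trains on diverse modalities with a single customized self-attention-based architecture
  - The original model could only emit simple outputs, such as a classification label or a score, on top of the encoder
  - This version strengthens the decoding side so that the model scales linearly with the size of the output as well as the input
  - It can therefore be applied to tasks whose outputs are far more complex, high-dimensional vectors
  - The work shows DeepMind's determination to move toward AGI
  - Code: https://github.com/deepmind/deepmind-research/tree/master/perceiver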
- Neural Scene Decoration from a Single Photograph
  - Generates indoor interiors and decoration (from HKUST)
  - Empty scene + point-level object labels --> indoor scene generation
  - Compared against customized versions of SPADE and BachGAN
- Transfer Learning for Pose Estimation of Illustrated Characters
  - Pose estimation for people in animation and illustrations (from UMCP)
  - Mask R-CNN-based pose estimator + ResNet-50 with second-stage pretraining as a Danbooru tagger + additional modules
- Sketch Your Own GAN
  - Pretrained GAN + user sketches --> customize the GAN, then generate images (from Jun-Yan Zhu's group at CMU)
  - Pretrained GAN + pretrained photo2sketch
  - Project page: https://peterwang512.github.io/GANSketching/
  - Github: https://github.com/peterwang512/GANSketching
- The AI Economist: Optimal Economic Policy Design via Two-level Deep Reinforcement Learning
  - RL for designing economic policy and economic mechanisms? (from Salesforce, You.com, R. Socher)
  - Economic simulation using two-level RL
- A Low Rank Promoting Prior for Unsupervised Contrastive Learning
  - An improved MoCo that uses the nuclear norm as a low-rank prior
  - The implementation is very simple (though the nuclear norm must be computed every time)
  - Makes heavy use of multi-view + multi-crop, like SwAV
  - Matches BYOL with 200 epochs and batch size 256 (ImageNet-1k, ResNet-50)
  - Like SwAV, it needs many forward passes. Does it still work with larger models and larger uncurated data?
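
As a rough illustration of the nuclear-norm bullet above, the sketch below computes the nuclear norm (the sum of singular values) of a small matrix of view embeddings with NumPy. The function name, the toy embeddings, and the idea of stacking one row per augmented view are assumptions for illustration; this is not the paper's actual loss.

```python
import numpy as np

def nuclear_norm(embeddings: np.ndarray) -> float:
    """Nuclear norm of a 2-D matrix: the sum of its singular values."""
    singular_values = np.linalg.svd(embeddings, compute_uv=False)
    return float(singular_values.sum())

# Hypothetical unit-norm embeddings of three augmented views of one image.
unit = np.array([0.6, 0.0, 0.8])
aligned = np.tile(unit, (3, 1))   # all views agree -> rank-1 matrix
spread = np.eye(3)                # views point in orthogonal directions -> rank 3

aligned_norm = nuclear_norm(aligned)
spread_norm = nuclear_norm(spread)
print(aligned_norm, spread_norm)
```

Both matrices have the same Frobenius norm, yet the rank-1 one has the smaller nuclear norm, which is why penalizing the nuclear norm pushes the views of one image toward a low-rank (mutually consistent) subspace. The SVD on every step is also the extra cost the bullet above refers to.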

ghlee3401 commented 2 years ago

Arxiv

Musical Speech: A Transformer-based Composition Tool
- Sample : https://youtu.be/IjTnt_MP86M
- Abstract : generates a musical outline of recorded speech for use as a musical building block in composition
- Contribution
  1. Converts speech to music without a parallel dataset
  2. Defines a new set of musical composition tasks for denoising and gap-filling
  3. A tool that anyone can use online
Sinsy: A Deep Neural Network-Based Singing Voice Synthesis System
- Sample : https://www.sp.nitech.ac.jp/~hono/demos/taslp2021/
- Conference : IEEE/ACM Transactions on Audio, Speech and Language Processing
- Method
  1. DNN-based SVS, PeriodNet (neural vocoder)
  2. Uses four modules: 1) time-lag model 2) duration model 3) acoustic model 4) vocoder
  3. The time-lag model learns the actual phoneme timing extracted with an HSMM rather than the phoneme timing implied by the notes
  4. The acoustic model incorporates the notion of vibrato
Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis
- Abstract : A paper from Ubisoft. The proposed Deep affine transformations for Expressive prosody transfer (Daft-Exprt) is a multi-speaker acoustic model capable of prosody transfer
- Sample : https://ubisoft-laforge.github.io/speech/daft-exprt/
- Problem
  1. A global prosodic representation does not allow fine-grained control, and speaker and prosody information are entangled in it, which hinders prosody transfer
  2. Models that predict local prosodic features usually rely on concatenation-based conditioning, whereas conditional normalization has shown good results across many domains
- Method & Contribution
  1. Based on FastSpeech 2
  2. Uses FiLM (FiLM: Visual Reasoning with a General Conditioning Layer) to enable fine-grained prosody transfer
  3. Shows that adversarial training can disentangle speaker identity information from the latent prosodic attributes
  4. Achieves SOTA inter-speaker and inter-text prosody transfer using highly expressive data
- Related Work
  1. Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning : gradient reversal with speaker classifier
  2. AdaSpeech: Adaptive Text to Speech for Custom Voice : uses conditional layer norm with scale and bias factors for adaptation
  3. FastSpeech 2: Fast and High-Quality End-to-End Text to Speech : uses externally obtained ground-truth durations
  4. Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling : predicts durations
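
The FiLM conditioning listed under Method & Contribution can be sketched roughly as follows: a conditioning vector (standing in here for a prosody embedding) is projected to a per-channel scale `gamma` and shift `beta`, which then modulate the hidden features. The function name `film`, the shapes, and the random projection weights are assumptions for illustration and are not taken from the Daft-Exprt code.

```python
import numpy as np

def film(features, condition, w_gamma, w_beta):
    """Feature-wise linear modulation: gamma(c) * h + beta(c).

    features:  (time, channels) hidden activations
    condition: (cond_dim,) conditioning vector
    w_gamma, w_beta: (cond_dim, channels) projection matrices
    """
    gamma = condition @ w_gamma       # per-channel scale, shape (channels,)
    beta = condition @ w_beta         # per-channel shift, shape (channels,)
    return gamma * features + beta    # broadcast over the time axis

rng = np.random.default_rng(0)
hidden = rng.normal(size=(5, 4))      # 5 time steps, 4 channels (hypothetical)
prosody = rng.normal(size=(3,))       # hypothetical prosody embedding
w_gamma = rng.normal(size=(3, 4))
w_beta = rng.normal(size=(3, 4))

out = film(hidden, prosody, w_gamma, w_beta)
print(out.shape)
```

Unlike concatenation-based conditioning, the conditioning vector never becomes part of the feature tensor; it only rescales and shifts each channel, which is the same scale-and-bias idea as the conditional layer norm mentioned for AdaSpeech.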

hollobit commented 2 years ago

Bipedal robot from Oregon State University completes 5-km run using machine learning

https://www.ctvnews.ca/sci-tech/bipedal-robot-from-oregon-state-university-completes-5-km-run-using-machine-learning-1.5538770

Make Every feature Binary: A 135B parameter sparse neural network for massively improved search relevance

https://www.microsoft.com/en-us/research/blog/make-every-feature-binary-a-135b-parameter-sparse-neural-network-for-massively-improved-search-relevance/
Click-through rate (CTR) on the top search results rose by almost 2%
Manual query reformulation fell by more than 1%
"Next page" clicks fell by more than 1.5%

The Open-Source Movement Comes to Medical Datasets

https://hai.stanford.edu/news/open-source-movement-comes-medical-datasets
https://stanfordaimi.azurewebsites.net/
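In collaboration with Microsoft's AI for Health program, hosting these images and building a larger free global repository
Annotated datasets accumulated for more than 1 million images, with more than 2 million images planned by next year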

China’s lonely hearts reboot online romance with artificial intelligence

https://www.washingtonpost.com/world/2021/08/06/china-online-dating-love-replika/

Even AI technology... how pro-China campaigns are run with fake social media accounts in China

https://www.bbc.com/korean/international-58085349

Are you ready to read a novel written by AI?

https://m.hani.co.kr/arti/culture/book/1006669.html#cb

Nvidia is tracking more than 8,500 AI startups with $60B in funding

https://venturebeat.com/2021/08/02/nvidia-is-tracking-more-than-8500-ai-startups-with-60b-in-funding/

Pentagon is using artificial intelligence to predict the future and give it 'days of advanced warning' on attacks on sensitive sites like the Panama Canal

https://www.dailymail.co.uk/sciencetech/article-9852653/Pentagon-using-AI-predict-future-days-advanced-warning-attacks-sensitive-sites.html

Economist-less economics: The future of economics in an AI-biased world

https://www.weforum.org/agenda/2021/08/economist-less-economics-the-future-of-economics-in-an-ai-biased-world/

nick-jhlee commented 2 years ago

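Seoul National University is hosting an AI Summer School for three days starting tomorrow (08/09~08/11)!

Professors from SNU, Google, Stanford, and elsewhere will lecture on a range of interesting topics (theoretical ML/DL, privacy, fairness, NLP, generative models, data augmentation, and more), so feel free to drop in at whatever time suits you! There is no registration and the Zoom/YouTube links are open, so it looks like you can join and leave whenever you like.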

http://aiis.snu.ac.kr/aisummerschool2021/

veritas9872 commented 2 years ago

Don't Sweep Your Learning Rate under the Rug: A Closer Look at Cross-modal Transfer of Pretrained Transformers Arxiv: https://arxiv.org/abs/2107.12460

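A rebuttal paper from Facebook, accepted as an ICML 2021 workshop paper.

The earlier paper Pretrained Transformers as Universal Computation Engines drew attention for claiming that a pretrained transformer, fine-tuned with everything frozen except the stem and head, can match the performance of full fine-tuning. This paper shows, rather anticlimactically, that the claim no longer holds once the hyperparameters, the learning rate in particular, are set just a little differently.

It cannot be as exciting as a paper reporting a new discovery, of course, but I think papers like this matter too.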

A Realistic Simulation Framework for Learning with Label Noise

A paper from DeepMind, currently under review.

Arxiv: https://arxiv.org/abs/2107.11413 GitHub: https://github.com/deepmind/deepmind-research/tree/master/noisy_label

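When measuring robustness to label noise, it is common to use random noise, class-specific noise, and the like, but in practice difficulty differs from instance to instance. The paper therefore trains several models under different hyperparameter settings to act as raters and has them produce pseudo-labels, so that the resulting noise resembles real difficulty.

The authors also evaluate current robust-learning methods on these datasets and find that performance degrades compared with ordinary random label noise.

The key findings are: 1. Label noise is more severe under strong class imbalance. 2. Label noise hurts more on easy tasks than on hard tasks. The second finding is unexpected, and I think it needs a deeper explanation than the one the authors give.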
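
To make the contrast between random label noise and rater-generated noise concrete, here is a toy sketch. Everything in it is an assumption for illustration: the 1-D data, the true boundary at 0.5, and the single "rater" (a threshold classifier with a slightly wrong boundary) stand in for the paper's trained rater models and are not its actual procedure.

```python
import random

def uniform_noise(labels, flip_prob, rng):
    """Flip each binary label with the same probability, regardless of the instance."""
    return [1 - y if rng.random() < flip_prob else y for y in labels]

def rater_noise(xs, threshold):
    """Pseudo-labels from an imperfect rater: a threshold classifier whose
    boundary is slightly off, so only instances near the boundary are mislabeled."""
    return [1 if x > threshold else 0 for x in xs]

rng = random.Random(0)
xs = [i / 100 for i in range(100)]            # 1-D instances in [0, 1)
clean = [1 if x > 0.5 else 0 for x in xs]     # true decision boundary at 0.5

random_labels = uniform_noise(clean, 0.1, rng)   # instance-independent noise
rater_labels = rater_noise(xs, threshold=0.6)    # hypothetical rater, boundary at 0.6

# Instances the rater gets wrong: all of them sit next to the true boundary.
rater_errors = [x for x, y, z in zip(xs, clean, rater_labels) if y != z]
print(len(rater_errors), min(rater_errors), max(rater_errors))
```

Uniform noise corrupts easy and hard instances alike, while the rater's mistakes cluster on the hard instances near the boundary, which is the instance-dependent behavior the simulation framework is trying to reproduce.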

Kyung-Min commented 2 years ago

Paper

Open-Ended Learning Leads to Generally Capable Agents
- Another DeepMind push toward AGI
- Built a game environment called XLand in order to train RL agents that solve multi-task learning through self-play
- XLand defines a task as Game (rule) + World (map) + co-players, and generates tasks via PBT
- 700,000 unique games in 4,000 unique worlds, each agent experienced with 200B training steps in the final generation as a result of 3.4 M unique tasks
- Agents could even solve complex problems such as hide and seek and capture the flag (not present in the training data)
- Video
LARGE-SCALE GRAPH REPRESENTATION LEARNING WITH VERY DEEP GNNS AND SELF-SUPERVISION
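- Overcomes the current GNN limitation of not being able to stack deep layers: stacks up to 50 layers and places in the top 3 on OGB-LSC
- The OGB-LSC datasets provide nodes in the millions and edges in the billions (existing public datasets are usually small, on the order of thousands)
- The paper uses encoder-decoder style self-supervised learning
- GNN research probably needs to scale up further before it yields meaningful results (oversmoothing occurs when the data is small)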

hollobit commented 2 years ago

Korean Society of Artificial Intelligence in Medicine (KoSAIM) Summer School, 8/13 ~ 8/14 https://www.kosaim.org/html/?pmode=BBBS0007100001&smode=view&seq=91

nick-jhlee commented 2 years ago

Weisfeiler and Lehman Go Topological: Message Passing Simplicial Network (ICML 2021 Spotlight)
- Imperial College London, Twitter, UCLA, Cambridge, Shanghai Jiao Tong University (with Prof. Michael Bronstein)
- The current GNN paradigm is based on message passing, i.e. only pairwise interactions!
- But real data usually has interactions beyond pairwise... ==> extend to simplicial complexes!
- Theoretical analysis
  - MPNN <= Weisfeiler-Lehman test (WL test)
  - Message Passing Simplicial Network (MPSN) <= Simplicial WL test (SWL) (<-- newly proposed here!)
  - WL test < MPSN, SWL; SWL is not less powerful than the 3-WL test
- Empirical analysis
  - Outperforms on strongly regular graphs! (MPNNs fail here...)
  - On par with or superior on commonly used GNN benchmarks (the improvement was largest on datasets such as IMDB that contain many higher-order structures like triangles)
  - Also performs quite well on edge-flow classification...! (synthetic: figure below, real: ocean drifter trajectories around the island of Madagascar between years 2011-2018.)
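
The limitation that motivates going beyond pairwise message passing can be shown with a small sketch of 1-WL color refinement. The helper names and the two example graphs are assumptions for illustration; this shows the WL test that bounds MPNNs, not the MPSN or SWL procedure from the paper.

```python
from collections import Counter

def from_edges(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def wl_colors(adj, rounds=3):
    """1-WL color refinement; returns the histogram of final node colors.
    Colors are nested tuples, so histograms are comparable across graphs."""
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colors.values())

def count_triangles(adj):
    """Count triangles, i.e. the 2-simplices of the graph's clique complex."""
    return sum(
        1
        for a in adj for b in adj[a] for c in adj[b]
        if a < b < c and a in adj[c]
    )

two_triangles = from_edges([(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)])
hexagon = from_edges([(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)])

print(wl_colors(two_triangles) == wl_colors(hexagon))            # same color histogram
print(count_triangles(two_triangles), count_triangles(hexagon))  # different triangle counts
```

Both graphs are 2-regular on six nodes, so 1-WL (and any MPNN bounded by it) cannot tell them apart, while the triangle structure separates them immediately. That kind of higher-order information is what simplicial message passing is meant to expose.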

Weisfeiler and Lehman Go Cellular: CW Networks (under review for NeurIPS 2021?)
- Imperial College London, Twitter, UCLA, Cambridge, Shanghai Jiao Tong University (with Prof. Michael Bronstein)
- An extension of the paper above (~~they're sweeping up everything,,~~)
- Extends it by bringing in CW complexes and the "lifting operation" from algebraic topology
- Theoretical analysis
  - CW Networks (CWN) <= Cellular WL test (CWL) (<-- newly proposed here!)
  - WL test < CWN, CWL; CWL is not less powerful than the 3-WL test
- Empirical analysis
  - Achieves SOTA on synthetic datasets (CSL, SR, RingTransfer)
  - Achieves SOTA on molecular datasets (ZINC, ZINC-FULL, MOLHIV)
GRAND: Graph Neural Diffusion (ICML 2021 Spotlight)
- Twitter, Imperial College London, IDSIA/USI (Prof. Michael Bronstein here too)
- Treats a GNN as a discretization of the heat diffusion PDE!
- Different discretizations yield one large family of GNN architectures!
- A GNN version of Neural ODEs (<- according to the authors)
- Doing it this way has several advantages (empirically):
  - Addresses depth, oversmoothing, and bottlenecks!
  - stability with respect to perturbations in the data
  - competitive/superior results on many standard graph benchmarks.
- On the theory side, it even provides a stability analysis of the numerical schemes considered!
- If you are interested, this blog post is a good reference: https://towardsdatascience.com/graph-neural-networks-as-neural-diffusion-pdes-8571b8c0c774 (taught by the authors themselves)
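
The "GNN as a discretized diffusion PDE" view can be sketched with an explicit Euler step of dx/dt = (A - I) x. This is a minimal sketch under stated assumptions: A is plain neighbor averaging on a hypothetical 4-node path graph with one scalar feature per node, whereas GRAND itself uses learned attention and other numerical schemes.

```python
def euler_diffusion_step(x, neighbors, tau):
    """One explicit-Euler step of dx/dt = (A - I) x,
    where A averages a node's neighbors (row-stochastic)."""
    new_x = []
    for i, xi in enumerate(x):
        mean_nb = sum(x[j] for j in neighbors[i]) / len(neighbors[i])
        new_x.append(xi + tau * (mean_nb - xi))
    return new_x

# Hypothetical path graph 0 - 1 - 2 - 3.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
x = [1.0, 0.0, 0.0, 0.0]

# Each Euler step plays the role of one layer; more steps = longer diffusion time.
for _ in range(200):
    x = euler_diffusion_step(x, neighbors, tau=0.5)

print(x)
```

With `tau = 1` the step reduces to plain neighbor averaging, which looks like one layer of a simple message-passing network. Run long enough, the features of all nodes converge to the same value, which is the oversmoothing behavior the diffusion view makes explicit; the step size and the choice of scheme are what the different discretizations vary.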
Loss Surface Simplexes for Mode Connecting Volumes and Fast Ensembling (ICML 2021 Spotlight)
- NYU (with Prof. Andrew Gordon Wilson)
- An extension of the much earlier NeurIPS 2018 paper Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs
- It turns out that the minima of deep learning architectures lie on some high-dimensional simplex? (see the figure below)
- This makes it possible to get quite good performance quickly when ensembling models! (Bayesians, take note)
- "Inspired by this discovery, we show how to efficiently build simplicial complexes for fast ensembling, outperforming independently trained deep ensembles in accuracy, calibration, and robustness to dataset shift. Notably, our approach only requires a few training epochs to discover a low-loss simplex, starting from a pre-trained solution."
- Code is released (a principle of the AGW lab)
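
The ensembling idea can be sketched as sampling ensemble members from inside a simplex whose vertices are low-loss weight vectors. This is only a toy sketch: the three vertices, the tiny linear model, and the helper names are assumptions for illustration, and the paper's procedure for actually finding a low-loss simplex is not shown.

```python
import random

def sample_from_simplex(vertices, rng):
    """Draw a uniformly random convex combination of the simplex vertices.
    Normalized exponential draws give Dirichlet(1, ..., 1) coefficients."""
    raw = [rng.expovariate(1.0) for _ in vertices]
    total = sum(raw)
    coeffs = [r / total for r in raw]
    dim = len(vertices[0])
    point = [sum(c * v[d] for c, v in zip(coeffs, vertices)) for d in range(dim)]
    return coeffs, point

def predict(weights, x):
    """Toy linear model: dot product of weights and input."""
    return sum(w * xi for w, xi in zip(weights, x))

# Hypothetical low-loss solutions (simplex vertices) of a 2-parameter model.
vertices = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
rng = random.Random(0)

samples = [sample_from_simplex(vertices, rng) for _ in range(10)]
x = [2.0, 4.0]
ensemble_prediction = sum(predict(w, x) for _, w in samples) / len(samples)
print(ensemble_prediction)
```

Once the simplex is known, each extra ensemble member costs only one sample of coefficients rather than another full training run, which is where the "fast ensembling" in the title comes from.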

nick-jhlee commented 2 years ago

Papers I am only listing for now because I am short on time this week (all ICML 2021 orals):

jnhwkim commented 2 years ago

Sharing the paper I tracked down for what was mentioned on Clubhouse.

Attention is not all you need: pure attention loses rank doubly exponentially with depth (ICML 2021 Long Talk)
It was shared in a previous weekly too!
And it is right above as well..!!

jungwoo-ha / WeeklyArxivTalk

[20210808] Weekly AI ArXiv Talk #20
