Weakly-supervised learning of visual relations — so this is not about actions?
It is about looking at the relation between vision and language.
Related Work
DeViSE: A Deep Visual-Semantic Embedding Model, NIPS 2013
https://static.googleusercontent.com/media/research.google.com/ko//pubs/archive/41869.pdf
http://vision.cs.utexas.edu/381V-fall2016/slides/nagarajan-paper.pdf
http://www.cs.virginia.edu/~vicente/vislang/slides/devise.pdf
http://videolectures.net/nipsworkshops2013_bengio_embedding_model/
https://github.com/gtlim/DeVise
https://github.com/akshaychawla/devise-keras
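As a reminder of how DeViSE ties vision to language, here is a minimal sketch of its hinge rank loss as I understand it, written in PyTorch rather than the authors' original framework; the class name `DeviseLoss` and all parameter names are my own, not from the paper or the repos above.

```python
import torch
import torch.nn as nn

class DeviseLoss(nn.Module):
    """Sketch of the DeViSE hinge rank loss (Frome et al., NIPS 2013), paraphrased.

    A CNN image feature is projected by a learned linear map M into the
    word-embedding space; the projected image should score higher (dot product)
    with the true label's word vector than with any other label's vector,
    by at least `margin`.
    """
    def __init__(self, img_dim, emb_dim, margin=0.1):
        super().__init__()
        self.proj = nn.Linear(img_dim, emb_dim, bias=False)  # the matrix M
        self.margin = margin

    def forward(self, img_feat, label_emb, label_idx):
        # img_feat:  (B, img_dim)  CNN features
        # label_emb: (L, emb_dim)  fixed word vectors for all L labels
        # label_idx: (B,)          index of the true label per image
        scores = self.proj(img_feat) @ label_emb.t()          # (B, L) dot products
        pos = scores.gather(1, label_idx.unsqueeze(1))        # (B, 1) true-label score
        hinge = (self.margin - pos + scores).clamp(min=0)     # (B, L) per-label hinge terms
        # zero out the true-label column (it would otherwise contribute `margin`)
        mask = torch.ones_like(hinge)
        mask.scatter_(1, label_idx.unsqueeze(1), 0.0)
        return (hinge * mask).sum(dim=1).mean()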
Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models, TACL 2015.
https://arxiv.org/abs/1411.2539
http://deeplearning.cs.toronto.edu/i2t
https://github.com/ryankiros/visual-semantic-embedding
Large-scale object classification using label relation graphs, ECCV 2014
http://web.eecs.umich.edu/~jiadeng/paper/deng2014large.pdf
https://github.com/ronghanghu/hex_graph
https://docs.google.com/presentation/d/1Gl60i1dyZ8iyrNbyPKUfeXkOL2pwe8i0hY8Bn4E-FNw/edit#slide=id.g47ac826c7_0179
https://www.slideshare.net/takmin/ieee2014
For a very long time, for image understanding, we have actively exploited the relations between objects (spatial relations / whether they interact / ...), the places objects appear in, and the specific actions objects perform. How far along are we now? I want understanding based on a single image, not research that goes through multiple frames.
Captioning research, VQA research
Weakly supervised learning of visual relations, CoRR 2017/07
https://arxiv.org/pdf/1707.09472.pdf
[InteractNet] Detecting and Recognizing Human-Object Interactions, CoRR. Multi-task learning: action classification + localization
https://arxiv.org/abs/1704.07333
https://gkioxari.github.io/InteractNet/index.html
https://www.facebook.com/groups/TensorFlowKR/permalink/462002037474193/
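The note above summarizes the idea as multi-task learning; below is a toy sketch of what such a joint objective looks like, not InteractNet's actual architecture or code. `MultiTaskHead`, `multitask_loss`, `feat_dim`, `num_actions`, and `w_box` are all hypothetical names introduced here: one shared RoI feature feeds an action-classification head and a target-localization head, and the two losses are simply summed.

```python
import torch
import torch.nn as nn

class MultiTaskHead(nn.Module):
    """Toy multi-task head: a shared RoI feature drives both an action
    classifier and a target-box regressor, trained with one summed loss."""
    def __init__(self, feat_dim, num_actions):
        super().__init__()
        self.action_cls = nn.Linear(feat_dim, num_actions)  # per-action scores
        self.target_reg = nn.Linear(feat_dim, 4)             # (dx, dy, dw, dh) offsets

    def forward(self, roi_feat):
        return self.action_cls(roi_feat), self.target_reg(roi_feat)

def multitask_loss(action_logits, box_pred, action_gt, box_gt, w_box=1.0):
    # Actions can co-occur, so use per-class binary cross-entropy;
    # localization uses a smooth L1 loss, as in standard detectors.
    l_action = nn.functional.binary_cross_entropy_with_logits(action_logits, action_gt)
    l_box = nn.functional.smooth_l1_loss(box_pred, box_gt)
    return l_action + w_box * l_box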
Detecting Visual Relationships with Deep Relational Networks, CVPR 2017
https://arxiv.org/abs/1704.03114
https://www.youtube.com/watch?v=ffpy3BqN88o
https://github.com/doubledaibo/drnet_cvpr2017
Towards Context-aware Interaction Recognition for Visual Relationship Detection, ICCV 2017
https://arxiv.org/pdf/1703.06246.pdf
Care about you: towards large-scale human-centric visual relationship detection, CoRR 2017
https://arxiv.org/pdf/1705.09892.pdf
Beyond Instance-Level Image Retrieval: Leveraging Captions to Learn a Global Visual Representation for Semantic Retrieval, CVPR 2017.
https://norman3.github.io/papers/docs/deepimgir <- notes by 홍기호 of the Naver vision team
CVPR 2015
Institute: Stanford, Google
URL: paper, Supp
Keywords:
Interest: 3