memect / hao

好东西传送门 (Good Stuff Portal)

@月光馆果果妈 Who has done notable research on algorithms that learn jointly from speech and video data? Thanks. #115

Closed haoawesome closed 9 years ago

haoawesome commented 10 years ago

Who has done notable research on algorithms that learn jointly from speech and video data? Thanks. (Received by private message.)

haoawesome commented 10 years ago

concept

keywords: Multimodal learning

http://en.wikipedia.org/wiki/Multimodality

people

http://people.ict.usc.edu/~morency/ Louis-Philippe Morency

http://research.microsoft.com/en-us/people/deng/ Li Deng

Honglak Lee

papers

http://ai.stanford.edu/~ang/papers/nipsdlufl10-MultimodalDeepLearning.pdf Multimodal Deep Learning, NIPS'10

http://papers.nips.cc/paper/4683-multimodal-learning-with-deep-boltzmann-machines.pdf Multimodal Learning with Deep Boltzmann Machines NIPS'12

http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6638346&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D6638346 Deep learning for robust feature generation in audiovisual emotion recognition

http://ict.usc.edu/pubs/Utterance-Level%20Multimodal%20Sentiment%20Analysis.pdf Utterance-Level Multimodal Sentiment Analysis ACL 2013

http://www.csri.utoronto.ca/~hinton/absps/overview13.pdf New types of deep neural network learning for speech recognition and related applications: An overview, L Deng, G Hinton, B Kingsbury

http://www.sciencedirect.com/science/article/pii/S0262885614001036 A review of recent advances in visual speech decoding 2014

http://jmlr.org/proceedings/papers/v32/kiros14.pdf Multimodal Neural Language Models, ICML2014

related

http://www.davidlazeargroup.com/free_articles/multi-modal.html

http://dl.acm.org/event.cfm?id=RE354 ICMI-MLMI: Multimodal Interfaces and Machine Learning for Multimodal Interaction

http://www.cs.indiana.edu/~kduan/papers/cvpr14-flickr.pdf Multimodal Learning in Loosely-organized Web Images CVPR 2014

http://vincent.vanhoucke.com/publications/vanhoucke-iclr14.pdf?attredirects=0 Learning Visual Representations at Scale

haoawesome commented 10 years ago

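清华自然语言处理实验室: ICML 2014 highlight: Multimodal Neural Language Models. Neural-network-based language models have drawn a great deal of attention in recent years, and multimodal learning has also been a hot topic (the ACL 2013 best paper was in this direction). This paper covers both of these eye-catching themes at once. #ICML2014Keywords# http://t.cn/RvjhU2x http://weibo.com/2108671043/Babv5FI4k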

http://jmlr.org/proceedings/papers/v32/kiros14.pdf

haoawesome commented 10 years ago

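言语挖挖: USC ICT is very strong in virtual humans and dialogue systems. He is picking a place where he can get things done. At USC ICT he built MultiSense, an impressive multimodal sensing system that serves as the sensory subsystem of a virtual human. With a tool like that, future funding and projects should come easily.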

◆◆

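@王威廉: The head of the LTI at Carnegie Mellon University announced some news on National Day: LP Morency, a well-known expert in multimodal interaction and a Canadian scientist of French descent, has formally declined a faculty offer from MIT and will join our department in 2015. LP's better-known work includes hidden conditional random fields, developed with Mike Collins and others, and the application of related models to multimodal recognition.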

http://weibo.com/2054826481/Bcd23uEiI

haoawesome commented 10 years ago

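城墙下的车夫: A young professor and MIT graduate who spent years on face recognition research recently said that his group no longer bothers to build its own system and simply integrates this company's SDK. Look up recent NLP-on-video papers: quite a few use their CERT to capture facial expressions, for example last year's ACL paper, Multimodal Sentiment Detection.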

◆◆

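@36氪 (Weibo verified organization): Emotient originated in the Machine Perception Lab at the University of California. Its ultimate goal is to build a "ubiquitous" system for analyzing human emotion. Emotient uses a camera to capture and record facial muscle movements, analyzes the facial expressions with its computational model, and produces a dynamic readout of the expressions. http://t.cn/8sP0FiU |Emotient http://weibo.com/1833911793/AAvsdlV1J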

haoawesome commented 10 years ago

杨逍Venus (Weibo expert user): Vincent Vanhoucke, Tech Lead of Google's Deep Learning Infrastructure team, gave a talk at ICLR14: Learning Visual Representations at Scale http://t.cn/8sOBDS6 . He joked, "Speech recognition is pretty much solved, so I work on vision now". It also confirms that the earlier guess by @韧在百度 was accurate. http://weibo.com/1668647380/B076CqTdh

haoawesome commented 10 years ago

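Q: @月光馆果果妈 Who has done notable research on algorithms that learn jointly from speech and video data? A: Deep learning is the current trend (Stanford, Microsoft, and Google all take this approach). Experts: Andrew Ng, Geoffrey Hinton, Li Deng, Louis-Philippe Morency, Ruslan Salakhutdinov. Weibo tech enthusiast: @言语挖挖. Additions and corrections are welcome.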

http://www.weibo.com/5220650532/BkdhGpY4d?ref=

haoawesome commented 10 years ago

A fun demo: http://lav.io/2014/06/videogrep-automatic-supercuts-with-python/

Source code: https://github.com/antiboredom/videogrep