@月光馆果果妈: Who has done notable research on algorithms that learn jointly from speech and video data? Thanks in advance.

haoawesome commented 10 years ago

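Who has done notable research on algorithms that learn jointly from speech and video data? Thanks in advance. (Asked via private message.)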

haoawesome commented 10 years ago

concept

keywords: Multimodal learning

http://en.wikipedia.org/wiki/Multimodality

people

http://people.ict.usc.edu/~morency/ Louis-Philippe Morency

http://research.microsoft.com/en-us/people/deng/ Li Deng

Honglak Lee

papers

http://ai.stanford.edu/~ang/papers/nipsdlufl10-MultimodalDeepLearning.pdf Multimodal Deep Learning, NIPS'10

http://papers.nips.cc/paper/4683-multimodal-learning-with-deep-boltzmann-machines.pdf Multimodal Learning with Deep Boltzmann Machines NIPS'12

recommended by @清华自然语言处理实验室

http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6638346&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D6638346 Deep learning for robust feature generation in audiovisual emotion recognition

http://ict.usc.edu/pubs/Utterance-Level%20Multimodal%20Sentiment%20Analysis.pdf Utterance-Level Multimodal Sentiment Analysis ACL 2013

Louis-Philippe Morency
recommended by @城墙下的车夫 @言语挖挖

http://www.csri.utoronto.ca/~hinton/absps/overview13.pdf New types of deep neural network learning for speech recognition and related applications: An overview, L Deng, G Hinton, B Kingsbury

http://www.sciencedirect.com/science/article/pii/S0262885614001036 A review of recent advances in visual speech decoding 2014

http://jmlr.org/proceedings/papers/v32/kiros14.pdf Multimodal Neural Language Models, ICML2014

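清华自然语言处理实验室: ICML 2014 highlight: Multimodal Neural Language Models. Neural-network-based language models have been very popular in recent years, and multimodal learning has also been a hot topic (the ACL 2013 best paper was in this direction). This paper covers both of these eye-catching themes. #ICML2014Keywords# http://t.cn/RvjhU2x http://weibo.com/2108671043/Babv5FI4k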


haoawesome commented 10 years ago

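言语挖挖: USC ICT is very strong in virtual humans and dialogue systems. He is choosing a place where he can get things done. There he built MultiSense, an impressive multimodal sensing system that serves as the sensory subsystem for virtual humans. With a tool like that, future funding and projects should come easily.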

◆◆

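@王威廉: The head of LTI at Carnegie Mellon University announced some news on National Day: LP Morency, a well-known expert in multimodal interaction and a Canadian scientist of French descent, has formally declined a faculty offer from MIT and will join our department in 2015. LP's better-known work includes hidden conditional random fields, developed with Mike Collins and others, and the application of related models to multimodal recognition.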

http://weibo.com/2054826481/Bcd23uEiI

haoawesome commented 10 years ago

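城墙下的车夫: A young professor, an MIT graduate who did face recognition research for many years, recently told me that his group no longer bothers to build its own system and simply integrates this company's SDK. Look up recent NLP-on-video papers: quite a few use their CERT to capture facial expressions, for example a paper at last year's ACL, Multimodal Sentiment Detection.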

◆◆

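@36氪: Emotient originated from the Machine Perception Laboratory at the University of California. Their ultimate goal is to build a "ubiquitous" human emotion analysis system. Emotient uses cameras to capture and record facial muscle movements, analyzes facial expressions with its computational model, and produces a dynamic readout of the expressions. http://t.cn/8sP0FiU |Emotient http://weibo.com/1833911793/AAvsdlV1J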

haoawesome commented 10 years ago

杨逍Venus: Vincent Vanhoucke, Tech Lead of Google's Deep Learning Infrastructure team, gave a talk at ICLR14: Learning Visual Representations at Scale http://t.cn/8sOBDS6 . He joked, "Speech recognition is pretty much solved, so I work on vision now". This also confirms that the earlier guess by @韧在百度 was quite accurate. http://weibo.com/1668647380/B076CqTdh

haoawesome commented 10 years ago

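Q: @月光馆果果妈 asks who has done notable research on algorithms that learn jointly from speech and video data. A: Deep learning is the current trend (Stanford, Microsoft, and Google are all going this way). Experts: Andrew Ng, Geoffrey Hinton, Li Deng, Louis-Philippe Morency, Ruslan Salakhutdinov. Weibo tech enthusiast: @言语挖挖. Additions and corrections are welcome.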

http://www.weibo.com/5220650532/BkdhGpY4d?ref=

haoawesome commented 10 years ago

Here is a fun demo: http://lav.io/2014/06/videogrep-automatic-supercuts-with-python/

Source code: https://github.com/antiboredom/videogrep
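For anyone curious about the idea behind the demo: as far as I understand it, videogrep builds supercuts by searching subtitle text for a phrase and then cutting out the matching time ranges. Below is a minimal sketch of just the search step, assuming SubRip (.srt) subtitles. The names `find_segments`, `to_seconds`, and `SAMPLE_SRT` are made up for illustration and are not videogrep's API; cutting the actual video is left out.

```python
import re

# Hypothetical helper names for illustration; this is NOT videogrep's API.
# Matches an .srt timing line such as "00:00:01,000 --> 00:00:03,500".
CUE_RE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_seconds(h, m, s, ms):
    """Convert the string fields of an .srt timestamp to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0


def find_segments(srt_text, pattern):
    """Return (start, end, text) for each subtitle cue whose text matches pattern."""
    query = re.compile(pattern, re.IGNORECASE)
    segments = []
    # Cues in an .srt file are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            m = CUE_RE.search(line)
            if m:
                # Everything after the timing line is the cue text.
                text = " ".join(lines[i + 1:]).strip()
                if query.search(text):
                    g = m.groups()
                    segments.append((to_seconds(*g[:4]), to_seconds(*g[4:]), text))
                break
    return segments


# Made-up sample subtitles, only for demonstration.
SAMPLE_SRT = """1
00:00:01,000 --> 00:00:03,500
We start with the audio track.

2
00:00:04,000 --> 00:00:06,000
Then we look at the video frames.
"""

if __name__ == "__main__":
    for start, end, text in find_segments(SAMPLE_SRT, r"video"):
        print(start, end, text)
```

The returned time ranges could then be handed to any video editing library to cut and concatenate the clips.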
