memect / hao

好东西传送门

@vincent是正能量 @好东西传送门 Hi, is there a survey paper on synonym mining, and what are some of the core papers? Thanks! #194

Closed haoawesome closed 9 years ago

haoawesome commented 9 years ago

http://www.weibo.com/2753836105/Bnhm4B9pO

haoawesome commented 9 years ago

Concepts

http://en.wikipedia.org/wiki/Synonym A synonym (also metonym and poecilonym) is a word with the same or similar meaning of another word. Chinese: 同义词

http://en.wikipedia.org/wiki/Semantic_similarity Semantic similarity or semantic relatedness is a metric defined over a set of documents or terms, where the idea of distance between them is based on the likeness of their meaning or semantic content as opposed to similarity which can be estimated regarding their syntactical representation (e.g. their string format).
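To make the distinction in that definition concrete, here is a minimal Python sketch contrasting surface (string) similarity with a distributional notion of semantic similarity. The co-occurrence counts are made-up toy numbers, and the helper names `string_similarity` and `cosine` are illustrative assumptions, not taken from any of the linked resources.

```python
import math
from difflib import SequenceMatcher

# Toy context-count vectors (hypothetical numbers, for illustration only).
# Each dimension counts co-occurrences with: [drive, road, engine, eat, purr]
VECTORS = {
    "car":        [8, 6, 7, 0, 0],
    "automobile": [7, 5, 8, 0, 0],
    "cat":        [0, 1, 0, 6, 9],
}

def string_similarity(a: str, b: str) -> float:
    """Surface similarity based only on the characters of the two strings."""
    return SequenceMatcher(None, a, b).ratio()

def cosine(u, v) -> float:
    """Cosine similarity between two count vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

for a, b in [("car", "cat"), ("car", "automobile")]:
    print(a, b,
          "string=%.2f" % string_similarity(a, b),
          "semantic=%.2f" % cosine(VECTORS[a], VECTORS[b]))
```

With these toy vectors, "car" and "cat" look alike as strings but not by context, while "car" and "automobile" show the opposite pattern, which is the case synonym mining cares about.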

Related topics

https://github.com/memect/hao/issues/150 Synonym tools for Chinese or English: BabelNet (integrates WordNet and Wikipedia)

https://github.com/memect/hao/blob/master/awesome/chinese-word-similarity.md Tools for computing semantic similarity between Chinese words

haoawesome commented 9 years ago

Microsoft Research Synonym Mining group

http://research.microsoft.com/en-us/projects/synonyms/ Synonym Mining

The same entity is often referred to in a variety of ways. For example, the camera Canon 600d is also referred to as "canon rebel t3i", the celebrity Jennifer Lopez is also referred to as "jlo" and Seattle Tacoma International Airport is also referred to as "sea tac". These are known as synonyms. Without knowledge of synonyms, many applications like e-commerce search will fail to return relevant results. We leverage the data assets amassed by Bing to automatically mine such synonyms.
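As a rough illustration of how click data can surface such synonyms, the sketch below treats two queries as synonym candidates when users click largely the same URLs for both. This is only a simplified toy under stated assumptions, not the method of the Microsoft papers listed here: the click log, the `example.com` URLs, the function names, and the Jaccard threshold are all hypothetical.

```python
from collections import defaultdict

# Hypothetical (query, clicked_url) pairs standing in for a search click log.
CLICK_LOG = [
    ("canon 600d", "example.com/canon-eos-600d"),
    ("canon 600d", "example.com/canon-eos-600d-review"),
    ("canon rebel t3i", "example.com/canon-eos-600d"),
    ("canon rebel t3i", "example.com/canon-eos-600d-review"),
    ("jlo", "example.com/jennifer-lopez"),
    ("jennifer lopez", "example.com/jennifer-lopez"),
    ("jennifer lopez", "example.com/jennifer-lopez-songs"),
    ("sea tac", "example.com/seattle-tacoma-airport"),
]

def clicked_urls(log):
    """Group the set of clicked URLs by query string."""
    urls = defaultdict(set)
    for query, url in log:
        urls[query].add(url)
    return urls

def jaccard(a, b):
    """Jaccard overlap of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def synonym_candidates(entity, log, threshold=0.5):
    """Return (query, score) pairs whose clicked URLs overlap with the entity's."""
    urls = clicked_urls(log)
    target = urls.get(entity, set())
    scored = [(q, jaccard(target, clicked))
              for q, clicked in urls.items() if q != entity]
    return sorted([(q, s) for q, s in scored if s >= threshold],
                  key=lambda pair: -pair[1])

print(synonym_candidates("canon 600d", CLICK_LOG))
print(synonym_candidates("jennifer lopez", CLICK_LOG))
```

A real system would also need to handle noise, ambiguous queries, and queries that are merely related rather than synonymous, which a plain overlap threshold does not address.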

https://datamarket.azure.com/dataset/bing/synonyms Synonyms API 2012

Bilyana Taneva, Tao Cheng, Kaushik Chakrabarti, and Yeye He, Mining Acronym Expansions and Their Meanings Using Query Click Log, WWW Conference 2013, May 2013

http://research.microsoft.com/pubs/167835/idg811-cheng.pdf Kaushik Chakrabarti, Surajit Chaudhuri, Tao Cheng, and Dong Xin, A Framework for Robust Discovery of Entity Synonyms, in SIGKDD, 2012

Surajit Chaudhuri, Venkatesh Ganti, and Dong Xin, Exploiting Web Search To Generate Synonyms For Entities, in 18th International World Wide Web Conference, Association for Computing Machinery, Inc., April 2009

haoawesome commented 9 years ago

http://www.extractor.com/turney-ecml2001.pdf Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL
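The core idea of PMI-IR can be sketched in a few lines: for a multiple-choice synonym question, rank each candidate by how often it co-occurs with the problem word relative to how often it occurs at all, using search-engine hit counts as probability estimates. The sketch below assumes the simplest co-occurrence ratio, hits(problem AND choice) / hits(choice). All hit counts are made-up placeholders rather than real query results, and the function names are illustrative.

```python
# Made-up hit counts standing in for search-engine results (not real data).
HITS = {
    "imposed": 1_000_000,
    "believed": 5_000_000,
    "requested": 4_000_000,
    "correlated": 500_000,
}

# Made-up counts of pages containing both the problem word and the choice.
CO_HITS = {
    ("levied", "imposed"): 5_000,
    ("levied", "believed"): 2_000,
    ("levied", "requested"): 1_500,
    ("levied", "correlated"): 100,
}

def pmi_ir_score(problem, choice, hits, co_hits):
    """Co-occurrence ratio hits(problem AND choice) / hits(choice)."""
    choice_hits = hits.get(choice, 0)
    if choice_hits == 0:
        return 0.0
    return co_hits.get((problem, choice), 0) / choice_hits

def best_synonym(problem, choices, hits, co_hits):
    """Pick the candidate with the highest co-occurrence ratio."""
    return max(choices, key=lambda c: pmi_ir_score(problem, c, hits, co_hits))

choices = ["imposed", "believed", "requested", "correlated"]
print(best_synonym("levied", choices, HITS, CO_HITS))
```

Dividing by the candidate's own hit count keeps very frequent words from winning simply because they appear everywhere.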

haoawesome commented 9 years ago

http://en.wikipedia.org/wiki/Word-sense_disambiguation In computational linguistics, word-sense disambiguation (WSD) is an open problem of natural language processing and ontology, which governs the process of identifying which sense of a word (i.e. meaning) is used in a sentence, when the word has multiple meanings.
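For a feel of what a WSD system does, here is a small sketch of the simplified Lesk baseline: choose the sense whose dictionary gloss shares the most words with the sentence. The two-sense inventory for "bank", the sense labels, and the stopword list are hypothetical stand-ins for a real resource such as WordNet.

```python
STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "or",
             "that", "is", "for", "at", "he", "she", "i", "my"}

# Hypothetical mini sense inventory (a real system would use WordNet glosses).
SENSES = {
    "bank": {
        "bank#finance": "a financial institution that accepts deposits and lends money",
        "bank#river": "sloping land beside a body of water such as a river",
    }
}

def tokens(text):
    """Lowercased word set with surrounding punctuation and stopwords removed."""
    return {w.strip(".,;:!?").lower() for w in text.split()} - STOPWORDS

def simplified_lesk(word, sentence, senses):
    """Return the sense whose gloss overlaps most with the sentence context."""
    context = tokens(sentence)
    best_sense, best_overlap = None, -1
    for sense, gloss in senses[word].items():
        overlap = len(context & tokens(gloss))
        if overlap > best_overlap:  # ties keep the earlier-listed sense
            best_sense, best_overlap = sense, overlap
    return best_sense

print(simplified_lesk("bank", "She sat on the bank of the river and watched the water", SENSES))
print(simplified_lesk("bank", "He went to the bank to deposit money", SENSES))
```

Exact word overlap is brittle ("deposit" does not match "deposits" here), which is one reason the survey linked in this comment covers many more robust approaches.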

http://promethee.philo.ulg.ac.be/engdep1/download/bacIII/ACM_Survey_2009_Navigli.pdf Word Sense Disambiguation: A Survey (ACM Computing Surveys, 2009)

haoawesome commented 9 years ago

http://en.wikipedia.org/wiki/Entity_linking In natural language processing, entity linking, named entity disambiguation or named entity normalization (NEN)[1] is the task of determining the identity of entities mentioned in text. It is distinct from named entity recognition (NER) in that it identifies not the occurrence of names (and a limited classification of those), but their reference.

see https://github.com/memect/hao/issues/100

haoawesome commented 9 years ago

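Q: @vincent是正能量 Is there a survey paper on synonym mining, and what are some of the core papers? A: Q&A materials: http://memect.co/WgvBUxV. WordNet synsets are manually constructed sets of synonyms; automatic methods usually rely on semantic similarity analysis. Microsoft has a related project, and we have compiled a post of technical resources: http://memect.co/iyzMdj4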

haoawesome commented 9 years ago

http://lunadong.com/publication/fromDFtoKF_vldb.pdf Xin Luna Dong, Evgeniy Gabrilovich, Geremy Heitz, Wilko Horn, Kevin Murphy, Shaohua Sun, and Wei Zhang. From Data Fusion to Knowledge Fusion. In VLDB, 2014. Presentation: http://lunadong.com/talks/fromDFtoKF.pdf

haoawesome commented 9 years ago

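康积华_绩点侠: Richard Socher has a 2012 paper that uses neural networks for this, "Improving Word Representations via Global Context and Multiple Word Prototypes". After it, deep learning started to be used heavily for these tasks. His homepage is worth a look. (today, 08:03)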

http://www.weibo.com/5220650532/BnmMGBraU

http://nlp.stanford.edu/pubs/HuangACL12.pdf