haoawesome closed 10 years ago
related papers
http://ijcai.org/papers13/Papers/IJCAI13-388.pdf Social Spammer Detection in Microblogging
http://conferences.sigcomm.org/imc/2011/docs/p243.pdf Suspended Accounts in Retrospect: An Analysis of Twitter Spam
http://share.iit.edu/handle/10560/2902 SPAM DETECTION IN SOCIAL NETWORKS: A CASE STUDY OF WEIBO
http://dl.acm.org/citation.cfm?id=2501035 Analysis and identification of spamming behaviors in Sina Weibo microblog
http://trec.nist.gov/data/tweets/ TREC tweets dataset (the approaches in the papers above can be tried on this dataset)
http://plg.uwaterloo.ca/~gvcormac/spam/ TREC 2005-2007 spam data
http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/ SMS spam
http://wing.comp.nus.edu.sg:8080/SMSCorpus/overview.jsp NUS SMS Corpus
http://www.ccert.edu.cn/spam/sa/datasets.htm Chinese email
https://archive.ics.uci.edu/ml/datasets/Spambase UCI dataset 1999
http://www.aueb.gr/users/ion/data/enron-spam/ Enron spam email 2006
http://csmining.org/index.php/ling-spam-datasets.html multiple spam datasets (mostly email)
http://untroubled.org/spam/ email, up to 2014
https://spamassassin.apache.org/publiccorpus/ email 2004
http://artinvoice.hu/spams/ email
http://www.infochimps.com/tags/spam list of spam datasets
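Q: @杨洋MQ What Chinese and English datasets are available for spammer detection in social networks?
A: Preliminary answer: http://memect.co/U19k-Oo Most of the public datasets are email. The Chinese ones are older: TREC 2006 and CCERT 2005. For English there are Twitter datasets and spammer lists. Recent work whose data has not been released: Berkeley and ASU have done Twitter research; within China you would need to contact Shanghai Jiao Tong University.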
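Several recent Q&As are also related:
2014-10-16 Are there any datasets for spam email classification? http://t.cn/R7L0GJZ
2014-10-16 Are there any spam email detection projects? http://www.weibo.com/5220650532/BrOXC1Qkq
2014-10-28 Identifying spam and false information https://github.com/memect/hao/issues/307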
http://www.weibo.com/1214738093/BjP4YcZT2