KaiDMML / FakeNewsNet

This is a dataset for fake news detection research

Number of news shares is exactly the same for all fake and real news documents #5

Closed phosseini closed 5 years ago

phosseini commented 5 years ago

Based on the explanations in the README, and using the NewsUser.txt and UserUser.txt files, I computed some statistics about the documents in the dataset, specifically the number of times that news articles, whether fake or real, were shared. Here are the results:

Fake PolitiFact: 120
Real PolitiFact: 120
Sum PolitiFact: 240
---------------------
Fake Buzzfeed: 91
Real Buzzfeed: 91
Sum Buzzfeed: 182
---------------------
Sum all: 422
Fake spread count: 20683
Real spread count: 20683
---------------------
Fake affected count: 639982
Real affected count: 639982

All the numbers are exactly the same for fake and real news documents, for both PolitiFact and BuzzFeed. Did I do something wrong, or is there an issue with the number of shares in the dataset?
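For anyone trying to reproduce this kind of count, here is a minimal sketch of how the spread count per label might be computed. It rests on assumptions not confirmed in this thread: each NewsUser.txt row is taken to be a `news_index user_index count` triple, the news index is taken to be a 1-based line number into the list of news ids, and the label is read from the id itself (for example `BuzzFeed_Fake_13`). The names `label_of` and `spread_counts` are hypothetical helpers, and the rows below are toy data, not values from the dataset.

```python
from collections import Counter


def label_of(news_id):
    """Return 'Fake' or 'Real' from an id such as 'BuzzFeed_Fake_13'."""
    return news_id.split("_")[1]


def spread_counts(news_ids, news_user_lines):
    """Sum share counts per label.

    news_ids: ids in file order (assumed to match the indices in NewsUser.txt).
    news_user_lines: 'news_index user_index count' strings (assumed format).
    """
    totals = Counter()
    for line in news_user_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed rows
        news_index, _user_index, count = map(int, parts)
        # Assumption: indices are 1-based positions in news_ids.
        totals[label_of(news_ids[news_index - 1])] += count
    return totals


# Toy data for illustration only.
news_ids = ["BuzzFeed_Real_1", "BuzzFeed_Fake_1", "BuzzFeed_Fake_2"]
rows = ["1 10 1", "2 10 2", "3 11 1", "2 12 1"]
totals = spread_counts(news_ids, rows)
print(dict(totals))
```

Looking each index up in the id list, instead of assuming which index range is fake or real, is one way to keep the two labels from being counted over the same rows.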

phosseini commented 5 years ago

I just realized that I was using the wrong ids for the news articles. Here are the updated results, in case anyone is interested:

> Fake PolitiFact: 114
> Real PolitiFact: 120
> Sum PolitiFact: 234
> ---------------------
> Fake Buzzfeed: 89
> Real Buzzfeed: 91
> Sum Buzzfeed: 180
> ---------------------
> Sum all: 414
> Fake spread count: 40416
> Real spread count: 20683
> ---------------------
> Fake affected count: 1049276
> Real affected count: 639982

P.S. The following news articles were removed because they do not have any content/text:

BuzzFeed_Fake_13-Webpage
BuzzFeed_Fake_39-Webpage
PolitiFact_Fake_24-Webpage
PolitiFact_Fake_29-Webpage
PolitiFact_Fake_37-Webpage
PolitiFact_Fake_47-Webpage
PolitiFact_Fake_70-Webpage
PolitiFact_Fake_90-Webpage
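The updated article counts can be cross-checked against this list. The sketch below starts from the per-group article counts in the first post (120 and 91), subtracts the removed articles by source and label, and prints what remains. The variable names are illustrative only.

```python
from collections import Counter

# Removed articles, as listed in this post.
removed = [
    "BuzzFeed_Fake_13-Webpage",
    "BuzzFeed_Fake_39-Webpage",
    "PolitiFact_Fake_24-Webpage",
    "PolitiFact_Fake_29-Webpage",
    "PolitiFact_Fake_37-Webpage",
    "PolitiFact_Fake_47-Webpage",
    "PolitiFact_Fake_70-Webpage",
    "PolitiFact_Fake_90-Webpage",
]

# Article counts per (source, label) from the first post.
original = {
    ("PolitiFact", "Fake"): 120,
    ("PolitiFact", "Real"): 120,
    ("BuzzFeed", "Fake"): 91,
    ("BuzzFeed", "Real"): 91,
}

# Group the removed ids by their (source, label) prefix.
removed_per_group = Counter(tuple(name.split("_")[:2]) for name in removed)

remaining = {group: n - removed_per_group[group] for group, n in original.items()}
total = sum(remaining.values())

for group, n in remaining.items():
    print(group, n)
print("Sum all:", total)
```

Six PolitiFact fake and two BuzzFeed fake articles are removed, which accounts for the drop from 120 to 114 and from 91 to 89.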

JihoChoi commented 5 years ago

Shouldn't there be 91 Fake BuzzFeed news? How did you get the number 89?

> Fake Buzzfeed: 89

phosseini commented 5 years ago

> Shouldn't there be 91 Fake BuzzFeed news? How did you get the number 89?
>
> > Fake Buzzfeed: 89

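As I mentioned, I removed the news articles listed in my earlier post because they did not have any content/text. Two of them are fake BuzzFeed articles (BuzzFeed_Fake_13 and BuzzFeed_Fake_39), which is why the count drops from 91 to 89.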