KaiDMML / FakeNewsNet

This is a dataset for fake news detection research

Trying to parse data for fake news detection #2

Closed sailee18dalvi closed 6 years ago

sailee18dalvi commented 6 years ago

Hello, I am trying to work with your data for my Master's thesis, but I am not able to get a proper dataset. The dataset you provide is in .json format, which I converted to .csv, but the column names do not match because some columns are missing, so the data is not displayed properly. Could you share the latest dataset, or help me by listing the features you used for your project? Thanks in advance :)

giladgressel commented 6 years ago

@sailee18dalvi I think that for the maintainers to be able to help you, you will have to provide more details about what you have done so far.

Can you please post the code you are using to load the JSON?

sailee18dalvi commented 6 years ago

I tried this simple code:

    import pandas as pd
    import json

    pd.read_json("/home/Jupyter/BuzzFeed/FakeNewsContent/BuzzFeed_Fake_1-Webpage.json")

I get an error (ValueError: arrays must all be same length).

I then converted the .json files to .csv using an online converter and tried printing them:

    pd.read_csv("/home/Jupyter/BuzzFeed/FakeNewsContent/BuzzFeed_Fake_1-Webpage.csv")

The column names differ from file to file.
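A possible explanation for the ValueError, offered as a guess rather than a confirmed diagnosis: if each file holds a single JSON object whose top-level fields are a mix of plain strings and lists of different lengths, pandas cannot line those fields up as equal-length columns. The sketch below uses only the standard library to list each top-level field with its type and length. The sample record and the helper name `describe_fields` are hypothetical; only the `text` field is mentioned in this thread.

```python
import json

# Hypothetical stand-in for one article file. Only the "text" field is
# confirmed in this thread; the other fields are assumptions for illustration.
sample = json.loads("""
{
  "title": "Example headline",
  "text": "Example body.",
  "authors": ["A. Writer"],
  "images": ["a.png", "b.png", "c.png"]
}
""")


def describe_fields(obj):
    """Return {field: (type name, length or None)} for a top-level JSON object."""
    summary = {}
    for key, value in obj.items():
        # Only containers get a length; scalars such as strings are reported as None.
        length = len(value) if isinstance(value, (list, dict)) else None
        summary[key] = (type(value).__name__, length)
    return summary


shape = describe_fields(sample)
for key, (kind, length) in shape.items():
    print(key, kind, length)
```

If the real files look like this, loading one with `json.load` and wrapping the result in a list, as in `pd.DataFrame([obj])`, is one way to get a single-row table instead of an error.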

mdepak commented 6 years ago

These JSON files contain the news articles related to fake news and real news.

The JSON can be loaded using the following code:

    import json
    obj = json.load(open("BuzzFeed_Fake_1-Webpage.json"))

The text content of the news article is in the text field and can be accessed using obj["text"]. The JSON also contains various metadata related to the published news article. Hope this answers your question.
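A slightly more defensive version of the same idea, as a minimal sketch: it closes the file with a context manager and uses `.get` so that a file without a `text` field yields an empty string instead of a KeyError. The helper name `load_article` and the demo record are hypothetical; the demo writes a tiny stand-in file to a temporary directory so that the snippet runs without the real dataset.

```python
import json
import tempfile
from pathlib import Path


def load_article(path):
    """Load one article JSON file and return it as a Python dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Demo with a stand-in file; the record contents are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "BuzzFeed_Fake_1-Webpage.json"
    demo_record = {"title": "Example headline", "text": "Example body."}
    demo_path.write_text(json.dumps(demo_record), encoding="utf-8")
    article = load_article(demo_path)

# .get avoids a KeyError if a file happens to lack the "text" field.
article_text = article.get("text", "")
print(article_text)
```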

phosseini commented 6 years ago

I wrote a Python script to convert the data to a single CSV file. I can share it with you if you want.
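The script itself is not reproduced in this thread. As a rough sketch of how such a conversion could work (not the script mentioned above), the code below reads every .json file in a directory, builds the header from the union of all keys so that files with missing fields still line up, and stores nested lists or dicts as JSON strings. The function name `json_dir_to_csv`, the `file` column, and the demo records are all hypothetical.

```python
import csv
import json
import tempfile
from pathlib import Path


def json_dir_to_csv(json_dir, csv_path):
    """Write one CSV row per JSON file in json_dir; return (columns, row count)."""
    records = []
    for path in sorted(Path(json_dir).glob("*.json")):
        with open(path, encoding="utf-8") as handle:
            obj = json.load(handle)
        row = {"file": path.name}
        for key, value in obj.items():
            # Nested lists/dicts are kept as JSON strings so each article stays on one row.
            row[key] = json.dumps(value) if isinstance(value, (list, dict)) else value
        records.append(row)

    # Union of all keys in first-seen order, so missing fields become empty cells.
    fieldnames = []
    for row in records:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(records)
    return fieldnames, len(records)


# Demo with two made-up files that do not share the same fields.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    first = {"title": "T1", "text": "Body 1", "authors": ["X"]}
    second = {"text": "Body 2"}
    (tmp_dir / "a.json").write_text(json.dumps(first), encoding="utf-8")
    (tmp_dir / "b.json").write_text(json.dumps(second), encoding="utf-8")
    out_path = tmp_dir / "articles.csv"
    columns, n_rows = json_dir_to_csv(tmp_dir, out_path)
    with open(out_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))

print(columns, n_rows)
```

Building the header from the union of keys is what keeps the column names consistent across files, which is the mismatch described earlier in this issue.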

sailee18dalvi commented 6 years ago

@phosseini That would be great, thank you.

phosseini commented 6 years ago

@sailee18dalvi you can download it from here. Please don't forget to cite the FakeNewsNet papers!

mdepak commented 6 years ago

Thanks, @phosseini, for helping @sailee18dalvi by providing a CSV version of the dataset. Closing this issue.