This project is the implementation part of my undergraduate final year project, titled "Using Dynamic Knowledge Graph for Fake News Detection". The final report is available in the report.pdf file.
This repository is hosted at https://github.com/albertus-andito/fake-news-detection.
The package documentation can be found at /docs/_build/html/index.html or /docs/fakenewsdetection.pdf.
Below is a short explanation of the content of each directory and top-level file in this repository:
In order to run this project locally, you will need the following:
Docker and Docker Compose.
This is required for running DBpedia. If you do not already have Docker and Docker Compose, install them from https://docs.docker.com/engine/install/ and https://docs.docker.com/compose/install/.
A running DBpedia instance.
Follow the instructions at https://github.com/dbpedia/virtuoso-sparql-endpoint-quickstart to get DBpedia running locally. Note that loading the full DBpedia is not strictly required for demonstration purposes, as it can take hours. Alternatively, load a smaller collection instead, as described in their documentation.
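Once the endpoint is up, you can sanity-check it with a SPARQL ASK query. Below is a minimal sketch of how such a query and request URL can be built in Python; the endpoint URL assumes the Virtuoso quickstart default (http://localhost:8890/sparql), and the example triple is purely illustrative, not part of this project's code:

```python
# Sketch: build a SPARQL ASK query to check whether a triple exists in DBpedia.
# The endpoint URL and the example triple are assumptions, not project code.

from urllib.parse import urlencode

SPARQL_ENDPOINT = "http://localhost:8890/sparql"  # Virtuoso quickstart default

def build_ask_query(subject: str, predicate: str, obj: str) -> str:
    """Return a SPARQL ASK query checking if <subject> <predicate> <obj> exists."""
    return f"ASK WHERE {{ <{subject}> <{predicate}> <{obj}> . }}"

def build_request_url(query: str) -> str:
    """Return a GET URL for the endpoint, requesting a JSON response."""
    params = urlencode({"query": query, "format": "application/sparql-results+json"})
    return f"{SPARQL_ENDPOINT}?{params}"

query = build_ask_query(
    "http://dbpedia.org/resource/London",
    "http://dbpedia.org/ontology/country",
    "http://dbpedia.org/resource/United_Kingdom",
)
url = build_request_url(query)
```

Opening the resulting URL in a browser (or fetching it with curl) should return a JSON body with a boolean result if the endpoint is healthy.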
MongoDB
This is required to store the scraped articles and their extracted triples. If you do not already have it, install it from https://www.mongodb.com/try/download/community.
Conda
This project uses a Conda environment to manage Python and its packages. If you do not already have it, install Anaconda (which includes Conda) from https://www.anaconda.com/products/individual.
Node.js
This project uses Node.js for the User Interface. If you want to run the UI, you will need Node.js. If you do not already have it, install it from https://nodejs.org/en/download/.
Stanford CoreNLP
This is used for the triple extraction process. Download it from https://stanfordnlp.github.io/CoreNLP/. Java is required to run this.
(Optional) IIT OpenIE
This can also be used for the triple extraction, as an alternative to Stanford OpenIE. Changes are required in the appropriate places in the code. Download it from https://github.com/dair-iitd/OpenIE-standalone.
(Recommended) Guardian Open Platform API Key
To scrape content from The Guardian reliably, the Guardian API is used. Register for a developer key here: https://bonobo.capi.gutools.co.uk/register/developer.
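To verify your key works, you can issue a simple search request. A minimal sketch of building such a request URL, following the public Guardian Content API conventions (the scraper in this project may call the API differently):

```python
# Sketch: build a Guardian Content API search URL. Base URL and parameter
# names follow the public Guardian API documentation; treat this as an
# illustration, not this project's scraper code.

from urllib.parse import urlencode

def guardian_search_url(api_key: str, query: str, page_size: int = 10) -> str:
    """Return a search URL that includes article bodies in the response."""
    params = urlencode({
        "q": query,
        "page-size": page_size,
        "show-fields": "bodyText",  # ask for the full article text
        "api-key": api_key,
    })
    return f"https://content.guardianapis.com/search?{params}"

url = guardian_search_url("YOUR_API_KEY", "fake news")
```

Fetching the resulting URL should return a JSON response whose `response.results` list contains matching articles.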
From the terminal, run:
conda env create --file environment.yml
Once the Conda environment is created, you will also need to install the NeuralCoref library, which must be built locally because the spaCy version used in this project is later than 2.1.0.
Run the following commands:
git clone https://github.com/huggingface/neuralcoref.git
cd neuralcoref
pip install -r requirements.txt
pip install -e .
Create a .env file by copying the content of the .env.default file. Fill in all of the necessary values, or replace the defaults.
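If you are unfamiliar with the format, a .env file is simply a list of KEY=VALUE lines. The sketch below shows, using only the standard library, how such a file is typically parsed; the key shown is a placeholder for illustration only — use the actual keys from .env.default:

```python
# Sketch: parse a .env-style file into a dict using only the standard library.
# The key in the sample is a placeholder; the real keys come from .env.default.

def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")  # split on the first '=' only
        env[key.strip()] = value.strip()
    return env

sample = """
# placeholder key for illustration
MONGODB_URI=mongodb://localhost:27017
"""
config = parse_env(sample)
```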
Install the UI.
Go to the ui folder (cd ui) and run the following commands:
npm install
cp .\src\style\my-theme.less .\node_modules\antd\dist\
cd .\node_modules\antd\dist\
lessc --js my-theme.less ..\..\..\src\style\custom-antd.css
This project consists of four components that can be run individually.
The REST API is needed to access the main functionalities of this project, and it must be running for the UI to work. First, start the Stanford CoreNLP server:
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer \
-preload tokenize,ssplit,pos,lemma,depparse,natlog,openie \
-port 9000 -timeout 15000
Then, in a separate terminal, activate the Conda environment and start the API:
conda activate fake-news-detection
python -m api.main
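The CoreNLP server started above returns its OpenIE annotations as JSON, with per-sentence lists of subject/relation/object triples. A small sketch of flattening that structure into plain triples (the sample payload below is hand-written for illustration, following the CoreNLP OpenIE annotator's output shape):

```python
# Sketch: pull (subject, relation, object) triples out of a CoreNLP server
# JSON response. The response shape follows the CoreNLP OpenIE annotator
# output; the sample payload is hand-written for illustration.

def extract_triples(response: dict) -> list:
    """Flatten per-sentence OpenIE results into a list of triples."""
    triples = []
    for sentence in response.get("sentences", []):
        for t in sentence.get("openie", []):
            triples.append((t["subject"], t["relation"], t["object"]))
    return triples

sample_response = {
    "sentences": [
        {"openie": [
            {"subject": "Barack Obama", "relation": "was born in", "object": "Hawaii"}
        ]}
    ]
}
triples = extract_triples(sample_response)
```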
The UI provides an intuitive way for users to interact with the REST API and the project as a whole.
Make sure the REST API is running.
Go to the ui folder (cd ui) and run the following command: npm run start
The UI can be accessed at http://localhost:3000.
If you want the article scraper to run continuously, scraping articles periodically, run:
python -m articlescraper.main
If you want triples to be continuously extracted from recently scraped articles, run:
python -m knowledgegraphupdater.kgupdaterrunner
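Conceptually, the knowledge graph updater compares newly extracted triples with what the graph already contains. A simplified sketch of that comparison, as an illustration of the idea rather than the project's actual logic:

```python
# Simplified sketch of triple checking: a new triple either matches the
# knowledge graph, conflicts with it (same subject and relation but a
# different object), or is simply unknown. Illustration only, not project code.

def check_triple(triple: tuple, known: set) -> str:
    """Classify a (subject, relation, object) triple against known triples."""
    subject, relation, _ = triple
    if triple in known:
        return "exists"
    if any(s == subject and r == relation for (s, r, _) in known):
        return "conflicts"
    return "unknown"

known = {("London", "country", "United Kingdom")}
```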