This project has two main deployments:
```
          collector                           web
┌────────────────────────────┐ ┌─────────────────────────────┐
│ claimreview_collector_full │ │ misinfo_server              │
│ port: 20500                │ │ port: 20000                 │
├────────────────────────────┤ ├─────────────────────────────┤
│ flaresolverr               │ │ twitter_connector           │
│ port: 8191                 │ │ port: 20200                 │
├────────────────────────────┤ ├─────────────────────────────┤
│ dirtyjson                  │ │ credibility                 │
│ port: 12345                │ │ port: 20300                 │
├────────────────────────────┤ ├─────────────────────────────┤
│ mongo                      │ │ claimreview_collector_light │
│ port: 20600                │ │ port: 20400                 │
└────────────────────────────┘ ├─────────────────────────────┤
                               │ mongo                       │
                               │ port: 20700                 │
                               └─────────────────────────────┘
```
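As a quick sanity check after deploying, the published ports can be probed from the host. This is only a sketch: it assumes every service is reachable on localhost, while the two deployments normally run separately, so adjust the port list to the machine you are on.

```shell
#!/usr/bin/env bash
# Probe each published port from the diagram above (assumption: services
# are exposed on localhost; trim the list to the deployment you are checking).
ports=(20000 20200 20300 20400 20500 20600 20700)
up=0; down=0
for port in "${ports[@]}"; do
  if timeout 1 bash -c "exec 3<>/dev/tcp/127.0.0.1/$port" 2>/dev/null; then
    echo "port $port: up"; up=$((up + 1))
  else
    echo "port $port: DOWN"; down=$((down + 1))
  fi
done
echo "checked $((up + down)) ports"
```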
The deployment is implemented by the scripts `scripts/start_services_web.sh` and `scripts/start_services_collector.sh`.
This deployment updates the dataset every day and uploads it to a daily release on the claimreview-data repository, named with the format `YYYY_MM_DD`.
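The release naming can be reproduced locally; a small sketch, assuming the tag is derived from the current UTC date:

```shell
# Build today's release tag in the YYYY_MM_DD format used for the daily
# releases (assumption: the tag follows the current UTC date).
release_name="$(date -u +%Y_%m_%d)"
echo "daily release: $release_name"
```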
The container `claimreview_collector_full` is an instance of the claimreview-collector docker image and is configured with the following environment variables:

- `ROLE=full`: enables the daily update
- `PUBLISH_GITHUB=true`: enables publishing data to GitHub
- `GITHUB_TOKEN`: the token used to publish data to GitHub
- `GOOGLE_FACTCHECK_EXPLORER_COOKIE`: needed to collect from Google Fact-check Explorer
- `FLARESOLVERR_HOST`: where to find the instance of flaresolverr (to collect from websites and avoid captchas)
- `DIRTYJSON_REST_ENDPOINT`: where to find the dirtyjson microservice
- `MONGO_HOST`: where to find mongodb (to store data)
- `MISINFO_BACKEND`: where to find the public web API (to use the unshorten API)
- `TWITTER_CONNECTOR`: where to find the connector to the Twitter API

Flaresolverr is used to avoid captchas, dirtyjson is used to fix some broken JSONs, and mongo is used to store the data.
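For reference, the same variables can be collected into a docker `--env-file`. This is only a sketch: every value below is a placeholder, and the hostnames and ports are assumptions based on the diagram, not the real configuration.

```shell
# Sketch of the collector's configuration as a docker --env-file.
# All values are placeholders (assumptions), not the real ones.
cat > collector.env <<'EOF'
ROLE=full
PUBLISH_GITHUB=true
GITHUB_TOKEN=<your-github-token>
GOOGLE_FACTCHECK_EXPLORER_COOKIE=<your-cookie>
FLARESOLVERR_HOST=http://flaresolverr:8191
DIRTYJSON_REST_ENDPOINT=http://dirtyjson:12345
MONGO_HOST=mongo:20600
MISINFO_BACKEND=http://misinfo_server:20000
TWITTER_CONNECTOR=http://twitter_connector:20200
EOF
# it would then be passed along as: docker run --env-file collector.env ...
```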
The web deployment is more elaborate: it is made of a frontend and several microservices.
The main container is `misinfo_server`, an instance of misinfome-backend that acts as the public API and also serves the frontend.
The Twitter microservice `twitter_connector` is responsible for retrieving timelines, users, and tweets.
The credibility microservice `credibility` computes aggregated credibility scores.
The claimreview microservice `claimreview_collector_light` provides access to the data.
Docker-compose
```bash
# without docker-compose installed
docker build . -t martinomensio/misinfome
# run the docker-compose image, which will do the same:
# - mount /var/run/docker.sock to give control of docker
# - mount the .env file that contains all the env variables with secrets
# - CLAIMREVIEW_DATA_PATH: pass the host data path so that the container knows where it is to run docker-compose
# - COMMAND can be start.web or start.collector

# web
docker run -it --name mm35626_misinfome \
  --restart unless-stopped \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v `pwd`/.env:/MisinfoMe/.env \
  -e CLAIMREVIEW_DATA_PATH=`pwd`/claimreview-collector/data \
  -v `pwd`/backend/app-v2:/MisinfoMe/backend/app-v2 \
  -e FRONTEND_V1_PATH=`pwd`/backend/app-v1 \
  -e FRONTEND_V2_PATH=`pwd`/backend/app-v2 \
  -e INTERACTIVE=1 \
  -e COMMAND=start.web \
  martinomensio/misinfome

# collector
docker run -it --name mm35626_misinfome \
  --restart unless-stopped \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v `pwd`/.env:/MisinfoMe/.env \
  -e CLAIMREVIEW_DATA_PATH=`pwd`/claimreview-collector/data \
  -e INTERACTIVE=1 \
  -e COMMAND=start.collector \
  martinomensio/misinfome

# with docker-compose installed
COMMAND=start.web bash scripts/main.sh
COMMAND=start.web.dev bash scripts/main.sh
COMMAND=start.web TWITTER_CONNECTOR_TAG=dev bash scripts/main.sh
COMMAND=start.collector bash scripts/main.sh
COMMAND=start.collector.dev bash scripts/main.sh
COMMAND=start.collector TWITTER_CONNECTOR_TAG=dev bash scripts/main.sh
```
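A COMMAND dispatch like the one above can be structured as follows; this is a hypothetical sketch, and the actual `scripts/main.sh` may use different file names and options.

```shell
#!/usr/bin/env bash
# Hypothetical COMMAND dispatcher; the compose arguments below are
# illustrative, not the real contents of scripts/main.sh.
COMMAND="${COMMAND:-start.web}"
case "$COMMAND" in
  start.web)           args="up -d web" ;;
  start.web.dev)       args="up web" ;;
  start.collector)     args="up -d collector" ;;
  start.collector.dev) args="up collector" ;;
  *) echo "unknown COMMAND: $COMMAND" >&2; exit 1 ;;
esac
echo "would run: docker-compose $args"
```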
TODOs:

The submodules self-update their dependencies to avoid security vulnerabilities and, if successful, they update the main repository to merge the changes, applying the `dev` and `latest` tags to the docker images. The frontend instead is built and stored as an artifact, and deployed with the `scripts/download_frontend.sh` script.
Apache configuration

The configuration lives in `/etc/httpd/sites/external`, under the domain misinfo.me:

```apache
# section :80
## Misinfo Service (mm35626):
# HTTPS https://cwiki.apache.org/confluence/display/httpd/RewriteHTTPToHTTPS
RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^/?(.*) https://%{SERVER_NAME}/$1 [R,L]
ProxyPass / http://127.0.0.1:20000/
ProxyPassReverse / http://127.0.0.1:20000/
AllowEncodedSlashes NoDecode

# section :443
## Misinfo Service (mm35626):
ProxyPass / http://127.0.0.1:20000/
ProxyPassReverse / http://127.0.0.1:20000/
AllowEncodedSlashes NoDecode
```

After editing the configuration, restart Apache:

```bash
sudo systemctl restart httpd.service
```