Cite the data: http://datasets.coronawhy.org/dataset.xhtml?persistentId=doi:10.5072/FK2/FMB3QB
@data{FK2/FMB3QB_2020,
author = {audhiaprilliant/Web-Scraping-Covid19-Kompas-News},
publisher = {COVID-19 Data Hub},
title = "{Web Scraping Covid19 Kompas News}",
year = {2020},
version = {V1},
doi = {10.5072/FK2/FMB3QB},
url = {https://doi.org/10.5072/FK2/FMB3QB}
}
To run the program, follow these steps:
pip install -r requirements.txt
python3 'Web Scraping Covid-19 Kompas News.py'
We have two possibilities:
We could also automate the program by using the crontab scheduler in Linux. Follow the steps below to configure the crontab:
Run the following in your terminal to add a new cron job:
crontab -e
Find the location of the Python 3 interpreter. It should be /usr/bin/python3:
whereis python3
Add the following line to the crontab:
45 16 * * * cd /your path of web scraping script/ && /usr/bin/python3 'Web Scraping Covid-19 Kompas News.py' >> test.out
If the command above is a little confusing, here is what each part means:
45 16 * * * is our schedule. Cron uses the machine's local time instead of UTC, so the program runs at 16:45 every day of every month.
/your path of web scraping script/ must be the directory where you keep the Python script. In my case, it is in 'home/covid19 data'.
/usr/bin/python3 is the path to the Python 3 interpreter.
>> test.out means that the program's output is appended to the file test.out, which is created if it does not exist and serves as a log of the outputs.
Docker is a set of platform as a service (PaaS) products that deliver software in packages called containers. Containers are isolated from one another and bundle their own software, libraries and configuration files; they can communicate with each other through well-defined channels. All containers are run by a single operating system kernel and therefore use fewer resources than virtual machines.
sudo apt-get update
sudo apt-get remove docker docker-engine docker.io
sudo apt install docker.io
sudo systemctl start docker
sudo systemctl enable docker
docker build -t IMAGE_NAME:TAG .
docker build -t web-scraping-covid-kompas:1.0 .
docker images
docker run USERNAME/IMAGE_NAME:TAG
docker run web-scraping-covid-kompas:1.0
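The build and run commands above can also be driven from a Python script, for example when the image should be rebuilt as part of a larger workflow. The following is only a minimal sketch and not part of the original project: the helper names build_command, run_command and build_and_run are hypothetical, and it assumes that the docker CLI is on the PATH and that a Dockerfile exists in the current directory. It uses only the standard library subprocess module.

```python
import subprocess
from typing import List

IMAGE_NAME = "web-scraping-covid-kompas"  # image name used in the commands above
TAG = "1.0"                               # tag used in the commands above


def build_command(image: str, tag: str, context: str = ".") -> List[str]:
    """Return the argument list for `docker build -t IMAGE:TAG CONTEXT`."""
    return ["docker", "build", "-t", f"{image}:{tag}", context]


def run_command(image: str, tag: str) -> List[str]:
    """Return the argument list for `docker run IMAGE:TAG`.

    Unlike `docker build`, `docker run` takes no build context, so there
    is no trailing "." here.
    """
    return ["docker", "run", f"{image}:{tag}"]


def build_and_run(image: str = IMAGE_NAME, tag: str = TAG) -> None:
    """Build the image, then start a container from it.

    check=True raises CalledProcessError if a docker command fails,
    so a failed build is never followed by a run.
    """
    subprocess.run(build_command(image, tag), check=True)
    subprocess.run(run_command(image, tag), check=True)
```

Calling build_and_run() from the directory that contains the Dockerfile would execute the same two steps as the manual commands. Passing the arguments as a list avoids shell quoting problems.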