johndpjr / AgTern


Background scrape task with celery #189

Closed: rutaceae closed 10 months ago

rutaceae commented 10 months ago

Closes #99

The Celery worker is implemented in the `background.py` module, located in the `AgTern/backend/scraping/` directory. To start the worker, run the following command from the project root: `celery -A backend.scraping.background worker -B --loglevel=info`

Celery is configured to use a Redis server as its message broker. The server must be listening on port 6379 (the default Redis port) for the worker to function. `requirements.txt` has been updated to include the Celery Redis package.
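For reference, a minimal sketch of how the Celery app in `background.py` might be wired to the Redis broker described above (the module name, task name, and Redis URL here are assumptions for illustration, not the exact implementation):

```python
from celery import Celery

# Hypothetical sketch: a Celery app pointed at a local Redis broker
# on the default port 6379, as required by this PR.
app = Celery(
    "background",
    broker="redis://localhost:6379/0",
)

@app.task
def run():
    """Placeholder task body; the real task scrapes all companies."""
    ...
```

With a layout like this, the `-A backend.scraping.background` argument in the worker command resolves to this `app` instance.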

The scrape is scheduled to run every day at 12 AM CST. At the moment it scrapes all companies. Celery offers considerable scheduling flexibility, so more specific scraping jobs can be added in the future if needed.
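The daily schedule above would typically be expressed with Celery's beat scheduler, which the `-B` flag in the worker command enables. A hedged sketch (the entry name, task path, and timezone string are assumptions for illustration):

```python
from celery import Celery
from celery.schedules import crontab

app = Celery("background", broker="redis://localhost:6379/0")

# Assumed timezone for "12 AM CST"; Celery defaults to UTC otherwise.
app.conf.timezone = "America/Chicago"

# Hypothetical beat entry: fire the scrape task every day at midnight.
app.conf.beat_schedule = {
    "daily-scrape-all-companies": {
        "task": "backend.scraping.background.run",  # assumed task path
        "schedule": crontab(hour=0, minute=0),
    },
}
```

Additional entries with their own `crontab` schedules could later target more specific scraping jobs.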

While the worker is active, a scrape can be started at any time from a `python3` shell in another terminal:

```python
from background import run
run.delay()
```
johndpjr commented 10 months ago

Successful run
