Problem
We need a way to easily run scraping, crawling, and classification tasks in the cloud using the dashboard app.
Solution
Created a new backend for the dashboard app that allows users to:
Download a model
Upload a model
Run scraping, crawling, and classification using cloud functions
Created cloud functions for:
Crawling
Classification
Scraping
Created a class to manage objects stored in Google Cloud
Created a persistency manager object to store data in the cloud:
Partial data is stored in a Firestore collection
Complete data is stored in BigQuery
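The split between partial and complete data can be illustrated with a minimal sketch. It is not the code in this PR: the class and field names are hypothetical, two in-memory dicts stand in for the Firestore collection and the BigQuery table, and it assumes a record counts as complete once it has both scraped content and a classification label.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Record:
    """Hypothetical record; the field names are illustrative only."""
    url: str
    content: Optional[str] = None  # filled in by scraping
    label: Optional[str] = None    # filled in by classification

    def is_complete(self) -> bool:
        # Assumption: "complete" means scraped and classified.
        return self.content is not None and self.label is not None


@dataclass
class InMemoryPersistencyManager:
    """Dicts stand in for the Firestore collection and the BigQuery table."""
    partial: Dict[str, Record] = field(default_factory=dict)
    complete: Dict[str, Record] = field(default_factory=dict)

    def save(self, record: Record) -> str:
        """Store the record and return the name of the store that received it."""
        if record.is_complete():
            # Drop any earlier partial version before storing the complete one.
            self.partial.pop(record.url, None)
            self.complete[record.url] = record
            return "complete"
        self.partial[record.url] = record
        return "partial"
```

In this sketch a crawled URL is first saved as partial, and saving it again after scraping and classification places it in the complete store.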
Relevant files:
src/c4v/dashboard/app.py and src/c4v/dashboard/main.py : Support for the new backend; the layout now changes depending on the active backend
src/c4v/cloud/gcloud_storage_manager.py : New object to manage objects stored in Google Cloud (models, in this case)
src/cloud_functions/dev-microscope-classify : folder with the classification cloud function
src/cloud_functions/dev-microscope-scrape : folder with the scrape cloud function
src/cloud_functions/dev-microscope-crawl : folder with the crawl cloud function
src/cloud_functions/dev-microscope-move : folder with the move cloud function. It is needed to move data from Firestore to BigQuery once the data is complete.
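As a rough sketch of what the move step does, the function below transfers rows that are ready out of a partial store. It makes several assumptions and is not the cloud function in this PR: `move_complete_rows`, `insert_rows`, and the `required_fields` default are hypothetical names, a dict stands in for the Firestore collection, and a callable stands in for the BigQuery insert.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, object]


def move_complete_rows(
    partial_store: Dict[str, Row],
    insert_rows: Callable[[List[Row]], None],
    required_fields: Sequence[str] = ("content", "label"),
) -> int:
    """Move rows that have every required field and return how many moved."""
    ready = {
        key: row
        for key, row in partial_store.items()
        if all(row.get(name) is not None for name in required_fields)
    }
    if ready:
        # Insert first and delete afterwards, so a failed insert loses no data.
        insert_rows(list(ready.values()))
        for key in ready:
            del partial_store[key]
    return len(ready)
```

Inserting before deleting means a failure partway through can leave a row in both stores but never in neither, which is usually the safer side to err on.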
Further work