Data4Democracy / are-you-fake-news

16 stars 3 forks

Dockerization - Data collection #19

Closed. N2ITN closed this issue 5 years ago

N2ITN commented 5 years ago

Status

Assigned to: (unassigned). Please use this branch: https://github.com/N2ITN/are-you-fake-news/tree/develop-dockerize

Issue

This service will dockerize the collection and cleaning of new data for training. This issue must be completed before model training is possible. The code for scraping news articles with their labels, cleaning them, and storing them in MongoDB already exists in ./docker/gather_data/. Create a second docker-compose file for data collection and model training.
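
As a rough starting point, the second compose file might look something like the sketch below. Only the `./docker/gather_data` path comes from this issue. The file name `docker-compose.data.yml`, the service name `gather_data`, the `MONGO_HOST` and `MONGO_PORT` variables, and the assumption that the existing compose file defines a service called `mongo` are all hypothetical and would need to be checked against the repo.

```yaml
# docker-compose.data.yml (hypothetical file name)
# The version here is an assumption; it should match the existing compose file.
version: "3"

services:
  gather_data:                    # hypothetical service name
    build: ./docker/gather_data   # path taken from the issue text
    depends_on:
      - mongo                     # assumes the existing compose file defines a "mongo" service
    environment:
      MONGO_HOST: mongo           # hypothetical variable; the scraper would need to read it
      MONGO_PORT: "27017"         # MongoDB's default port
```

If the existing file is named `docker-compose.yml`, the two could be combined with `docker-compose -f docker-compose.yml -f docker-compose.data.yml up --build gather_data`, so the data collection service reuses the existing Mongo service rather than defining its own.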

Tasks

so that it will integrate with the existing mongo configuration.