pennpolygons / cv-boilerplate

Open-source boilerplate for computer vision research
MIT License
16 stars, 2 forks

Generating scripts/modules to scrape images/videos #6

Open mchiquier opened 4 years ago

mchiquier commented 4 years ago

Ideally this would be modular, but it would be good to have boilerplate for scraping data from:

- Twitter
- YouTube
- TikTok
- Instagram

Or even just pulling data from Google Drive and pipelining it into a PyTorch DataLoader.
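For the simpler Google Drive case, a minimal sketch of the "pipeline into a dataloader" step might look like the following. It assumes the Drive folder has already been synced or downloaded to a local directory laid out as `root/<class_name>/<image file>`. The names `FolderImageDataset`, `read_bytes`, and `IMAGE_EXTENSIONS` are hypothetical and not part of cv-boilerplate. The class only implements `__len__` and `__getitem__`, which is the map-style dataset protocol that PyTorch's `DataLoader` consumes.

```python
import os
from typing import Callable, List, Tuple

# Assumption: these are the only file types treated as images.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def read_bytes(path: str) -> bytes:
    """Default loader: return the raw file contents without decoding."""
    with open(path, "rb") as f:
        return f.read()


class FolderImageDataset:
    """Map-style dataset over root/<class_name>/<image file>.

    Hypothetical helper for illustration. Each subdirectory of `root`
    is treated as one class; labels are assigned in sorted order.
    """

    def __init__(self, root: str, loader: Callable[[str], object] = read_bytes):
        self.loader = loader
        # Every immediate subdirectory of root is one class.
        self.classes = sorted(
            d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
        )
        # Collect (path, label) pairs, skipping non-image files.
        self.samples: List[Tuple[str, int]] = []
        for label, name in enumerate(self.classes):
            class_dir = os.path.join(root, name)
            for fname in sorted(os.listdir(class_dir)):
                if fname.lower().endswith(IMAGE_EXTENSIONS):
                    self.samples.append((os.path.join(class_dir, fname), label))

    def __len__(self) -> int:
        return len(self.samples)

    def __getitem__(self, index: int):
        path, label = self.samples[index]
        return self.loader(path), label
```

With PyTorch installed, an instance could be handed to `torch.utils.data.DataLoader(dataset, batch_size=8, shuffle=True)`. A real pipeline would probably pass a `loader` that decodes each file into a tensor rather than returning raw bytes.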

goodmattg commented 4 years ago

This is extremely difficult, and arguably its own project. I consider this a "nice to have", but it should be the last thing we do.

ChickenTarm commented 4 years ago

Scraping data also leads into problematic territory, like terms-of-service violations and sites just straight up blocking you for taking their data. And not all sites have nice APIs for scraping, so a general scraper would be hard.
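As a rough illustration of one guard a scraper module could include, here is a sketch that filters candidate URLs against a site's robots.txt using Python's standard `urllib.robotparser`. The helper names, the user agent string `cv-boilerplate-bot`, and the example.com URLs are assumptions made up for the example. Note that robots.txt only expresses crawl rules; it does not replace reading a site's terms of service.

```python
from typing import Iterable, List
from urllib.robotparser import RobotFileParser


def build_robots_checker(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(
    parser: RobotFileParser, user_agent: str, urls: Iterable[str]
) -> List[str]:
    """Keep only the URLs the given user agent is allowed to fetch."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt content, for illustration only.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

parser = build_robots_checker(EXAMPLE_ROBOTS)
candidates = [
    "https://example.com/public/cat.jpg",
    "https://example.com/private/dog.jpg",
]
permitted = allowed_urls(parser, "cv-boilerplate-bot", candidates)
# Seconds to wait between requests, or None if the site does not specify one.
delay = parser.crawl_delay("cv-boilerplate-bot")
```

A scraper could sleep for `delay` seconds between requests to reduce the chance of being blocked, though sites may still rate-limit or ban clients regardless of what robots.txt says.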