googleinterns / play-movies-2020-intern

Apache License 2.0

Scrape Assets from OMDB #41

Closed jackstenglein closed 4 years ago

jackstenglein commented 4 years ago

Scrapes assets from OMDB by first scraping a list of asset IDs from a provided IMDB URL, then querying the OMDB API once per ID. I'll be scraping the IMDB top 250 movies and top 250 TV shows, but other IMDB pages should be possible to scrape as well. Also changes the in-memory database to a file-backed database, since I didn't want to query OMDB 500 times every time I restarted the server.
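
For readers skimming the thread, here is a rough Python sketch of the flow described above. It is not the code in this PR: the function names, the `assets` table, and the `/title/tt...` link pattern are all assumptions made for illustration. The only external detail relied on is that OMDB accepts an IMDB ID in the `i` query parameter and a key in `apikey`. Results are stored in a SQLite file so that a restart does not trigger the same 500 requests again.

```python
import json
import re
import sqlite3
import urllib.parse
import urllib.request

# Assumption: title links on the IMDB page look like /title/tt0111161/.
IMDB_ID_PATTERN = re.compile(r"/title/(tt\d+)")


def extract_imdb_ids(html):
    """Return the unique IMDB IDs found in the page, in first-seen order."""
    seen = {}
    for imdb_id in IMDB_ID_PATTERN.findall(html):
        seen.setdefault(imdb_id, None)
    return list(seen)


def build_omdb_url(imdb_id, api_key):
    """Build an OMDB lookup URL for a single IMDB ID."""
    query = urllib.parse.urlencode({"i": imdb_id, "apikey": api_key})
    return "https://www.omdbapi.com/?" + query


def fetch_json(url):
    """Default fetcher: perform a GET request and decode the JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def open_store(path):
    """Open (or create) the database that persists scraped assets."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS assets ("
        "imdb_id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def scrape_assets(html, api_key, conn, fetch=fetch_json):
    """Query OMDB for every ID not already stored; return the number fetched."""
    fetched = 0
    for imdb_id in extract_imdb_ids(html):
        row = conn.execute(
            "SELECT 1 FROM assets WHERE imdb_id = ?", (imdb_id,)
        ).fetchone()
        if row is not None:
            continue  # Already stored, so skip the network call.
        payload = fetch(build_omdb_url(imdb_id, api_key))
        conn.execute(
            "INSERT INTO assets (imdb_id, payload) VALUES (?, ?)",
            (imdb_id, json.dumps(payload)),
        )
        fetched += 1
    conn.commit()
    return fetched
```

Passing a file path such as `open_store("assets.db")` keeps the data across restarts, while `":memory:"` gives the old throwaway behavior. The `fetch` parameter exists only so the network call can be swapped out when experimenting.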

jackstenglein commented 4 years ago

I am just manually calling the scrape endpoint using Postman to get the sample data.

TianyiMu commented 4 years ago

> I am just manually calling the scrape endpoint using Postman to get the sample data.

Sounds good