Closed: mindyh closed this issue 8 years ago
Currently looking into https://www.crummy.com/software/BeautifulSoup/bs4/doc/
Nope, nvm, doesn't have fav or views data.
Okay we now have a scraper. Yay!
You can use it to scrape a search results page like so:
./scraper.py -r http://www.deviantart.com/browse/all/digitalart/paintings/people/portraits/
This will add every single image in the search results to our database, along with its favorites, view count, medium, and the URL of the actual image.
You can also scrape a single image page and add only data for that image to our database, like so:
./scraper.py -s http://www.deviantart.com/art/Reds-606604260
Finally, you can update the scraped data for all images currently in the database like so:
./scraper.py -u
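For reference, here is a minimal sketch of how the three modes above could be wired up with argparse. This is not the actual scraper.py source: the function names, the `dest` names, and the choice to make the flags mutually exclusive are all assumptions for illustration. Only the `-r`, `-s`, and `-u` flags come from the commands above.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the -r / -s / -u flags shown above."""
    parser = argparse.ArgumentParser(prog="scraper.py")
    # Assumption: exactly one mode is chosen per run.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-r", metavar="URL", dest="results_url",
                       help="scrape every image on a search results page")
    group.add_argument("-s", metavar="URL", dest="single_url",
                       help="scrape a single image page")
    group.add_argument("-u", action="store_true", dest="update",
                       help="refresh data for all images already in the database")
    return parser


def describe(args):
    """Return (mode, url) so the caller can dispatch to the right routine."""
    if args.results_url:
        return ("results", args.results_url)
    if args.single_url:
        return ("single", args.single_url)
    return ("update", None)


if __name__ == "__main__":
    print(describe(build_parser().parse_args()))
```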
Note that this doesn't automatically run the feature extractor over the images. You'd still have to call ./extractor.py to actually run the image analysis, which goes into the database, finds the newly added rows, downloads the images, runs the feature extractors, and fills out the rest of the columns.
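Roughly, that extractor pass looks like the sketch below. It is only an illustration of the find-rows, download, extract, fill-columns loop, not the real extractor.py: the `images` table, its columns, the `mean_byte` stand-in feature, and the `download` callable are all hypothetical.

```python
import sqlite3


def mean_byte(data):
    """Stand-in feature (hypothetical): average byte value of the raw file."""
    return sum(data) / len(data) if data else 0.0


def extract_pending(conn, download):
    """Fill the feature column for rows the scraper added but nobody analyzed yet.

    `download` is any callable mapping an image URL to bytes; passing it in
    keeps the sketch runnable without network access.
    """
    pending = conn.execute(
        "SELECT id, image_url FROM images WHERE mean_byte IS NULL"
    ).fetchall()
    for image_id, url in pending:
        data = download(url)
        conn.execute(
            "UPDATE images SET mean_byte = ? WHERE id = ?",
            (mean_byte(data), image_id),
        )
    conn.commit()
    return len(pending)


def make_demo_db():
    """In-memory database with an assumed schema, for trying the loop out."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images (id INTEGER PRIMARY KEY, image_url TEXT, "
        "favorites INTEGER, views INTEGER, mean_byte REAL)"
    )
    return conn
```

Running it a second time should touch nothing, since only rows with an empty feature column are selected.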
Wow, so fast. How long does it take to scrape an image? And just making sure, you put a delay in, right? I don't want to get banned >.<
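Whether scraper.py already waits between requests isn't stated in this thread. If it doesn't, one simple way to add a delay is a small throttle that enforces a minimum gap between requests. The sketch below is an assumption about how that could look; the `Throttle` class and its parameters are hypothetical, and the right interval for DeviantArt is not something this thread establishes.

```python
import time


class Throttle:
    """Enforce a minimum number of seconds between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock    # injectable so the behaviour can be tested
        self._sleep = sleep
        self._last = None      # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                # Too soon since the last request: sleep off the difference.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` right before each page fetch would space requests at least `min_interval` seconds apart, without adding any delay when the scraper is already slower than that.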
(Sent by email in reply to Ben-han Sung's comment of Sun, May 22, 2016, 1:00 AM, quoted above.)