acroutworst closed this issue 7 years ago
Hi Adam,
Unfortunately, this is the expected behavior. There are over 80,000 full-size portraits to download, so the script can take up to 10 hours to run. I don't believe there is currently a way to download a copy of the dataset directly, because the dataset for this project needs to be sorted into buckets by genre. If you need the script to run faster, you could try changing the number of pages it scrapes for each genre (look at the comments in the code), or download a different version of the dataset and find a way to sort it by genre.
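To make the page-limit suggestion concrete, here is a rough sketch of the idea rather than the script's actual code. The names `PAGES_PER_GENRE`, `plan_pages`, and `estimate_hours` are hypothetical, and so are the page counts. The 0.45 seconds per image is only back-of-the-envelope arithmetic from the figures above (10 hours for roughly 80,000 images).

```python
from typing import Dict, List, Tuple

# Hypothetical page counts per genre, for illustration only.
# The real values are in the comments of the scraping script.
PAGES_PER_GENRE: Dict[str, int] = {"portrait": 250, "landscape": 250, "abstract": 250}


def plan_pages(pages_per_genre: Dict[str, int], max_pages: int) -> List[Tuple[str, int]]:
    """Return (genre, page_number) pairs, capped at max_pages per genre."""
    plan = []
    for genre, available in pages_per_genre.items():
        # Never request more pages than the genre has, or than the cap allows.
        for page in range(1, min(available, max_pages) + 1):
            plan.append((genre, page))
    return plan


def estimate_hours(n_images: int, seconds_per_image: float = 0.45) -> float:
    """Rough runtime estimate; 0.45 s/image assumes 80,000 images in 10 hours."""
    return n_images * seconds_per_image / 3600


if __name__ == "__main__":
    capped = plan_pages(PAGES_PER_GENRE, max_pages=20)
    print(f"{len(capped)} pages to scrape instead of {sum(PAGES_PER_GENRE.values())}")
    print(f"Full run estimate: {estimate_hours(80_000):.1f} hours")
```

Capping the pages per genre cuts the runtime roughly in proportion, at the cost of a smaller dataset for each genre.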
Yes, I will try changing the number of pages the script scrapes for each genre. Thanks for the suggestion.
Could we maybe host the dataset so we don't destroy Wikiart's servers? Maybe a torrent?
Hello there,
I have been running scape_wiki.py to scrape Wikiart. It has been well over an hour and the script is still going strong! Is it expected for this process to run this long?
By the way, here is a preview of what I am seeing:
Cheers, Adam