rkjones4 / GANGogh

Using GANs to create Art

Images Zip on Google Drive #9

Open lindseycarr1977 opened 6 years ago

lindseycarr1977 commented 6 years ago

I'm trying to get the image files from Google Drive. The zip downloads, but it then says 'invalid file'. I tried twice, on two different Windows computers. I have a slow connection, so I didn't want to try the torrent only to hit the same issue.

For anyone having the same issue: I tried scrape_wiki.py, but the URLs have changed since it was written. I managed to get them by finding the JSON files associated with pagination.

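Replaced soupit like so:

# Needs these imports at the top of scrape_wiki.py:
import json
import urllib.request

def soupit(j, genre):
    try:
        # The ?json=2&page=N query returns the paginated listing as JSON.
        url = "https://www.wikiart.org/en/paintings-by-genre/" + genre + "?json=2&page=" + str(j)

        jsonP = urllib.request.urlopen(url)
        data = json.loads(jsonP.read())
        urls = []

        # Each entry in "Paintings" carries its image URL under "image".
        for artItem in data["Paintings"]:
            urls.append(artItem["image"])

        return urls
    except Exception as e:
        print('Failed to find the following genre page combo: ' + genre + str(j))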
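
To show how the per-page function above might be driven across a whole genre, here is a minimal sketch. It assumes the endpoint and the "Paintings"/"image" keys shown above, that pages are numbered from 1, and that an exhausted listing comes back with a missing or empty "Paintings" entry; none of that is confirmed beyond the snippet itself. The names `fetch_page_urls`, `collect_genre_urls`, and `max_pages` are hypothetical and are not part of scrape_wiki.py.

```python
import json
import urllib.request


def fetch_page_urls(page, genre):
    """Return the image URLs on one JSON page, or [] if it cannot be read."""
    url = ("https://www.wikiart.org/en/paintings-by-genre/"
           + genre + "?json=2&page=" + str(page))
    try:
        with urllib.request.urlopen(url) as resp:
            data = json.loads(resp.read())
        # Assumption: a page past the end has no (or an empty) "Paintings" list.
        paintings = data.get("Paintings") or []
        return [item["image"] for item in paintings]
    except Exception:
        print('Failed to find the following genre page combo: ' + genre + str(page))
        return []


def collect_genre_urls(genre, max_pages=50, fetch=fetch_page_urls):
    """Walk pages 1..max_pages and stop at the first page with no images.

    max_pages is a hypothetical safety cap; fetch can be swapped for testing.
    """
    urls = []
    for page in range(1, max_pages + 1):
        page_urls = fetch(page, genre)
        if not page_urls:
            break
        urls.extend(page_urls)
    return urls
```

Calling `collect_genre_urls("portrait")` would then gather every page for that genre, where "portrait" is only an illustrative genre slug. Returning an empty list on failure, rather than nothing, lets the loop treat a failed page and an exhausted listing the same way.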

vandan-revanur commented 4 years ago

@lindseycarr1977 Thanks for this code snippet, I was facing the same error and this helped me. On a general note, is there a way to use this kind of JSON method for other image dataset websites? Is there a general format that we can add after the URL to access the JSON data?