Mauville / MedCLIP

Medical image captioning using OpenAI's CLIP

Change all URLs to new CDN #7

Open Mauville opened 2 months ago

Mauville commented 2 months ago

MedPix has moved all of its content to a CDN.

Old links looked like https://medpix.nlm.nih.gov/images/full/synpic52419.jpg

Now they look like https://d168r5mdg5gtkq.cloudfront.net/medpix/img/full/synpic17159.jpg

The dataset links need to be updated. A simple rename of the URL prefix appears to work, but if the CDN address keeps changing, this could become a recurring problem.
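For the dataset links themselves, the rename could be sketched roughly as below. This is only an illustration, not code from the repository: `to_cdn_url`, `CDN_BASE`, and `OLD_HOST` are hypothetical names, and the sketch assumes the file name is the only part of the old link that carries over to the CDN. Keeping the CDN prefix in one constant means a future CDN change would need a single edit.

```python
from urllib.parse import urlsplit

# Assumption: the CDN prefix seen in the new links; kept in one place so it
# can be updated if the CDN changes again.
CDN_BASE = "https://d168r5mdg5gtkq.cloudfront.net/medpix/img/full"
# Host used by the old-style links.
OLD_HOST = "medpix.nlm.nih.gov"


def to_cdn_url(url: str, cdn_base: str = CDN_BASE) -> str:
    """Map an old MedPix image URL to the CDN; return other URLs unchanged."""
    parts = urlsplit(url)
    if parts.netloc != OLD_HOST:
        return url
    # Only the file name (e.g. synpic52419.jpg) is reused in the new link.
    filename = parts.path.rsplit("/", 1)[-1]
    return f"{cdn_base.rstrip('/')}/{filename}"


old = "https://medpix.nlm.nih.gov/images/full/synpic52419.jpg"
print(to_cdn_url(old))
```

Links that are already on the CDN, or on any other host, pass through untouched, so the function can be applied to a whole column of links without checking each one first.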

A simple fix for the scraper is to add the following lines:

import urllib.request

# Keep only the file name and rebuild the URL against the new CDN.
filename = url.split("/")[-1]
url = f"https://d168r5mdg5gtkq.cloudfront.net/medpix/img/full/{filename}"
urllib.request.urlretrieve(url, f"/content/drive/Shareddrives/DeepLearning/data/output/{filename}")

Qasim-Latrobe commented 2 months ago

Thanks for the prompt response and guidance. Yes, I am now able to download the MedPix dataset 🙂

SID-6921 commented 3 weeks ago

How can I get the dataset? Please help.