Closed askerlee closed 2 years ago
Unfortunately the full SDSS dataset is huge, so I can't host the images myself (although I will look into a torrent-based solution...). The script worked for me last time I used it a couple months ago, so it's odd that it has stopped working. Are you trying to download the SDSS images or the PROBES images?
I see. I'm downloading the PROBES images. So the SDSS images are even larger? :sweat_smile: Would SDSS be the main dataset for the model training? Thanks.
Yes, the full SDSS dataset is around a TB of data in total, with 306,000 galaxies. I use that data to perform the statistical analysis in the paper, and use the PROBES dataset to produce the pretty galaxies you see in the figures (as those galaxies are large and particularly well resolved, with no obscuring foreground objects).
BTW I just checked and you should expect ~2000 PROBES galaxies total, so looks like the script is still working!
I see. This is very helpful info. In preprocess.py, I saw there are three channels G, R, Z that are combined into one image. Could you teach me how to understand these channels, i.e., how to convert them into RGB? Thanks.
Here we scrape from the DESI Legacy Survey DR9, check out the write up here: https://www.legacysurvey.org/dr9/description/.
They use a photometric system (https://en.wikipedia.org/wiki/Photometric_system), and you can think of g as "green", r as "red", and z as "near infrared". To make RGB imagery, assign the bands in order of wavelength, shortest to longest: g -> blue, r -> green, and z -> red.
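As a rough illustration of that mapping, here is a minimal sketch. It assumes each galaxy is stored as a float array of shape (3, H, W) with the channels in g, r, z order; that layout, the function name `grz_to_rgb`, and the percentile-based scaling are assumptions for the example, not necessarily what preprocess.py or the Legacy Survey viewer does.

```python
import numpy as np


def grz_to_rgb(grz, stretch=99.5):
    """Map a (3, H, W) array in g, r, z order (assumed layout) to uint8 RGB."""
    g, r, z = grz
    # Longest wavelength goes to red: z -> R, r -> G, g -> B.
    rgb = np.stack([z, r, g], axis=-1).astype(np.float64)
    # Simple per-channel percentile scaling (an illustrative choice only).
    lo = np.percentile(rgb, 100 - stretch, axis=(0, 1))
    hi = np.percentile(rgb, stretch, axis=(0, 1))
    # Avoid dividing by zero when a channel is constant.
    scale = np.where(hi > lo, hi - lo, 1.0)
    rgb = np.clip((rgb - lo) / scale, 0.0, 1.0)
    return (rgb * 255).astype(np.uint8)
```

The result has shape (H, W, 3), so it can be passed straight to something like matplotlib's `imshow`. Faint outer regions of a galaxy may look better with a nonlinear stretch (e.g. arcsinh) instead of the linear scaling used here.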
I see. Thank you! 😄
I tried to use your download script to download the images, but found that many images are empty (0 bytes), and manually visiting the URLs returned "500 server error" or "301 Moved permanently". Eventually only around 6k images were downloaded successfully. (They were converted to around 2k npy files.) So I wonder, could you please share the images you downloaded? I guess if every user downloads their own copy of the images, it also adds load to the server.
Thank you so much!