coolneighbors / unWISE-verse

Integrated flipbook / panoptes pipeline
MIT License

investigate disk space usage of WiseView downloads #44

Closed by ameisner 2 years ago

ameisner commented 2 years ago

Do we need to be worried about how much disk space would be used up by the files we download from WiseView in a large run of the full pipeline, say on the order of 10k-100k sky locations?

In my CSSFP proposal, I wrote "The full set of 120,000 image blinks will be approximately 1.8 GB in size", but I have no idea where I got that number. If the total size for ~100k image blinks is only a couple of GB then I think we just declare that to be a trivially small amount of data to work with and never worry about it.

Probably this has some relation to what 'scale factor' we apply to the raw WiseView downloads and whether we can figure out how to have Zooniverse do that rescaling for us.

Do we need some sort of "buffering" that operates in "chunks" to download from WiseView, upload to Zooniverse, then delete local copies of files before moving on to the next chunk?
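One way the chunked "download, upload, delete" idea could look is sketched below. This is only an illustration under assumptions: `download` and `upload` are hypothetical callables standing in for the WiseView and Zooniverse steps, and `chunk_size` is an arbitrary placeholder, none of which come from the actual pipeline.

```python
import shutil
import tempfile
from itertools import islice
from pathlib import Path


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def process_in_chunks(targets, download, upload, chunk_size=1000):
    """Download, upload, then delete local files one chunk at a time.

    `download(target, directory)` must return the local path it wrote and
    `upload(paths)` must push those files; both are hypothetical stand-ins,
    not real pipeline functions. Returns the number of targets processed.
    """
    done = 0
    for chunk in chunked(targets, chunk_size):
        # Each chunk gets its own scratch directory.
        workdir = Path(tempfile.mkdtemp(prefix="wv_chunk_"))
        try:
            paths = [download(t, workdir) for t in chunk]
            upload(paths)
            done += len(chunk)
        finally:
            # Local copies are removed even if the upload raised.
            shutil.rmtree(workdir, ignore_errors=True)
    return done
```

With this shape, peak local disk usage is bounded by roughly one chunk's worth of files rather than the full run.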

ameisner commented 2 years ago

A somewhat related point is the local directory structure of the WiseView files we download. It's bad practice to have a single directory with more than ~10k files in it. Do we need to define some directory structure with subdirectories to avoid this?
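A minimal sketch of one possible subdirectory scheme, assuming each downloaded file has a running integer index. The function name `sharded_path` and the default of 1000 files per directory are illustrative choices, not anything defined by the pipeline.

```python
from pathlib import Path


def sharded_path(root, index, filename, files_per_dir=1000):
    """Return root/<bucket>/<filename>.

    Consecutive indices are grouped into buckets of `files_per_dir`, so no
    single directory holds more than that many files.
    """
    bucket = index // files_per_dir
    return Path(root) / f"{bucket:05d}" / filename


# Example: the 98,742nd file (index 98741) lands in pngs/00098/.
example = sharded_path("pngs", 98741, "frame.png")
```

The caller would still need to create the bucket directory (for example with `example.parent.mkdir(parents=True, exist_ok=True)`) before writing the file.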

ameisner commented 2 years ago

Noah reports ~27 kB per WiseView PNG (no rescaling), which implies ~270 kB per WiseView GIF with no rescaling, assuming 10 frames. (This depends on the windowing requested and also the sky location.) These numbers increase to ~40 kB and ~400 kB, respectively, for the case of 3x rescaling.
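For scale, these per-file figures can be extrapolated to the 120,000 targets mentioned at the top of the thread. The snippet below is just that arithmetic, assuming decimal units (1 kB = 1000 bytes) and exactly 10 frames per target; both are assumptions, not measured values.

```python
KB = 1000  # bytes per kB (decimal units assumed)


def total_gb(kb_per_png, frames_per_target, n_targets):
    """Total size in GB for n_targets, each with frames_per_target PNGs."""
    return kb_per_png * KB * frames_per_target * n_targets / 1e9


no_rescale_gb = total_gb(27, 10, 120_000)  # 27 kB per PNG, no rescaling
rescale_3x_gb = total_gb(40, 10, 120_000)  # 40 kB per PNG, 3x rescaling
print(f"{no_rescale_gb:.1f} GB, {rescale_3x_gb:.1f} GB")
```

Under those assumptions the totals come out to roughly 32 GB without rescaling and 48 GB with 3x rescaling.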

ameisner commented 2 years ago

For the successful 10k test that ended on 2022 June 24, I get:

(zpipe) ameisner@cori04:/global/cscratch1/sd/ameisner/ZPipe_104> du -hs pngs/
418M    pngs/
(zpipe) ameisner@cori04:/global/cscratch1/sd/ameisner/ZPipe_104> ls pngs |wc -l
98742

So this would be ~5 GB for 120k targets. This used the default ZPipe GUI parameters: scale factor = 1, FOV = 120 arcsec.
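The ~5 GB figure can be reproduced from the `du` and `ls` output with a short calculation. This sketch assumes that the "10k test" covered exactly 10,000 targets and that `du -h` reports `M` in units of MiB; note also that `du` measures allocated disk blocks, which can differ from the sum of the file sizes.

```python
MIB = 1024 ** 2  # bytes per MiB (du -h unit, assumed)

measured_mib = 418      # from `du -hs pngs/`
n_files = 98_742        # from `ls pngs | wc -l`
n_test_targets = 10_000 # assumed size of the "10k test"
n_full_targets = 120_000

frames_per_target = n_files / n_test_targets
bytes_per_file = measured_mib * MIB / n_files
full_run_gib = measured_mib * n_full_targets / n_test_targets / 1024

print(f"{frames_per_target:.2f} PNGs per target")
print(f"{bytes_per_file:.0f} bytes per PNG on average")
print(f"{full_run_gib:.2f} GiB for {n_full_targets} targets")
```

The implied average of roughly 4.4 kB per PNG is well below the ~27 kB per PNG quoted earlier in the thread; the thread does not explain the difference.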