diux-dev / cluster

train on AWS

add documentation for file structure #74

Closed dreamflasher closed 5 years ago

dreamflasher commented 5 years ago

https://github.com/diux-dev/cluster/tree/master/pytorch#data-preparation describes a certain file structure, but with the data no longer available on S3 (https://github.com/diux-dev/cluster/issues/73), it's unclear what is supposed to be in which folder.

Which files are supposed to be in imagenet-sz?

I assume it's about the file size. Does 160 mean all files smaller than 160, and 320 all the rest? Or is some preprocessing of the original ImageNet files necessary?

yaroslavvb commented 5 years ago

Curious, how did you find this repo? It's actually out of date; the "ImageNet in 18 minutes" code has moved to https://github.com/cybertronai/imagenet18

Files are now baked into an AMI

dreamflasher commented 5 years ago

Well, you linked to this repo from here: https://github.com/cybertronai/imagenet18. But you have removed that part now, so there's no longer any documentation on how to run this without that image or on a machine outside of Amazon.

yaroslavvb commented 5 years ago

Ah yes, I discovered it after asking. It looks like DIUX has restricted access to those files so it's no longer possible to train outside of Amazon.

yaroslavvb commented 5 years ago

It looks like the AMI is still accessible, though. I copied it just in case. In the Virginia region:

original AMI: ami-0644a2a9fcebda350
copied AMI: ami-0b7c8e6f44e9889b1
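
For anyone who wants to check whether either image is still visible from their own account, a minimal sketch follows. It assumes boto3 is installed, AWS credentials are configured, and that "Virginia region" means us-east-1; the helper names `describe_request` and `visible_amis` are made up for illustration and are not part of this repo.

```python
from typing import Any, Dict, List

# AMI IDs quoted above. The region name is an assumption:
# "Virginia region" is taken to mean us-east-1.
REGION = "us-east-1"
AMI_IDS = ["ami-0644a2a9fcebda350", "ami-0b7c8e6f44e9889b1"]


def describe_request(ami_ids: List[str]) -> Dict[str, Any]:
    """Build the keyword arguments for the EC2 describe_images call."""
    return {"ImageIds": list(ami_ids)}


def visible_amis(ami_ids: List[str], region: str = REGION) -> List[str]:
    """Return the subset of ami_ids that the current credentials can see."""
    # Imported lazily so this module loads even where boto3 is absent.
    import boto3
    from botocore.exceptions import ClientError

    ec2 = boto3.client("ec2", region_name=region)
    found: List[str] = []
    for ami_id in ami_ids:
        try:
            resp = ec2.describe_images(**describe_request([ami_id]))
        except ClientError:
            # Typically raised when the image is gone or not shared.
            # This sketch also swallows credential errors, so treat an
            # empty result with some suspicion.
            continue
        found.extend(image["ImageId"] for image in resp["Images"])
    return found
```

Calling `visible_amis(AMI_IDS)` returns whichever of the two IDs the account can still describe; an empty list suggests that neither image is reachable, or that the credentials or region are wrong.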

dreamflasher commented 5 years ago

Ah okay, too bad that the files are offline now. Thanks for copying the AMI!