shahsohil / DCC

This repository contains the source code and data for reproducing the results of the Deep Continuous Clustering paper
MIT License
208 stars 53 forks

Empty data folder on clone or download as zip #12

Closed LemonPi closed 5 years ago

LemonPi commented 5 years ago

The MNIST data files obtained by cloning the repository or downloading it as a zip are effectively empty: each file is only around 130 bytes, and its content is plain text like this:

version https://git-lfs.github.com/spec/v1
oid sha256:e60446c5fac6df3e3f37769ca5b51669a2da7d6a3a6abc9fc9a8cc2b4244a18d
size 26650347

So each of these is a Git LFS pointer file rather than the actual file. It would be best to provide an alternative source for the whole MNIST data set, including the checkpoints, in addition to the .mat files that already exist.
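A quick way to confirm this diagnosis is to check whether a file starts with the Git LFS pointer header shown above. The following is a minimal sketch, not part of the repository: the helper names `read_lfs_pointer` and `find_lfs_pointers` are hypothetical, and it assumes pointer files follow the three-line `version` / `oid` / `size` layout quoted in this issue.

```python
from pathlib import Path

# First line of every pointer file, as quoted in this issue.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def read_lfs_pointer(path):
    """Return the pointer's fields as a dict, or None if `path` is not a pointer.

    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as f:
        head = f.read(1024)  # pointer files are tiny, so 1 KiB is plenty
    if not head.startswith(LFS_HEADER):
        return None
    fields = {}
    for line in head.decode("utf-8", errors="replace").splitlines():
        key, _, value = line.partition(" ")
        if key:
            fields[key] = value
    return fields


def find_lfs_pointers(root):
    """List files under `root` that are still unresolved LFS pointers."""
    return [
        p for p in Path(root).rglob("*")
        if p.is_file() and read_lfs_pointer(p) is not None
    ]
```

If `find_lfs_pointers("data")` returns a non-empty list, the real files were never fetched, and the `size` field of each pointer gives the expected size in bytes of the missing file.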

For anyone searching in the future who encounters an error like

    magic_number = pickle_module.load(f)
    _pickle.UnpicklingError: invalid load key, 'v'.

This happens because the checkpoint is not a real checkpoint: it is one of the Git LFS pointer files described above.
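The `'v'` in the message is the first byte of the pointer text (`version ...`), which is not a valid pickle opcode. A minimal sketch that reproduces the same message with the standard `pickle` module alone, assuming (as the traceback line suggests) that the checkpoint loader hands the file to pickle; the helper name `try_unpickle` is hypothetical:

```python
import pickle

# The exact pointer content quoted earlier in this issue.
pointer_text = (
    b"version https://git-lfs.github.com/spec/v1\n"
    b"oid sha256:e60446c5fac6df3e3f37769ca5b51669a2da7d6a3a6abc9fc9a8cc2b4244a18d\n"
    b"size 26650347\n"
)


def try_unpickle(data):
    """Return the UnpicklingError message for `data`, or None if it loads."""
    try:
        pickle.loads(data)
    except pickle.UnpicklingError as exc:
        return str(exc)
    return None


message = try_unpickle(pointer_text)
print(message)
```

Any text file fed to the unpickler fails this way, with the reported load key being the file's first character, so the key itself is a useful hint about what the file really contains.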

zhuowei commented 5 years ago

@LemonPi It looks like this repository uses git-lfs. You might want to set up git-lfs and try cloning again.

If that works, then I guess that this needs to be added to the README.

LemonPi commented 5 years ago

Even so, I think it makes sense not to track generated files (so the entire data directory should be ignored).

shahsohil commented 5 years ago

> @LemonPi It looks like this repository uses git-lfs. You might want to set up git-lfs and try cloning again.
>
> If that works, then I guess that this needs to be added to the README.

Yes, the repository was set up using git-lfs. However, you should now ignore those files and instead download all the relevant files directly from here