qxcv closed this issue 8 years ago
In the test sequences: `.seqs` gives the frames in each sequence, and contains 39688 floats. This takes up maybe 300KB. `.data` is a 1x396316 struct array, where each entry has a lot of fields. Matlab says it takes up 532MB.

In the train data:
`.pairs` is a 1x170850 struct array, where each entry has three scalar fields: `fst`, `snd` and `scale`. This should take around 4MB (we don't care about it). As in `test_seqs`, `.data` is huge: this time it's 1x1712255 and takes up a whopping 2.3GB.

It seems that in total, my data is "only" 2.8GB, but Matlab's shitty "compression" is doubling the size.
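Most of that bloat comes from paying per-entry container overhead instead of storing each field as one contiguous typed array. A rough Python/NumPy analogy of the same pattern (not the actual Matlab code; field names are taken from this issue, sizes are illustrative):

```python
import sys
import numpy as np

n = 100_000  # stand-in for the millions of frames

# "Struct array" style: one dict per frame, each holding a few scalars.
# Every entry pays per-object overhead on top of the actual data.
records = [{"frame_no": i, "subject": 1, "camera": 2, "frame_time": 0.04 * i}
           for i in range(n)]
per_entry = sys.getsizeof(records[0])  # container overhead alone, tens to hundreds of bytes

# Columnar style: one contiguous typed array per field, 4 bytes per scalar.
frame_no = np.arange(n, dtype=np.int32)
frame_time = (0.04 * np.arange(n)).astype(np.float32)
columnar_bytes = frame_no.nbytes + frame_time.nbytes

print(f"per-entry container overhead: {per_entry} bytes")
print(f"columnar storage: {columnar_bytes / n:.1f} bytes per entry (2 fields)")
```

The exact numbers differ between CPython dicts and Matlab structs, but the ratio is the point: millions of tiny records cost far more than a handful of flat numeric arrays holding the same values.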
Some ideas for saving space in `.data`:

- Drop `.pose_path`.
- Drop `.video_path`; reconstruct it from the saved action, subject and camera.
- `frame_no`, `frame_time`, `subject` and `camera` can all become `int32`s.
- `joint_locs` is already saved as singles, but further space might be saved by removing joints which aren't strictly necessary (e.g. going down to a 16-joint model).

It looks like structs take a huge HDF5 encoding overhead with Matlab's v7.3 save format, so I may have to modify the save process slightly so that big fields are saved in their own variables (of the appropriate numeric type).
At the moment it's 7.2GB on disk (!!), and that's with Matlab's compression. A cursory calculation suggests that joint data alone will take up 1.8GB on disk (when all sequences are summed together). Clearly, I need a better way of loading data than what I've been using. Some sort of lazy loading might help.