cannot open the pkl file #1

Open yaoli opened 8 years ago

yaoli commented 8 years ago

  File "sample.py", line 1, in <module>
    import cPickle as pkl
  File "sample.py", line 65, in main
    tasks = pkl.load(fin)
UnpicklingError: invalid load key, 'v'.

ffmpbgrnn commented 8 years ago

Hi Li, can you please check the MD5 checksums of the pickle files in the datasets directory? They should be:

08905c5f08f43b86abc65644a9955b2f  MED_easy.pkl
cb16a7768b82310388a8744e07ab0943  MED_hard.pkl
dd3523207f7d17d4e5a10f9c392903f9  MPII_easy.pkl
be712f8b8eaa278f9ac13fbea904fbdf  MPII_hard.pkl
fb68111257defb82fc761471a9c79a18  TACoS_easy.pkl
87fe81234f8f57679a399df1aa89a2d1  TACoS_hard.pkl
ec486bb522a31f0d3d091303184ef7dc  meta_tasks.pkl

You can check them by running md5sum * in that directory.
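Where md5sum is not available, the same check can be sketched in Python. This is only an illustrative sketch, not part of the repository: the helper names md5_of and find_mismatches are made up here, and only two of the checksums listed above are copied into the table, so the rest would need to be added by hand.

```python
import hashlib
import os

# Expected digests copied from the list above (extend with the remaining files).
EXPECTED_MD5 = {
    "MED_easy.pkl": "08905c5f08f43b86abc65644a9955b2f",
    "meta_tasks.pkl": "ec486bb522a31f0d3d091303184ef7dc",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fin:
        for chunk in iter(lambda: fin.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(directory, expected=EXPECTED_MD5):
    """Return the names of files that are missing or whose MD5 differs."""
    bad = []
    for name, want in sorted(expected.items()):
        path = os.path.join(directory, name)
        if not os.path.isfile(path) or md5_of(path) != want:
            bad.append(name)
    return bad


if __name__ == "__main__":
    # "datasets" is the directory mentioned in this thread.
    print(find_mismatches("datasets") or "all checksums match")
```

An empty result from find_mismatches means every listed file is present and matches its checksum.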

I just downloaded the dataset again and the sample.py runs smoothly.

Did you install git-lfs and initialize it? If not, install it from https://github.com/github/git-lfs and then initialize it with git lfs install. After that, a normal git clone will download the pkl files.

yaoli commented 8 years ago

I didn't install git-lfs, and I'm not sure what it actually does. Why does loading the pkl files fail without it? I'm using a Linux machine.

ffmpbgrnn commented 8 years ago

git-lfs is used for large files. The pkl files are not stored in the repository itself; they are tracked by git-lfs, so you need git-lfs to download them. If you don't want to use git-lfs, you can download them manually: for example, go to https://github.com/ffmpbgrnn/VideoQA/blob/master/datasets/MED_easy.pkl and click Raw.

ffmpbgrnn commented 8 years ago

If you cat one of the pkl files you cloned, you will see something like

version https://git-lfs.github.com/spec/v1
oid sha256:acd317d3d403a2d2cee46f0b3e3bdde142266c2ac14e613b12ceb6f75d102666
size 25928870

As can be seen, it is only a pointer to the original file.
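A quick way to tell a pointer from a real pickle before loading it can be sketched as follows. This is an illustrative sketch, not code from the repository: is_lfs_pointer is a made-up helper name, and the check relies only on the pointer header line shown above. Note that the pointer text begins with the letter v, which is consistent with the invalid load key, 'v' in the original error.

```python
# First line of a git-lfs pointer file, as shown in the output above.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file starts with the git-lfs pointer header."""
    with open(path, "rb") as fin:
        head = fin.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX
```

Calling is_lfs_pointer on a file such as datasets/MED_easy.pkl before pkl.load would let a script report "run git lfs install and clone again" instead of failing with an UnpicklingError.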

yaoli commented 8 years ago

Oh, I see. That's why. Thanks a lot. I'll try it later.

yaoli commented 8 years ago

It works. Thanks a lot. From what I can see in those pkl files, the TACoS videos are further chopped into short clips. To be consistent, it would be super handy for us to use your script to chop them and assign the same video IDs used in those pkl files. I hope we are not asking for too much.

ffmpbgrnn commented 8 years ago

Hi Li, if you downloaded the TACoS annotations, there is an index.tsv file that defines the correspondence between each clip name and the actual frames in the video. For example,

s37-d46_19_7	The man takes the slices of pineapple and places them on top of each other to nicely dice them into squares.	s37-d46	6645	11145	pineapple	11196

This shows that the clip s37-d46_19_7 corresponds to the video s37-d46.avi between frames 6645 and 11145.
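Reading such a row can be sketched in Python as below. This is a minimal sketch under stated assumptions: the Clip tuple and parse_index_line are hypothetical names, the fields are assumed to be tab-separated, and only the first five columns (clip name, sentence, video, start frame, end frame) are interpreted, since the thread does not explain the remaining two.

```python
from collections import namedtuple

# Hypothetical record type; field names are chosen here for illustration.
Clip = namedtuple("Clip", ["clip_id", "sentence", "video", "start_frame", "end_frame"])


def parse_index_line(line):
    """Parse one tab-separated index.tsv row into a Clip.

    Columns after the fifth are ignored because their meaning
    is not described in this thread.
    """
    fields = line.rstrip("\n").split("\t")
    clip_id, sentence, video, start, end = fields[:5]
    return Clip(clip_id, sentence, video, int(start), int(end))


# The example row from this thread, with tabs written explicitly.
row = "\t".join([
    "s37-d46_19_7",
    "The man takes the slices of pineapple and places them on top of "
    "each other to nicely dice them into squares.",
    "s37-d46",
    "6645",
    "11145",
    "pineapple",
    "11196",
])
clip = parse_index_line(row)
```

With frames already extracted for the whole video, clip.start_frame and clip.end_frame give the range of frames that belong to clip.clip_id.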

I simply use ffmpeg to extract all frames rather than chop the whole video into small clips.

yaoli commented 8 years ago

Thanks very much indeed.

ffmpbgrnn commented 8 years ago

No worries.

yaoli commented 8 years ago

{'questions': [{'T': 'person put pan on oven .', 'W3': 'leek', 'W2': 'start', 'W1': 'unwrap', 'W0': 'heat'}, {'T': 'person put on heat oven .', 'W3': 'proceed', 'W2': 'pull out', 'W1': 'bowl', 'W0': 'pan'}, {'T': 'person pan on heat oven .', 'W3': 'attempt', 'W2': 'open up', 'W1': 'enlarge', 'W0': 'put'}, {'T': 'person heat oven .', 'W3': 'cut off top of pepper', 'W2': 'get cut board from drawer', 'W1': 'cut slice in half', 'W0': 'put pan on'}, {'T': 'person put pan on heat .', 'W3': 'tap', 'W2': 'room', 'W1': 'rag', 'W0': 'oven'}, {'T': 'person put on heat oven .', 'W3': 'stone', 'W2': 'faucet', 'W1': 'frame', 'W0': 'pan'}], 'desc': u'Person puts pan on heated oven.'}

This example is a bit strange, as some of the blanks are phrases instead of single words. Are QA pairs of this kind used at all in training?

Here are more answer candidates from the entire dataset; the numbers indicate how many times each one appears:

ffmpbgrnn commented 8 years ago

The answers can be phrases, like put on, take off, cut off, etc. However, these phrases have much lower frequencies than single words.
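The frequency comparison can be reproduced with a short sketch like the one below. It is illustrative only: count_candidates and phrase_share are made-up helper names, and it assumes the loaded data can be iterated as a list of dicts shaped like the example above (a 'questions' list whose entries hold the candidates under keys W0 to W3). The actual top-level layout of the pkl files is not described in this thread.

```python
from collections import Counter


def count_candidates(tasks):
    """Count how often each answer candidate (keys W0..W3) appears."""
    counts = Counter()
    for task in tasks:
        for question in task["questions"]:
            for key, value in question.items():
                if key.startswith("W"):
                    counts[value] += 1
    return counts


def phrase_share(counts):
    """Return the fraction of candidate occurrences that contain a space."""
    total = sum(counts.values())
    phrases = sum(n for answer, n in counts.items() if " " in answer)
    return float(phrases) / total if total else 0.0


# Two of the questions from the example above, used as sample input.
sample_tasks = [{
    "questions": [
        {"T": "person put pan on oven .", "W3": "leek", "W2": "start",
         "W1": "unwrap", "W0": "heat"},
        {"T": "person heat oven .", "W3": "cut off top of pepper",
         "W2": "get cut board from drawer", "W1": "cut slice in half",
         "W0": "put pan on"},
    ],
    "desc": "Person puts pan on heated oven.",
}]
counts = count_candidates(sample_tasks)
```

Running phrase_share(counts) over all tasks, rather than this two-question sample, would show how rare multi-word candidates are in the full dataset.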