hotpotqa / hotpot

Apache License 2.0

MemoryError during preprocessing #1

Closed TianboJi closed 6 years ago

TianboJi commented 6 years ago

The error message is:

Traceback (most recent call last):
  File "main.py", line 86, in <module>
    prepro(config)
  File "/extradisk/jitianbo/workspace/HotpotQA/prepro.py", line 349, in prepro
    build_features(config, examples, config.data_split, record_file, word2idx_dict, char2idx_dict)
  File "/extradisk/jitianbo/workspace/HotpotQA/prepro.py", line 306, in build_features
    pickle.dump(datapoints, open(out_file, 'wb'), protocol=-1)
MemoryError

My machine has 65876000 kB of memory, and my system info is: Linux lly-GPU 4.2.0-30-generic #36~14.04.1-Ubuntu SMP Fri Feb 26 18:49:23 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

kimiyoung commented 6 years ago

Could you please replace line 306 of prepro.py with

import torch
torch.save(datapoints, out_file)

and see if this can fit into your RAM?

In my test this could reduce the memory usage quite a bit.
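For readers applying this change, here is a minimal sketch of the replacement wrapped in a helper. The function name `save_datapoints` and the pickle fallback are assumptions added for illustration; they are not part of prepro.py or of the suggestion above, which only swaps `pickle.dump` for `torch.save`.

```python
import pickle


def save_datapoints(datapoints, out_file):
    """Write datapoints to out_file, preferring torch.save when PyTorch is installed.

    Returns the name of the backend used ("torch" or "pickle").
    """
    try:
        import torch
    except ImportError:
        # Fallback (an assumption, not part of the suggested fix): plain pickle,
        # with the file handle closed by the context manager.
        with open(out_file, "wb") as fh:
            pickle.dump(datapoints, fh, protocol=pickle.HIGHEST_PROTOCOL)
        return "pickle"
    # The suggested replacement for line 306 of prepro.py.
    torch.save(datapoints, out_file)
    return "torch"
```

Whether this fits into RAM still depends on the size of `datapoints`; the helper only changes how the list is serialized.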

TianboJi commented 6 years ago

Hi, I think your code does help to reduce the memory usage. However, this time I got another error:

Traceback (most recent call last):
  File "main.py", line 86, in <module>
    prepro(config)
  File "/extradisk/jitianbo/workspace/HotpotQA/prepro.py", line 330, in prepro
    word_emb_mat, word2idx_dict, idx2word_dict = get_embedding(word_counter, "word", emb_file=config.glove_word_file,
UnboundLocalError: local variable 'word_counter' referenced before assignment

How should I fix it?

kimiyoung commented 6 years ago

Most likely you are using the wrong command. Please follow the instructions in the README: you need to process the training set before the dev set. I will add a new commit to include this change.

Arjunsankarlal commented 5 years ago

I got the same error even though I followed the steps exactly as described. I later found that moving line 321 to line 319 in prepro.py fixes the issue. The cause of the error was not the steps I followed. As the error message clearly states, the variable 'word_counter' was referenced before it was assigned.

kimiyoung commented 5 years ago

I guess there is a subtle point here. Using an empty word_counter would cause problems, so this error effectively acts as an assertion. The logic is as follows: for the training set, word_counter is created and word2idx_dict is saved to a file; for the dev set, you need to read word2idx_dict from the preprocessed files rather than create a new one. If the branch at line 330 is executed for the dev set, something is wrong: the word2idx_file does not exist when you process the dev set. I think this error shouldn't occur if you process the training set before the dev set and test set.
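The logic described above can be sketched roughly as follows, with an explicit error in place of the accidental UnboundLocalError. This is only an illustration, not the code in prepro.py: the function name `load_or_build_word2idx`, the JSON file format, and the two reserved tokens are all assumptions.

```python
import json
import os
from collections import Counter


def load_or_build_word2idx(data_split, word2idx_file, tokens=None):
    """Return a word-to-index dict, building it only for the training split."""
    if os.path.isfile(word2idx_file):
        # Any split: reuse the vocabulary written by an earlier training run.
        with open(word2idx_file, "r") as fh:
            return json.load(fh)

    if data_split != "train":
        # Explicit check: the vocabulary file must already exist for dev/test.
        raise FileNotFoundError(
            f"{word2idx_file} not found; process the training set before '{data_split}'."
        )

    # Training split: count words, then assign indices. Indices 0 and 1 are
    # reserved here for hypothetical padding and out-of-vocabulary tokens.
    word_counter = Counter(tokens or [])
    word2idx = {"--NULL--": 0, "--OOV--": 1}
    for word in word_counter:
        word2idx.setdefault(word, len(word2idx))
    with open(word2idx_file, "w") as fh:
        json.dump(word2idx, fh)
    return word2idx
```

With this shape, running the dev split first fails with a message that names the missing file, while running the training split first writes the file that later splits reuse.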

Jasperty commented 5 years ago

When processing the training data, I find that it uses a lot of memory, nearly 100%, so it is very slow: it has been running for more than 100 hours.