Preprocessing step - memory consumption

Hello,

I am trying to get the initial repo to work by following the steps provided. In the preprocessing step, I run:

    python main.py --mode prepro --data_file hotpot_train_v1.1.json --para_limit 2250 --data_split train

The preprocessing begins, but memory consumption on my machine becomes enormous (up to 10 GB), so I terminated it. How many 'tasks' (as displayed while running) does the preprocessing have to go through?

Is this amount of memory consumption normal, or is something wrong with my setup/environment?
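
For context, here is a rough sketch of how the number of tasks might be estimated. It rests on two assumptions I have not verified against the repo: that hotpot_train_v1.1.json is a single top-level JSON array, and that each displayed 'task' corresponds to one entry in that array. The helper name count_examples is my own, not something from the repo.

```python
import json


def count_examples(path):
    """Return the number of top-level entries in a JSON array file.

    Assumption: the file holds one JSON array, one element per example.
    Note that json.load reads the whole file into memory at once.
    """
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")
    return len(data)
```

Calling count_examples("hotpot_train_v1.1.json") would then give an upper bound to compare against the task counter shown during preprocessing, if the assumptions above hold.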

Thanks