Closed nmshafie1993 closed 2 years ago
Hi, how is your RAM usage looking? How much RAM do you have in total? Could you monitor memory usage during the writing process? I had to write in batches on the H2O machine to work around its limited memory. Disk space was never an issue. If RAM is the problem for you, it is easy to make 20 batches instead of 10.
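To make the effect of that change concrete, here is a rough Python sketch of the arithmetic. It is not the generation script, which is not shown in this thread, and the helper name `rows_per_batch` is made up for illustration. It only shows how many rows a single batch has to hold when the same 1e9 rows are split into 10 or 20 parts.

```python
def rows_per_batch(n_rows: int, n_batches: int) -> int:
    """Largest number of rows any single batch holds (hypothetical helper)."""
    if n_batches <= 0:
        raise ValueError("n_batches must be positive")
    # Ceiling division with integers only, so large row counts stay exact.
    return -(-n_rows // n_batches)


n_rows = 10**9
for n_batches in (10, 20):
    # 10 batches -> 100000000 rows each, 20 batches -> 50000000 rows each
    print(n_batches, rows_per_batch(n_rows, n_batches))
```

Doubling the number of batches halves the rows handled per write, which should lower the extra memory needed per batch. Whether that is enough depends on how much memory the full data already occupies before writing starts.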
Thank you. Is changing this line enough to make 20 batches instead of 10? Is there anything else that needs to be changed?
You need to adjust the subset of the data as well, because each iteration writes one part of the data to the file. It is 2 lines below. Did you check RAM usage?
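The point about adjusting the subset can be illustrated with a minimal Python sketch. This is an assumption-laden illustration, not the script's actual code: `batch_bounds`, `write_in_batches`, and `demo_rows` are hypothetical names. The idea is to derive the row range of every batch from the batch count, so the two cannot get out of sync when the count changes from 10 to 20.

```python
import csv
from typing import Callable, Iterable, List, Sequence, Tuple


def batch_bounds(n_rows: int, n_batches: int) -> List[Tuple[int, int]]:
    """Split range(n_rows) into n_batches contiguous half-open (start, stop) ranges."""
    if n_batches <= 0:
        raise ValueError("n_batches must be positive")
    base, extra = divmod(n_rows, n_batches)
    bounds = []
    start = 0
    for i in range(n_batches):
        # The first `extra` batches take one more row so every row is covered once.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def write_in_batches(
    path: str,
    n_rows: int,
    n_batches: int,
    make_rows: Callable[[int, int], Iterable[Sequence]],
    header: Sequence[str],
) -> None:
    """Write the header once, then write one subset of rows per batch."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for start, stop in batch_bounds(n_rows, n_batches):
            # Only rows in [start, stop) are produced and written in this step.
            writer.writerows(make_rows(start, stop))


def demo_rows(start: int, stop: int) -> Iterable[Tuple[int, int]]:
    """Stand-in data source for illustration; yields rows lazily."""
    for i in range(start, stop):
        yield (i, i % 7)
```

For example, `write_in_batches("demo.csv", 1000, 20, demo_rows, ("id", "v"))` writes 1000 rows in 20 steps of 50 rows each. Changing only the batch count is enough here because the subset bounds are computed from it.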
Hi there, I am trying to generate data with the join data generation script. It works very well with 1e8 and 1e7, but not with 1e9. It only generates a 5.77 GB file named
J1_1e9_NA_0_0.csv
and gets stuck at
Writing 1e9 data batch 2
after which the process eventually gets killed. Here is the output:
Generate join data of 1e9 rows
Producing keys for LHS and RHS data
Producing LHS 1e9 data from keys
Writing LHS 1e9 data J1_1e9_NA_0_0
Writing 1e9 data batch 1
Writing 1e9 data batch 2
I checked the storage: there is more than 250 GB of free space, and I cleared the cache folder. I was able to generate all the other datasets with the data generation script, including the 1e9 groupby ones. I don't understand why this one does not work. Do you have any solution for it? Thanks.