lihuiliullh / NewLook


CUDA out of memory #3

Open cx51666666 opened 2 years ago

cx51666666 commented 2 years ago

I am using a 2080Ti graphics card, but I get a CUDA out-of-memory error. Command:

CUDA_VISIBLE_DEVICES=1 python -u codes/run_model_newlook.py --do_train --cuda --do_valid --do_test --data_path data/FB15k --model BoxNewLook -n 128 -b 256 -d 400 -g 24 -a 1.0 -lr 0.0001 --max_steps 50000 --cpu_num 1 --test_batch_size 16 --center_reg 0.02 --geo box --task 1c.2c.3c.2i.3i.ic.ci.2u.uc.2d.3d.dc --stepsforpath 50000 -- Areopagivarearev --print_on_screen

Looking forward to your reply, thank you.

lihuiliullh commented 2 years ago

You can try a smaller group_size and group_times when generating matrix.pkl and one_hot_vector.pkl. Try group_size=200 and group_times=5. If that doesn't work, try other values.

image

cx51666666 commented 2 years ago

Following your suggestion, I changed the parameters in gen_trans_matrix_single_kg_triple.py and gen_trans_matrix_multi_from_kg_triple.py to group_size=200 and group_times=5. But the following error occurs:

File "/data/beifen/NewLook/codes/model_NewLook.py", line 466, in forward
    if multi_hot1 != None:
TypeError: ne() received an invalid combination of arguments - got (NoneType), but expected one of:

lihuiliullh commented 2 years ago

Try group_size=300 and group_times=5. Make sure both gen_trans_matrix_multi_from_kg_triple.py and gen_trans_matrix_single_kg_triple.py are changed.

cx51666666 commented 2 years ago

The same error occurs with group_size=300 and group_times=5. I regenerated all four files. How should I set the parameters? Thank you very much!

image

File "/data/beifen/NewLook/codes/model_NewLook.py", line 466, in forward
    if multi_hot1 != None:
TypeError: ne() received an invalid combination of arguments - got (NoneType), but expected one of:

lihuiliullh commented 2 years ago

I didn't have any problem using group_size=300 and group_times=5. Are you sure you modified both gen_trans_matrix_multi_from_kg_triple.py and gen_trans_matrix_single_kg_triple.py and copied the generated data to the right directory? Can you paste the last part of the log here?
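One way to confirm that the regenerated files actually reached the data directory is to list the .pkl files there with their sizes and modification times. This is only a minimal sketch: the helper name list_pkl_files is hypothetical, and data/FB15k is assumed from the --data_path value used in this thread.

```python
from datetime import datetime
from pathlib import Path


def list_pkl_files(data_dir):
    """Return (name, size in MB, last-modified time) for each .pkl file in data_dir."""
    rows = []
    for path in sorted(Path(data_dir).glob("*.pkl")):
        stat = path.stat()
        rows.append(
            (path.name, stat.st_size / 1e6, datetime.fromtimestamp(stat.st_mtime))
        )
    return rows


if __name__ == "__main__":
    # Assumption: data/FB15k is the directory passed as --data_path.
    for name, size_mb, mtime in list_pkl_files("data/FB15k"):
        print(f"{name:40s} {size_mb:10.2f} MB  modified {mtime:%Y-%m-%d %H:%M}")
```

If the modification times are older than the last run of the generation scripts, the training run is probably still reading stale files.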

cx51666666 commented 2 years ago

The process was as follows:

Run gen_trans_matrix_single_kg_triple.py and gen_trans_matrix_multi_from_kg_triple.py under NewLook/codes/trans_matrix_gen, with the parameters set to:

gen_trans_matrix_multi_from_kg_triple.py: group_size = 300, group_times = 5
gen_trans_matrix_single_kg_triple.py: group_size = 300

This produced four .pkl files, with sizes of 171.1 MB, 34.22 MB, 9.02 GB and 1.8 GB. I then moved the four files to the '/home/caoxing/project/NewLook/data/FB15k' directory and ran the command:

CUDA_VISIBLE_DEVICES=0 python -u codes/run_model_newlook.py --do_train --cuda --do_valid --do_test \
--data_path data/FB15k --model BoxNewLook -n 128 -b 512 -d 400 -g 24 -a 1.0 \
-lr 0.0001 --max_steps 50000 --cpu_num 1 --test_batch_size 16 --center_reg 0.02 \

Repeating the above steps gives the same error.

cx51666666 commented 2 years ago

1636595002(1)

lihuiliullh commented 2 years ago

Did you change the file path in gen_trans_matrix_single_kg_triple.py and gen_trans_matrix_multi_from_kg_triple.py?

image

cx51666666 commented 2 years ago

The path I used is as follows and has not been changed. Since I am running on the FB15k dataset, I thought the path was correct, so I didn't change it.

image

cx51666666 commented 2 years ago

1636720317(1)

lihuiliullh commented 2 years ago

I used group_size=200 and group_times=4 and got no error. Maybe it is a problem with your PyTorch version or OS.
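One possible explanation, assuming multi_hot1 is a PyTorch tensor: the traceback shows the != comparison being dispatched to the tensor's ne() method, which in some PyTorch versions rejects None as an argument. An identity check (is not None) never calls that method, so it does not depend on the installed version. The sketch below uses a hypothetical stand-in class, StrictTensor, to mimic that behaviour without requiring PyTorch. It is not the actual code from model_NewLook.py.

```python
class StrictTensor:
    """Hypothetical stand-in for a tensor type whose != operator rejects None."""

    def __init__(self, values):
        self.values = list(values)

    def __ne__(self, other):
        # Mimics the error reported in this thread.
        if other is None:
            raise TypeError(
                "ne() received an invalid combination of arguments - got (NoneType)"
            )
        return [a != b for a, b in zip(self.values, other.values)]


def has_value_fragile(x):
    # Same pattern as the failing line: `if multi_hot1 != None:`.
    # Calls x.__ne__(None), which raises for StrictTensor.
    return x != None  # noqa: E711


def has_value(x):
    # Identity check: never calls __ne__, so it works for any object.
    return x is not None
```

Under this assumption, changing the condition on line 466 to use is not None would avoid the TypeError regardless of the PyTorch version. Whether the same pattern appears elsewhere in the file is not known from this thread.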

cx51666666 commented 2 years ago

Thanks for your reply. How can I generate files like trian_triples_1c.pkl and trian_triples_2c.pkl? Would it be possible for you to release the data processing code? Thanks!