VITA-Group / LLaGA

[ICML 2024] "LLaGA: Large Language and Graph Assistant", Runjin Chen, Tong Zhao, Ajay Jaiswal, Neil Shah, Zhangyang Wang
Apache License 2.0

Question about hyperparameters #3

Closed · hyworrywart closed this issue 8 months ago

hyworrywart commented 8 months ago

Hi,

Thank you for your great work.

Could you please tell me the hyperparameters you used for the NC task on Cora with the LLaGA-ND-7B model and the SINGLE FOCUS model type? I tried to reproduce the results using the default parameters in train.sh, but failed.

I ran train.sh with: CUDA_VISIBLE_DEVICES=0 ./scripts/train.sh vicuna nc cora.3 16 1
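
For reference, my reading of the positional arguments (these interpretations are assumptions from skimming train.sh, not documented behavior):

# Annotated invocation; argument meanings are my assumptions:
#   vicuna  -> base LLM backbone (lmsys/vicuna-7b-v1.5-16k)
#   nc      -> task: node classification
#   cora.3  -> dataset identifier (Cora; the .3 suffix looks like a split id)
#   16      -> batch size per GPU
#   1       -> gradient accumulation steps (least certain of these)
CUDA_VISIBLE_DEVICES=0 ./scripts/train.sh vicuna nc cora.3 16 1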

ChenRunjin commented 8 months ago

Those are exactly the hyperparameters we used. I just ran the command again, and then ran eval with:

CUDA_VISIBLE_DEVICES=0 python eval/eval_pretrain.py --model_path checkpoints/cora.3/llaga-vicuna-7b-simteg-2-10-linear-projector_nc/ --model_base lmsys/vicuna-7b-v1.5-16k --conv_mode v1 --pretrained_embedding_type simteg --use_hop 2 --sample_neighbor_size 10 --template ND --answers_file ./cora_nc.jsonl --dataset cora --task nc

Then I got the result 0.8948, which is very close to the 0.8886 reported in the paper. Do you also have trouble getting results on other datasets like pubmed? I want to make sure whether there is some environment issue.
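
For anyone else reproducing this, here is the same command with annotations. The hop/neighbor readings match the "2-10" in the checkpoint name; the flag readings marked as assumptions are inferred from the flag names:

# --model_path    projector checkpoint produced by train.sh
# --model_base    base LLM the projector was trained against
# --conv_mode v1  conversation template; must match the vicuna backbone
# --pretrained_embedding_type simteg   SimTeG node embeddings
# --use_hop 2                          neighborhood hops per node (assumption)
# --sample_neighbor_size 10            neighbors sampled per hop (assumption)
# --template ND                        Neighborhood Detail template
# --answers_file                       where predictions are written
CUDA_VISIBLE_DEVICES=0 python eval/eval_pretrain.py \
  --model_path checkpoints/cora.3/llaga-vicuna-7b-simteg-2-10-linear-projector_nc/ \
  --model_base lmsys/vicuna-7b-v1.5-16k \
  --conv_mode v1 \
  --pretrained_embedding_type simteg \
  --use_hop 2 \
  --sample_neighbor_size 10 \
  --template ND \
  --answers_file ./cora_nc.jsonl \
  --dataset cora --task nc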

hyworrywart commented 8 months ago

Oh, I ran eval with conv_mode = 'llaga_llama_2' by mistake. Now my result is close to 0.8886. Thanks for your reply.
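
In case anyone else hits the same silent accuracy drop, the fix was only the template flag (my understanding is that llaga_llama_2 targets Llama-2 backbones and so mismatches a vicuna base; that reading is an assumption):

# wrong: template intended for a different backbone (assumption)
--conv_mode llaga_llama_2
# right for the vicuna-7b base used here
--conv_mode v1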

AGTSAAA commented 8 months ago

Hi

Could you also please tell me the hyperparameters you used for the NC task on ogbn-arxiv with the LLaGA-ND-7B model (SINGLE FOCUS)?

I used ./scripts/train_deepspeed.sh vicuna nc arxiv 4 1, but only got about 74.82.

Thanks!

ChenRunjin commented 8 months ago

I trained this setting on a single GPU with batch size 16, i.e. CUDA_VISIBLE_DEVICES=0 ./scripts/train.sh vicuna nc arxiv 16 1. I haven't tried deepspeed on the single focus setting before; maybe you can try adjusting the batch size per GPU to 8 or 16, e.g. ./scripts/train_deepspeed.sh vicuna nc arxiv 16 1? (I'm not sure here; I think it may also be related to how many GPUs you are using. In other, larger settings I usually keep batch_size per GPU * GPU num = 16. I didn't tune the hyperparameters much, but I found this rule always works. See the sketch below.)
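
A sketch of that rule (the GPU counts are illustrative, and I'm assuming train_deepspeed.sh respects CUDA_VISIBLE_DEVICES for device selection):

# Keep batch size per GPU * number of GPUs = 16.
# 1 GPU: batch size 16 per GPU
CUDA_VISIBLE_DEVICES=0 ./scripts/train.sh vicuna nc arxiv 16 1
# 2 GPUs: batch size 8 per GPU
CUDA_VISIBLE_DEVICES=0,1 ./scripts/train_deepspeed.sh vicuna nc arxiv 8 1
# 4 GPUs: batch size 4 per GPU
CUDA_VISIBLE_DEVICES=0,1,2,3 ./scripts/train_deepspeed.sh vicuna nc arxiv 4 1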

AGTSAAA commented 8 months ago

Hi, I used CUDA_VISIBLE_DEVICES=0 ./scripts/train.sh vicuna nc arxiv 16 1, but got the following performance:

[screenshot of evaluation results]

ChenRunjin commented 8 months ago

Hi, it appears that you're using batch evaluation from the naive batch branch. Could you please try batch size 1 evaluation on the master branch? I developed the naive batch branch primarily for efficiency and did not verify its impact on performance.

AGTSAAA commented 8 months ago

I tried batch size 1 evaluation on the master branch, but still got an accuracy of about 74.

[screenshot of evaluation results]

ChenRunjin commented 8 months ago

I'm not sure what the issue is with your setup, because I haven't encountered this situation before. On my side, I used a batch size of 16 and the default lr specified in the script. This afternoon I evaluated the arxiv single focus model using both the batch and single evaluation scripts, and both achieved an accuracy close to 0.76. Are you using the recommended environment, and have you made any modifications to our code?

I just uploaded my arxiv single focus model to Hugging Face: https://huggingface.co/Runjin/llaga-vicuna-7b-ND-arxiv-nc/tree/main. You can test it by replacing your current model path with the Hugging Face repository name, as sketched below. All our test results can be downloaded through the link provided in the README.
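
Something like the following should work (a sketch: the non-path flags mirror the cora eval command earlier in this thread, switched to arxiv, and this assumes eval_pretrain.py resolves a Hugging Face repo id the same way it resolves a local checkpoint path):

# Evaluate the released arxiv single-focus checkpoint straight from the Hub
CUDA_VISIBLE_DEVICES=0 python eval/eval_pretrain.py \
  --model_path Runjin/llaga-vicuna-7b-ND-arxiv-nc \
  --model_base lmsys/vicuna-7b-v1.5-16k \
  --conv_mode v1 \
  --pretrained_embedding_type simteg \
  --use_hop 2 \
  --sample_neighbor_size 10 \
  --template ND \
  --answers_file ./arxiv_nc.jsonl \
  --dataset arxiv --task nc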