Closed hyworrywart closed 8 months ago
These are exactly the hyperparameters we are using. I just ran the command one more time, and then ran eval with the command:
CUDA_VISIBLE_DEVICES=0 python eval/eval_pretrain.py --model_path checkpoints/cora.3/llaga-vicuna-7b-simteg-2-10-linear-projector_nc/ --model_base lmsys/vicuna-7b-v1.5-16k --conv_mode v1 --pretrained_embedding_type simteg --use_hop 2 --sample_neighbor_size 10 --template ND --answers_file ./cora_nc.jsonl --dataset cora --task nc
and got the result 0.8948, which is very close to the 0.8886 in the paper. Do you have any problem reproducing results on other datasets like PubMed? I want to check whether there is some environment issue.
Oh, I ran eval with conv_mode = 'llaga_llama_2' by mistake. Now my result is close to 0.8886. Thanks for your reply.
Hi
Could you also please tell me the hyperparameters you use for NC tasks on ogbn-arxiv with the LLAGA-ND-7B model (SINGLE FOCUS)?
I used ./scripts/train_deepspeed.sh vicuna nc arxiv 4 1, but only got about 74.82.
Thanks!
I trained this setting on a single GPU with batch size 16, i.e. CUDA_VISIBLE_DEVICES=0 ./scripts/train.sh vicuna nc arxiv 16 1. I haven't tried deepspeed on the single-focus setting before; maybe you can try adjusting the batch size per GPU to 8 or 16, e.g. ./scripts/train_deepspeed.sh vicuna nc arxiv 16 1? (I'm not sure here either; I think it may also depend on how many GPUs you are using. In other, larger settings, I usually make batch_size per GPU * GPU num = 16. I didn't tune this hyperparameter much, but I found it always works.)
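The heuristic above (keep the effective global batch at 16 across however many GPUs you use) can be sketched as a small helper. This is just my illustration of the rule described in this thread; the function name and its divisibility check are my own, not from the repo:

```python
def per_gpu_batch_size(global_batch: int = 16, num_gpus: int = 1) -> int:
    """Per-GPU batch size so that per_gpu * num_gpus == global_batch.

    Mirrors the rule "batch_size per GPU * GPU num = 16" from this thread.
    """
    if global_batch % num_gpus != 0:
        raise ValueError("global_batch must be divisible by num_gpus")
    return global_batch // num_gpus


# 1 GPU -> 16 per GPU (the single-GPU train.sh setting);
# 2 GPUs -> 8 per GPU; 4 GPUs -> 4 per GPU.
print(per_gpu_batch_size(16, 1))  # 16
print(per_gpu_batch_size(16, 2))  # 8
```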
Hi, I used CUDA_VISIBLE_DEVICES=0 ./scripts/train.sh vicuna nc arxiv 16 1, but achieved the following performance:
Hi, it appears that you're utilizing batch evaluation in the naive batch branch. Could you please attempt using batch size 1 evaluation in the master branch? I developed the naive batch branch primarily for efficiency purposes and did not verify its impact on performance.
I tried the batch size 1 evaluation in the master branch, but still only achieved about 74 accuracy.
I'm not sure about the issue with your setup, because I haven't encountered this situation before. On my side, I used a batch size of 16 and the default lr specified in the script. This afternoon, I evaluated the arxiv single-focus model using both the batch and single evaluation scripts, and both achieved an accuracy close to 0.76. Are you using the recommended environment, and have you made any modifications to our code?
I just uploaded my arxiv single-focus model to Hugging Face: https://huggingface.co/Runjin/llaga-vicuna-7b-ND-arxiv-nc/tree/main. You can test it by replacing your current model path with the Hugging Face repository name. All our test results can be downloaded through the link I provided in the README.
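For example, the eval command from earlier in this thread could point at the Hugging Face repo directly. This is only a sketch: the flags are copied from the cora eval command above and swapped to arxiv, and the answers_file name is my own choice, so adjust as needed:

```shell
# Sketch: reuse eval_pretrain.py with the Hugging Face repo name as model_path.
CUDA_VISIBLE_DEVICES=0 python eval/eval_pretrain.py \
  --model_path Runjin/llaga-vicuna-7b-ND-arxiv-nc \
  --model_base lmsys/vicuna-7b-v1.5-16k \
  --conv_mode v1 \
  --pretrained_embedding_type simteg \
  --use_hop 2 --sample_neighbor_size 10 \
  --template ND \
  --answers_file ./arxiv_nc.jsonl \
  --dataset arxiv --task nc
```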
Hi,
Thank you for your great work.
Could you please tell me the hyperparameters you use for NC tasks on Cora with the LLAGA-ND-7B model and the SINGLE FOCUS model type? I tried to reproduce the results using the default parameters in train.sh, but failed.
I ran train.sh with: CUDA_VISIBLE_DEVICES=0 ./scripts/train.sh vicuna nc cora.3 16 1