Hi,
Thank you for your good question!
Best regards, Runsen
Thanks for your reply! I tried to reproduce the paper's results using your released weights:
python /eval/eval_objaverse.py --model_name /pointllm/weights/checkpoints_paper/PointLLM_7B_v1.2 --task_type classification --prompt_index 0 --data_path /data/Objaverse_colored_point_clouds/8192_npy --anno_path /data/instruction-following_data/PointLLM_brief_description_val_200_GT.json
python pointllm/eval/traditional_evaluator.py --results_path /pointllm/weights/checkpoints_paper/PointLLM_7B_v1.2/evaluation/PointLLM_brief_description_val_200_GT_Objaverse_classification_prompt0.json
The results (e.g., 'Average BLEU-1 Score: 3.0461') differ from the 3.87 reported in the paper. How can I reproduce the reported numbers? Is this related to --batch_size during inference?
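(For context, a minimal sketch of how such an average BLEU-1 score is typically computed over model captions and ground-truth captions, using NLTK purely for illustration; the repository's traditional_evaluator.py may differ in tokenization and smoothing details.)

```python
# Illustrative only: average BLEU-1 over (model caption, ground-truth caption) pairs.
# Not the repository's exact evaluator.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def average_bleu1(predictions, references):
    smooth = SmoothingFunction().method1
    scores = []
    for pred, ref in zip(predictions, references):
        # weights=(1, 0, 0, 0) restricts the score to unigram (BLEU-1) precision.
        score = sentence_bleu(
            [ref.split()], pred.split(),
            weights=(1, 0, 0, 0),
            smoothing_function=smooth,
        )
        scores.append(score)
    return 100.0 * sum(scores) / len(scores)

# Example usage:
# average_bleu1(["a red sports car"], ["a red car"])
```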
Hi,
The model does not generate exactly the same outputs each time, so it is normal to get a slightly different number, as long as the deviation is not too large.
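If fully reproducible numbers are needed, the usual remedies are to fix the random seeds and/or switch to greedy decoding. A minimal sketch, assuming the evaluation produces text with a HuggingFace-style generate() call (illustrative, not the repository's exact code):

```python
# Illustrative sketch: make sampling-based generation reproducible across runs.
import random
import numpy as np
import torch

def set_seed(seed: int = 0) -> None:
    # Seed every RNG that can influence sampling-based decoding.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

set_seed(0)

# With a HuggingFace-style API, greedy decoding removes sampling noise entirely:
# outputs = model.generate(**inputs, do_sample=False, num_beams=1, max_new_tokens=128)
```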
Thanks for the fantastic work.
I have some questions. I noticed that the Objaverse evaluation set used for the traditional test results contains only 200 objects; is that correct? Isn't that quantity too small for evaluation? And how can I generate results for 3,000 objects? Thanks a lot!