Thank you for your excellent work. Could you please share how the metrics for each task are calculated?
Below are my code and results for evaluating region caption performance using the geochat-7B weights, but the results differ substantially from Table 10 in the paper. Where might the problem be? Thank you.
The results are as follows: