IndexError: list index out of range

myx666 / LeCaRD

A Chinese legal case retrieval dataset.

MIT License


Closed aixuedegege closed 3 years ago

aixuedegege commented 3 years ago

I can't get the metrics script to run.

myx666 commented 3 years ago

Which results are you using, and which metric fails to run?

aixuedegege commented 3 years ago

I ran it like this: python metrics.py --m NDCG --label label/golden_labels.json --pred prediction --q all

Also, the lawformer path is hard-coded.

myx666 commented 3 years ago

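For --label, use label_top30_dict.json. golden_labels.json contains only the candidate ids with label=3, so it cannot be used to compute NDCG. The lawformer path is something I used for debugging; I'll comment it out shortly.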

aixuedegege commented 3 years ago

In metrics.py, NDCG is implemented with logi = math.log(i+2,2). Is there a description of this algorithm somewhere, for example a related paper?

myx666 commented 3 years ago

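There are plenty of definitions and implementations of NDCG online. I'm guessing you want to know why it is i+2 rather than i+1? That's because i is counted from 0 in the code.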
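
To make the indexing point concrete, here is a minimal sketch of NDCG with a 0-based loop index. It is not the actual code in metrics.py: the function names `dcg` and `ndcg`, the linear gain (the raw label value), and computing the ideal ranking from the same list are all assumptions made for illustration. The usual definition discounts rank r (starting at 1) by log2(r+1); with i = r-1 that becomes log2(i+2).

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain over a ranked list of graded labels.

    i is 0-based, so the item at rank i+1 is discounted by log2(i+2).
    Linear gain is an assumption; some variants use 2**rel - 1 instead.
    """
    if k is not None:
        relevances = relevances[:k]
    return sum(rel / math.log(i + 2, 2) for i, rel in enumerate(relevances))


def ndcg(relevances, k=None):
    """DCG normalised by the DCG of the ideal (descending) ordering.

    Assumption: the ideal ordering is built from the same list of labels.
    """
    ideal_dcg = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical graded labels (0-3) for one query, in predicted order.
    print(ndcg([3, 2, 3, 0, 1], k=5))
```

With i+1 instead of i+2, the first item would be divided by log2(1) = 0, which is why a 0-based index needs the extra offset.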

aixuedegege commented 3 years ago

Oh, OK, thanks. One more question: what is in combined_top100.json? I see metrics.py uses it at line 133. Why does the code need that check?

myx666 commented 3 years ago

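That is something of a legacy issue, and it is complicated to explain. In short, the old label format was not designed very well (though this does not affect the results). I have just updated the NDCG code in metrics, which should make it easier to follow. Thanks for pointing it out.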

aixuedegege commented 3 years ago

Got it, thanks a lot.