srhthu / LM-CompEval-Legal

Code for the paper "A Comprehensive Evaluation of Large Language Models on Legal Judgment Prediction"

Information retrieval #2

Open Tizzzzy opened 8 months ago

Tizzzzy commented 8 months ago

Hi, I'm super excited about your work. I am trying to understand your code. Could you point out which file implements the information retrieval part?

srhthu commented 8 months ago

Hi, we use the BM25 algorithm to retrieve similar cases based on case descriptions. The retrieval code is not included in the repository, but you can get the retrieved results by downloading the data:

bash download_data.sh
# Download evaluation dataset to data_hub/ljp

Under the data_hub/ljp folder, there are test_data.json and train_data.json. Each test sample has a key sim_demo_idx that contains a list of indexes of similar cases in train_data.json.
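
For readers who want to look up the retrieved cases, here is a minimal sketch of resolving sim_demo_idx against the training data. It rests on two assumptions that the thread does not confirm: that each file is a single JSON array of sample dicts, and that the indexes are zero-based positions in that array. The helper names load_json and similar_cases are hypothetical and not part of the repository.

```python
import json


def load_json(path):
    # Assumption: the file holds one JSON array of sample dicts.
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def similar_cases(test_sample, train_data):
    # Assumption: sim_demo_idx holds zero-based positions into train_data.
    return [train_data[i] for i in test_sample["sim_demo_idx"]]
```

With the downloaded files this would be called as similar_cases(test_data[0], train_data), after loading data_hub/ljp/test_data.json and data_hub/ljp/train_data.json with load_json. If the files turn out to be in JSON Lines format, the loader would need to parse one object per line instead.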

Yyy11181 commented 1 month ago

So how can the BM25 algorithm be used to retrieve similar cases based on case descriptions?
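
Since the retrieval code is not in the repository, the following is only a minimal sketch of how BM25 retrieval over case descriptions could look. It is not the authors' implementation. It uses the common Okapi BM25 scoring formula with a smoothed, non-negative IDF. The parameter values k1=1.5 and b=0.75 are conventional defaults chosen here as an assumption, and the whitespace tokenizer is a placeholder: if the case descriptions are in Chinese, a word segmenter would be needed instead. The names BM25, tokenize and top_k are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Placeholder tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


class BM25:
    def __init__(self, corpus_tokens, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        # Term frequencies and length of every document in the corpus.
        self.doc_tfs = [Counter(doc) for doc in corpus_tokens]
        self.doc_lens = [len(doc) for doc in corpus_tokens]
        self.avgdl = sum(self.doc_lens) / len(corpus_tokens)
        n = len(corpus_tokens)
        # Document frequency: number of documents containing each term.
        df = Counter(term for tf in self.doc_tfs for term in tf)
        # Smoothed IDF; the "+ 1.0" keeps every value non-negative.
        self.idf = {
            t: math.log((n - d + 0.5) / (d + 0.5) + 1.0) for t, d in df.items()
        }

    def score(self, query_tokens, idx):
        tf = self.doc_tfs[idx]
        # Length normalization: longer-than-average documents are penalized.
        norm = self.k1 * (1 - self.b + self.b * self.doc_lens[idx] / self.avgdl)
        return sum(
            self.idf[t] * tf[t] * (self.k1 + 1) / (tf[t] + norm)
            for t in query_tokens
            if t in tf
        )

    def top_k(self, query_tokens, k=3):
        # Indexes of the k highest-scoring documents, best first.
        scores = [self.score(query_tokens, i) for i in range(len(self.doc_tfs))]
        return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Toy corpus standing in for the training case descriptions.
train_texts = [
    "the defendant stole a bicycle from the street",
    "the defendant drove while drunk and caused an accident",
    "the defendant stole a phone from a shop",
]
bm25 = BM25([tokenize(t) for t in train_texts])
sim_idx = bm25.top_k(tokenize("the defendant stole a wallet"), k=2)
```

Running top_k for each test description against the indexed training descriptions would give a list of training indexes per test sample, similar in shape to the sim_demo_idx field described above. Whether the authors used the same parameters, tokenization, or number of retrieved cases is not stated in this thread.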