Open Tizzzzy opened 8 months ago
Hi, we use the BM25 algorithm to retrieve similar cases based on case descriptions. The retrieval code is not included in the repository. You can access the retrieved data by downloading it:
bash download_data.sh
# Download evaluation dataset to data_hub/ljp
Under the data_hub/ljp folder, there are test_data.json and train_data.json. Each test sample has a key sim_demo_idx that contains a list of indexes of similar cases in train_data.json.
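For anyone reading along, here is a rough sketch of how the precomputed indexes could be resolved back to training cases. It assumes both files are JSON lists of objects and that sim_demo_idx holds integer positions into train_data.json; neither is confirmed in this thread. The helper names load_json and similar_cases are made up for illustration.

```python
import json
from pathlib import Path


def load_json(path):
    """Read a JSON file as UTF-8 and return the parsed object."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def similar_cases(test_sample, train_data):
    """Return the training cases referenced by one test sample.

    Assumption: sim_demo_idx is a list of integer positions into train_data.
    """
    return [train_data[i] for i in test_sample["sim_demo_idx"]]


# Only runs if the data has already been downloaded with download_data.sh.
data_dir = Path("data_hub/ljp")
test_path = data_dir / "test_data.json"
train_path = data_dir / "train_data.json"
if test_path.exists() and train_path.exists():
    test_data = load_json(test_path)
    train_data = load_json(train_path)
    demos = similar_cases(test_data[0], train_data)
    print(f"first test sample has {len(demos)} retrieved training cases")
```

If the files turn out to be keyed differently (for example a dict keyed by case id), only similar_cases would need to change.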
So how can we use the BM25 algorithm to retrieve similar cases based on case descriptions?
Hi, I'm super excited about your work. I am trying to understand your code. Can you help me point out which file is related to the information retrieval part?
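Since the retrieval code is not in the repository, here is a minimal, self-contained sketch of Okapi BM25 retrieval over case descriptions. This is not the authors' implementation: the whitespace tokenizer, the k1 and b values, the smoothed IDF variant, and the toy case descriptions are all assumptions chosen for illustration. For non-whitespace-delimited text, a proper word segmenter would be needed in place of tokenize.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; swap in a real
    # segmenter for languages that do not separate words with spaces.
    return text.lower().split()


class BM25:
    """Minimal Okapi BM25 index over a list of tokenized documents."""

    def __init__(self, corpus_tokens, k1=1.5, b=0.75):
        self.k1 = k1
        self.b = b
        self.doc_freqs = [Counter(doc) for doc in corpus_tokens]
        self.doc_lens = [len(doc) for doc in corpus_tokens]
        # Fall back to 1.0 so an empty corpus never divides by zero.
        self.avgdl = (sum(self.doc_lens) / len(self.doc_lens)) if any(self.doc_lens) else 1.0
        n_docs = len(corpus_tokens)
        df = Counter()
        for freqs in self.doc_freqs:
            df.update(freqs.keys())
        # Smoothed IDF, which stays positive even for very common terms.
        self.idf = {
            term: math.log(1 + (n_docs - n + 0.5) / (n + 0.5))
            for term, n in df.items()
        }

    def score(self, query_tokens, idx):
        """BM25 score of document idx for the given query tokens."""
        freqs = self.doc_freqs[idx]
        norm = self.k1 * (1 - self.b + self.b * self.doc_lens[idx] / self.avgdl)
        total = 0.0
        for term in query_tokens:
            f = freqs.get(term, 0)
            if f:
                total += self.idf[term] * f * (self.k1 + 1) / (f + norm)
        return total

    def top_k(self, query_tokens, k=3):
        """Indexes of the k best-scoring documents, ties broken by index."""
        scored = [(self.score(query_tokens, i), i) for i in range(len(self.doc_freqs))]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [i for _, i in scored[:k]]


# Toy stand-ins for training case descriptions (hypothetical data).
train_cases = [
    "the defendant stole a bicycle from the store",
    "the defendant drove while intoxicated and caused a collision",
    "the defendant stole cash from a store register",
]
index = BM25([tokenize(c) for c in train_cases])

# Retrieve the two training cases most similar to a test case description.
sim_demo_idx = index.top_k(tokenize("defendant stole goods from a store"), k=2)
print(sim_demo_idx)
```

Running top_k once per test case description against an index built from the training descriptions would produce a list per test sample, analogous in shape to the sim_demo_idx field described above.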