HuskyInSalt / CRAG

Corrective Retrieval Augmented Generation

Which decompose modes/parameters were used for the experiments? #10

Closed by xiao-wevnal 7 months ago

xiao-wevnal commented 7 months ago

The Appendix B.2 - Internal Knowledge section of the paper

          [...] If a retrieved result is as short as one or two sentences, it is regarded as an individual strip, otherwise, retrieval documents are required to be split into smaller units which generally consist of a few sentences according to the total length.

combined with what you said in readme.md

'fixed_num' segments passages into a fixed number of words,
'excerption' segments passages based on the end of the sentences

seems to suggest that you have used excerption as the decompose mode, while the code here defaults to selection.

Which one did you use in your experiments, and what parameters did you use, such as how many sentences / characters / words / tokens per strip, for which dataset?

Also, if you've tested changing those numbers, I'm curious what the effects were.
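
For readers following along, the README's description of 'fixed_num' (segmenting a passage into a fixed number of words) can be sketched roughly as below. This is only an illustration of the idea, not the repository's code: the function name `split_fixed_num` and the default of 50 words per strip are hypothetical and not taken from the CRAG release.

```python
def split_fixed_num(passage: str, words_per_strip: int = 50) -> list[str]:
    """Cut a passage into strips of at most `words_per_strip` words.

    Hypothetical helper: the name and the default size are assumptions,
    not values from the CRAG code.
    """
    if words_per_strip <= 0:
        raise ValueError("words_per_strip must be positive")
    # Whitespace tokenization: punctuation stays attached to its word.
    words = passage.split()
    return [
        " ".join(words[i:i + words_per_strip])
        for i in range(0, len(words), words_per_strip)
    ]


strips = split_fixed_num("one two three four five", words_per_strip=2)
print(strips)  # ['one two', 'three four', 'five']
```

A word-count split like this never depends on punctuation, but it can cut a sentence in the middle, which is the trade-off against sentence-based splitting.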

HuskyInSalt commented 7 months ago

Hi @xiao-wevnal, for the excerption mode in our experiments, the end of a sentence was detected naively from marks such as periods and exclamation marks. We did not explore a more reliable method since it is not the focus of our work, and the naive rule produced many mistakes (for example, splitting at "Mr." or inside "2.13").

In this situation, selection is a simple but useful replacement, and it is what we used in our experiments. In fact, the results show that the two modes perform similarly, so we suggest trying excerption if you have a better sentence segmentation method. By the way, the parameters are all available in the released code. Your suggestion is valuable; more experiments on the effects of changing them would definitely be helpful.
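
The kind of mistake described above can be reproduced with a small sketch. `naive_split` is an assumed stand-in for a rule that treats every period, exclamation mark, or question mark as a sentence end; it is not the code used in the paper. `guarded_split` is one hypothetical mitigation (whitespace tokens plus a tiny hand-written abbreviation list), shown only to illustrate what a somewhat better segmenter might look like.

```python
import re

# Hypothetical, deliberately small abbreviation list for illustration.
ABBREVIATIONS = {"mr.", "mrs.", "ms.", "dr.", "prof.", "e.g.", "i.e."}


def naive_split(text: str) -> list[str]:
    """Treat every run of '.', '!' or '?' as the end of a sentence."""
    parts = re.findall(r"[^.!?]+[.!?]+|[^.!?]+$", text)
    return [p.strip() for p in parts if p.strip()]


def guarded_split(text: str) -> list[str]:
    """End a sentence only at a whitespace token that finishes with
    '.', '!' or '?' and is not a known abbreviation."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        if token[-1] in ".!?" and token.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing text without end punctuation
        sentences.append(" ".join(current))
    return sentences


text = "Mr. Smith paid 2.13 dollars. He left!"
print(naive_split(text))
# ['Mr.', 'Smith paid 2.', '13 dollars.', 'He left!']
print(guarded_split(text))
# ['Mr. Smith paid 2.13 dollars.', 'He left!']
```

The guarded version still fails when a sentence really does end in an abbreviation, so a trained sentence tokenizer would likely be a better choice for anything beyond a quick experiment.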

jinyangwu commented 7 months ago

Hi @HuskyInSalt, I would like to ask a similar question. I couldn't reproduce the results reported in the paper (not even the baselines), and I found that temperature, top_p, and batch_size strongly influence the final performance.

So would you please provide the final code or the parameter settings used for the results in the paper? Thanks.

HuskyInSalt commented 7 months ago

Hi @jinyangwu, I have read your questions and received your email. There was indeed a mistake in our code that caused the baseline to be reproduced incorrectly; it has been fixed. Please check your email for a more specific response to the concerns you sent us.

xiao-wevnal commented 7 months ago

Thanks for the insight. It's interesting to know the practical problems that can arise from a naive way to decompose a given document.