cdhx / QueryAgent

Code and data for QueryAgent (ACL 2024)

How do you transform raw observations to `guidelines`? #1

Closed zhoujz10 closed 6 months ago

zhoujz10 commented 6 months ago

Hi, your paper says ERASER distinguishes between different types of errors, allowing it to provide guidelines specifically tailored for each error type.

Could you please clarify how you accomplished such transformations from raw error messages to guidelines? Using LLM + ICL or rule-based methods? Thanks!

cdhx commented 6 months ago

Thank you for your interest in our work~

Transformation strategy

The transformation strategy is manually designed (you could also call it a rule-based method).

Specifically, examples can be found in Section 3.3 and Appendix G.

First, we would like to clear up one point of confusion: our method is not based on the "raw error messages". We do not wait for the environment to throw an error; instead, we detect errors ourselves, by parsing or other checks. Since the observation (self.obs) is the guideline when an error occurs, we put the guideline in the content of the error we raise. The outer function catches this error and uses its content as the observation for this step, so the error content is exactly the guideline we designed. To distinguish different errors, we place a dedicated guideline at the point where each error is detected. The following example may help:

        if action == 'add_count':
            if len(params) != 2:
                self.obs = f'add_count(count_var,new_var) should have 2 parameters. You have {len(params)} parameters. Please check again.'
                raise ValueError(self.obs)
            if params[0] not in self.pyql.var:
                self.obs = f'The first parameter in add_count must be a existing variable, but you used {params[0]}. Existing variables includes: {self.pyql.var}. Please choose proper variable and set again.'
                raise ValueError(self.obs)
            if not params[1].startswith('?'):
                self.obs = f'The second parameter in add_count must be a variable that starts with ?, but you used {params[1]}. Please check again.'
                raise ValueError(self.obs)
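
The catch-and-return pattern described above could be sketched roughly as follows. This is a minimal illustration and not the QueryAgent implementation: the `ToyEnv` class, its `check` and `step` methods, and the success message are hypothetical names introduced only for this example.

```python
class ToyEnv:
    """Hypothetical stand-in for the environment; not the QueryAgent code."""

    def __init__(self):
        self.obs = ""

    def check(self, action, params):
        # Detect the error ourselves and put the guideline into the exception,
        # instead of waiting for the environment to fail on its own.
        if action == "add_count" and len(params) != 2:
            self.obs = (
                "add_count(count_var,new_var) should have 2 parameters. "
                f"You have {len(params)} parameters. Please check again."
            )
            raise ValueError(self.obs)
        self.obs = f"{action} executed."  # assumed success observation
        return self.obs

    def step(self, action, params):
        # Outer function: catch the error and return its content (the
        # guideline) as the observation for this step.
        try:
            return self.check(action, params)
        except ValueError as err:
            return str(err)
```

Calling `ToyEnv().step("add_count", ["?x"])` returns the guideline text rather than raising, so the LLM only ever sees the designed guideline.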

Why we mention "raw error messages"

Note that we mention "raw error messages" in order to contrast our method with work that returns the raw error messages to the LLM for self-correction. Compared with such methods, ours has this feature: the LLM is only aware of the guidelines, so our method "shields the LLM from directly considering the original error".

That said, you could also build on the raw error messages, using regular expressions rather than parsing, but it would be a little more complex. It is just a difference in implementation.
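
As a rough sketch of that alternative, one could match raw error messages against regular expressions and fill in a guideline template. Everything here is an assumption made for illustration: the `RULES` table, the `to_guideline` helper, the raw message patterns, and the guideline wording are not taken from the QueryAgent code.

```python
import re

# Hypothetical (pattern, guideline template) pairs, for illustration only.
RULES = [
    (
        re.compile(r"takes (\d+) positional arguments? but (\d+) (?:was|were) given"),
        "The function expects {0} parameters, but you passed {1}. Please check again.",
    ),
    (
        re.compile(r"name '(\w+)' is not defined"),
        "The variable {0} does not exist. Please choose an existing variable.",
    ),
]


def to_guideline(raw_error):
    """Map a raw error message to a guideline using the first matching rule."""
    for pattern, template in RULES:
        match = pattern.search(raw_error)
        if match:
            return template.format(*match.groups())
    # Fallback when no rule matches; the wording is an assumption.
    return "The action failed. Please check it again."
```

Every new error type needs its own pattern that tracks the exact wording of the raw message, which is one reason this route is somewhat more complex than raising the guideline at the point of detection.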

Why we do not use an ICL-based method

We use the rule-based method because it is lightweight and precise. Besides, it does not require much experience. In fact, using LLM+ICL for this transformation still relies on "intrinsic self-correction", which requires the LLM to face the raw error directly. Although LLMs have some ability to perform intrinsic self-correction, many works also claim that LLMs may not be able to perform zero-shot self-correction without explicit guidance.

Large language models cannot self-correct reasoning yet (ICLR 2024).

Besides, as we said in the paper, an ICL-based method can only cover a limited number of cases. It also leads to lengthy prompts and leaves the LLM with the burden of working out which cases are most relevant.

More importantly, an ICL-based method also needs manually written examples, so this part of the effort is hard to avoid.

Considering the above issues, we implement this in a rule-based manner.

The aim of ERASER is to provide a conceptual framework for detecting and distinguishing errors based on environmental feedback, so that different errors can receive more specific treatment; it is not limited to our implementation or to the KBQA task. That means you can also design your own strategy (maybe LLM+ICL) for the task or scenario you face. That said, the implementation in this paper has already demonstrated the effectiveness of the framework.

zhoujz10 commented 6 months ago

Hi @cdhx, thanks for your detailed explanation and the valuable insights! Looking forward to your code!

cdhx commented 6 months ago

I am glad to be able to address your questions~

We are working tirelessly to release the code. Please stay tuned!