Hello,
While using the LATS code, I encountered what seems to be a bug and I have a question.
First, in the `evaluate_node` function, my understanding is that value prompting should be performed only for non-terminal children, and that terminal children should be assigned a value of 0. However, I suspect there is a bug in how the resulting values are mapped back to the children.
Specifically, values appear to be computed for `Lookup[slightly soluble in]` and `Lookup[soluble in alcohol]`, but are then assigned to the `Finish[alcohol, CdCl2]` nodes, as in the following example. Could you please check this?
generated actions
Cadmium Chloride is slightly soluble in alcohol and is also known as CdCl2. I need to confirm this information and provide the answer.
Action 2: Finish[alcohol, CdCl2]
Cadmium Chloride is slightly soluble in alcohol and also known as CdCl2. I need to search for what Cadmium Chloride is slightly soluble in and its other name.
Action 2: Finish[alcohol, CdCl2]
Cadmium Chloride is slightly soluble in alcohol and also known as CdCl2. I still need to confirm the solubility in alcohol and search for the other name.
Action 2: Lookup[slightly soluble in]
Cadmium chloride is slightly soluble in alcohol and also known as CdCl2. I need to confirm this information before providing the answer.
Action 2: Lookup[soluble in alcohol]
Cadmium Chloride is slightly soluble in alcohol and also known as CdCl2. I need to search for what Cadmium Chloride is slightly soluble in and its other name.
Action 2: Finish[alcohol, CdCl2]
(the two Lookup actions are the ones sent for value prompting)
votes: [0.2, 1.0, 0, 0]
updated values of child nodes
Node(depth=2, value=0.20, visits=0, thought=Cadmium Chloride is slightly soluble in alcohol and is also known as CdCl2. I need to confirm this information and provide the answer., action=Finish[alcohol, CdCl2], observation=Episode finished, reward = 0)
Node(depth=2, value=1.00, visits=0, thought=Cadmium Chloride is slightly soluble in alcohol and also known as CdCl2. I need to search for what Cadmium Chloride is slightly soluble in and its other name., action=Finish[alcohol, CdCl2], observation=Episode finished, reward = 0)
Node(depth=2, value=0.00, visits=0, thought=Cadmium Chloride is slightly soluble in alcohol and also known as CdCl2. I still need to confirm the solubility in alcohol and search for the other name., action=Lookup[slightly soluble in], observation=No more results.)
Node(depth=2, value=0.00, visits=0, thought=Cadmium chloride is slightly soluble in alcohol and also known as CdCl2. I need to confirm this information before providing the answer., action=Lookup[soluble in alcohol], observation=(Result 1 / 1) This salt is a hygroscopic solid that is highly soluble in water and slightly soluble in alcohol.)
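To make the expected behavior concrete, here is a minimal sketch of the mapping I had in mind. It is not taken from the repository: the `Node` fields, the `is_terminal` flag, and the `assign_votes` helper are all hypothetical names, and I am assuming the votes list contains one entry per non-terminal child, in order.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    # Hypothetical stand-in for the tree node; field names are assumptions.
    action: str
    is_terminal: bool = False
    value: float = 0.0


def assign_votes(children: List[Node], votes: List[float]) -> List[Node]:
    """Give each non-terminal child its own vote; terminal children get 0."""
    # Reset every child so terminal nodes (and any child without a vote) end at 0.
    for child in children:
        child.value = 0.0
    # Pair votes with the non-terminal children only, preserving order.
    non_terminal = [c for c in children if not c.is_terminal]
    for child, vote in zip(non_terminal, votes):
        child.value = vote
    return children


children = [
    Node("Finish[alcohol, CdCl2]", is_terminal=True),
    Node("Finish[alcohol, CdCl2]", is_terminal=True),
    Node("Lookup[slightly soluble in]"),
    Node("Lookup[soluble in alcohol]"),
]
assign_votes(children, [0.2, 1.0])
for child in children:
    print(child.action, child.value)
```

With this mapping, the votes 0.2 and 1.0 would land on the two Lookup nodes rather than on the two Finish nodes, which is the opposite of what the dump above shows.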
.
+++
Second, I am unable to find where the code implements the self-consistency reward and the hyperparameter lambda that weights the terms of the value function. Could you please point me to this part or explain how it is handled?
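For reference, this is roughly what I was expecting to find, based on my reading of the paper, where the value is (as I understand it) a weighted combination of the LM-generated score and a self-consistency score, V(s) = lambda * LM(s) + (1 - lambda) * SC(s). The sketch below is only my assumption of how that could look: the function names, the frequency-based self-consistency score, and the default `lam=0.5` are hypothetical and not taken from the repository.

```python
from collections import Counter
from typing import Dict, List


def self_consistency_scores(sampled_actions: List[str]) -> Dict[str, float]:
    """Score each distinct action by how often it was sampled (assumed definition)."""
    counts = Counter(sampled_actions)
    total = len(sampled_actions)
    return {action: count / total for action, count in counts.items()}


def combined_value(lm_score: float, sc_score: float, lam: float = 0.5) -> float:
    """Weighted mix of LM score and self-consistency score; lam is hypothetical."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    return lam * lm_score + (1.0 - lam) * sc_score


# The five sampled actions from the example above.
sampled = [
    "Finish[alcohol, CdCl2]",
    "Finish[alcohol, CdCl2]",
    "Lookup[slightly soluble in]",
    "Lookup[soluble in alcohol]",
    "Finish[alcohol, CdCl2]",
]
sc = self_consistency_scores(sampled)
print(combined_value(lm_score=1.0, sc_score=sc["Lookup[soluble in alcohol]"]))
```

If the repository handles this differently, or intentionally omits it for some tasks, a pointer to the relevant code would be very helpful.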
Thanks!