axone-protocol / axoned

ā›“ļø Axone blockchain šŸ’«
https://axone.xyz
Apache License 2.0
161 stars 121 forks source link

šŸ§  Logic: Non-determinism of `NewVariable` in prolog interpreter #690

Closed bdeneux closed 1 week ago

bdeneux commented 1 month ago

When the functor/3 predicate is called multiple times, it returns different results. This is demonstrated in the following example:

āžœ  ~ axoned query logic ask "functor(Structure,fish,4)." && axoned query logic ask "functor(Structure,fish,4)."
answer:
  has_more: false
  results:
  - error: ""
    substitutions:
    - expression: fish(_63,_64,_65,_66)
      variable: Structure
  variables:
  - Structure
gas_used: "2216"
height: "3721"
user_output: ""
answer:
  has_more: false
  results:
  - error: ""
    substitutions:
    - expression: fish(_128,_129,_130,_131)
      variable: Structure
  variables:
  - Structure
gas_used: "2216"
height: "3721"
user_output: ""

The inconsistency arises because functor/3 uses the built-in NewVariable() function to allocate a new variable for each term's arity. This function increments a global counter each time a new variable is created. However, this counter is stored in the node's memory and is not synchronized across nodes. As a result, the counter can be incremented on one node and then called again through block validation on another node, leading to inconsistent results.

https://github.com/axone-protocol/prolog/blob/d6ea3496c94deb1dbb755c4e2c78914507c67d0b/engine/builtin.go#L291-L293

https://github.com/axone-protocol/prolog/blob/d6ea3496c94deb1dbb755c4e2c78914507c67d0b/engine/variable.go#L9-L23

var varCounter int64

func lastVariable() Variable {
        return Variable(varCounter)
}

// Variable is a prolog variable.
type Variable int64

// NewVariable creates a new anonymous variable.
func NewVariable() Variable {
        n := atomic.AddInt64(&varCounter, 1)
        return Variable(n)
}

This issue is not limited to the functor/3 predicate. It also affects other predicates that allocate new variables, such as length/2, and when variables are displayed as result solutions.

āžœ  ~ axoned query logic ask "length(List, 10)."
answer:
  has_more: false
  results:
  - error: ""
    substitutions:
    - expression: '[_258,_259,_260,_261,_262,_263,_264,_265,_266,_267]'
      variable: List
  variables:
  - List
gas_used: "2216"
height: "4208"
user_output: ""
āžœ  ~ axoned query logic ask "length(List, 10)."
answer:
  has_more: false
  results:
  - error: ""
    substitutions:
    - expression: '[_331,_332,_333,_334,_335,_336,_337,_338,_339,_340]'
      variable: List
  variables:
  - List
gas_used: "2216"
height: "4209"
user_output: ""

Proposed solution(s)

To avoid a complete engine refactor, I propose adding a new function to reset the counter. This function should be called each time a new interpreter is instantiated. Since each GRPC query or execution instantiates a new interpreter, all nodes should align with the same counter count, assuming each internal engine call is deterministic.

func ResetVariableCounter() {
    atomic.StoreInt64(&varCounter, 0)
}

It should result with

āžœ  ~ axoned query logic ask "functor(Structure,fish,4)." && axoned query logic ask "functor(Structure,fish,4)."
answer:
  has_more: false
  results:
  - error: ""
    substitutions:
    - expression: fish(_63,_64,_65,_66)
      variable: Structure
  variables:
  - Structure
gas_used: "2216"
height: "3721"
user_output: ""
answer:
  has_more: false
  results:
  - error: ""
    substitutions:
    - expression: fish(_63,_64,_65,_66)
      variable: Structure
  variables:
  - Structure
gas_used: "2216"
height: "3721"
user_output: ""

āš ļø Limits

While investigating the implementation of the functor/3 predicate, I noticed a code snippet that could lead to high memory consumption and potentially halt the node since it's the same used in the length/2 predicate : https://github.com/axone-protocol/axoned/issues/618#issuecomment-2152648835. If a large arity count is provided, as in the following example, a loop is executed that calls NewVariable() for each arity, which can lead to a node crash.

āžœ  ~ axoned query logic ask "functor(Structure,fish,4000000000000)."
amimart commented 1 month ago

@bdeneux Thanks for this complete analysis! šŸ™

The proposed solution seems ideal to me assuming the order of variable creation is deterministic as you mentioned, should be the case from what AFAIS but a deeper look should be welcomed.

Regarding the implementation If I understand well we need to expose this varCounter right ? This is a bit ugly but I think we can go this way at first and design a cleaner way later in the perspective of integrating our changes on the upstream repository.

bdeneux commented 1 month ago

@amimart, I agree that the current workaround is not the most elegant solution. I consider two possibles approaches to address the variable counter management issue:

  1. Introducing a Global ResetCounter Function or Exposing the varCounter Directly: This approach would delegate the responsibility of resetting the counter to the caller of the interpreter. While this might seem like a straightforward solution, it does raise concerns about the proper management of the counter's lifecycle. In our specific use case, this might not pose a significant issue since the interpreter is indeed reinstantiated with each query call. However, it's not the most robust or foolproof method for managing the counter's state.

  2. Embedding the Counter State within the Interpreter or VM Level: Ideally, incorporating the counter state directly into the interpreter or VM would ensure that the counter is automatically reset with each new instantiation of the interpreter. Implementing this solution would likely require a comprehensive refactoring effort, particularly to facilitate access to the VM whenever a new variable is created. This is especially challenging in test scenarios where a VM instance might not be readily accessible at the time NewVariable() is invoked.

As you've said, I think we can proceed with the first solution for now and reserve further investigation for a proper implementation later, one that includes a VM implementation more suited to our needs and addresses all other issues that need to be resolved.