Don't re-calculate independent parameters

BenWoodworth commented 10 months ago

Currently an argument iterator is re-calculated when a previous parameter is iterated:

parameterize {
    val parameter1 by parameter {
        println("Calculated parameter1")
        listOf('a', 'b')
    }

    val parameter2 by parameter {
        println("Calculated parameter2")
        listOf(1, 2)
    }

    println("$parameter1$parameter2")
}

Will print:

Calculated parameter1
Calculated parameter2
a1
a2
Calculated parameter2
b1
b2

Here parameter2 is being calculated more than once, which may not be a lot here, but calculating once for every combination parameters before it could easily become tens of thousands of recalculations when only one is necessary.

It behaves this way because parameterize cautiously assumes that iterating a parameter (parameter1) will require future parameters iterators to be re-calculated (e.g. if parameter2 used parameter1)

It's possible to track when parameters are declared and used, so it should be possible to see that parameter2 is declared without parameter1 being used, so parameter2 does not need to be re-calculated after iterating parameter1's argument to 'b'.

BenWoodworth commented 10 months ago

In v0.1, there are 3 ways parameterize can probe parameter usage to determine independence:

When a parameter is first Declared
- (when Parameter.provideDelegate() is invoked)
When a parameter's argument is first Calculated
- (when ParameterDelegate.getValue() is invoked, and thus arguments.iterator())
When a parameter's argument is first Usable
- (when ParameterDelegate.getValue() returns after calculating it)

With these in mind, let:

_D indicate the parameter being declared,
_C being calculated, and
_U becoming usable

Given these, there is only a limited number of ways parameterize can observe the order of these events to determine if a parameter is independent, in order for it to be optimized.

[aD, bD, aC, bC, bU, aU]
```kotlin parameterize { lateinit var getB: () -> Int val a by parameter { listOf(getB()) } // depends on b val b by parameterOf(1) getB = { b } a } ```
[aD, bD, aC, aU, bC, bU]
```kotlin parameterize { var aValue = -1 val a by parameterOf(1) val b by parameter { listOf(aValue) } // depends on a aValue = a b } ```
[aD, bD, bC, aC, aU, bU]
```kotlin parameterize { val a by parameterOf(1) val b by parameter { a } // depends on a b } ```
[aD, bD, bC, bU, aC, aU]
```kotlin parameterize { var bValue = -1 val a by parameter { listOf(bValue) } // depends on b val b by parameterOf(1) bValue = b a } ```
[aD, aC, aU, bD, bC, bU]
```kotlin parameterize { val a by parameterOf(1) val b by parameterOf(a) // depends on a b } ```

Deriving code

```kotlin fun List.permutations(): Sequence> = if (size == 1) { sequenceOf(this) } else { this.asSequence().flatMapIndexed { i, first -> val tail = subList(0, i) + subList(i + 1, size) tail.permutations().map { tailPermutation -> listOf(first) + tailPermutation } } } // If the elements appear in order, even with others separating them fun List.inOrder(vararg elements: T): Boolean = elements.asList() .zipWithNext() .all { (a, b) -> indexOf(a) < indexOf(b) } listOf("aD", "bD", "aC", "bC", "aU", "bU").permutations() .filter { it.inOrder("aD", "bD") } // a is declared before b .filter { it.inOrder("aD", "aC", "aU") && it.inOrder("bD", "bC", "bU") } // declare -> calculate -> use .filterNot { it.inOrder("aC", "bC", "aU", "bU") } // impossible for b's getValue() to start during a's, then end after .filterNot { it.inOrder("bC", "aC", "bU", "aU") } // ^ and vice versa ```

However, for every one of these possibilities, it is possible for one parameter to be dependent on the other (with examples under each if you expand them). This means that with the behavior in v0.1, it is impossible for parameterize to determine that two parameters are independent, making it impossible optimization without changing the lazy argument delayed calculation behavior.

BenWoodworth commented 10 months ago

Now instead, consider how it could work if arguments were immediately calculated upon Declaration.

Probing when the argument is calculated is no longer part of getValue(). And, in fact, the calculation would not even need to be considered as part of the declaration, since, the same as any other tracked parameter Usages, those in the calculation would occur before provideDelegate() returns.

This means all that needs to be considered is declaration (provideDelegate()) and when a parameter is first used (getValue()):

This gives us 3 possibilities:

[aD, bD, aU, bU]
[aD, bD, bU, aU]
[aD, aU, bD, bU]

Deriving code

```kotlin fun List.permutations(): Sequence> = if (size == 1) { sequenceOf(this) } else { this.asSequence().flatMapIndexed { i, first -> val tail = subList(0, i) + subList(i + 1, size) tail.permutations().map { tailPermutation -> listOf(first) + tailPermutation } } } // If the elements appear in order, even with others separating them fun List.inOrder(vararg elements: T): Boolean = elements.asList() .zipWithNext() .all { (a, b) -> indexOf(a) < indexOf(b) } listOf("aD", "bD", "aU", "bU").permutations() .filter { it.inOrder("aD", "bD") } // a is declared before b .filter { it.inOrder("aD", "aU") && it.inOrder("bD", "bU") } // declare -> use ```

So, in the first two cases, a and b must be independent, since they are both declared with their arguments calculated before either exposes its argument to possibly be used by the other. And as for 3, although a and b could be independent, it's impossible to track whether a's value ends up affecting how b is calculated.

BenWoodworth commented 10 months ago

There is one caveat, though I'm not sure how much of an issue it'd be. I can think of cases where an iterator postpones its reading of values from the parameterize scope, and thus produces a different second argument based on a later parameter's argument. For example:

parameterize {
    var bValue = -1

    val a by parameter(
        sequence {
            yield(1)
            yield(bValue)
        }.asIterable()
    )

    val b by parameterOf(2, 3)

    bValue = b
}

Here, a's second argument will be 2 because it takes from b's argument. I don't think this should be an issue though, since I don't think it's dependent in a way that'll invalidate a's arguments when b changes. Here, bValue is captured from a past scope. If we can show that a's iterator will produce different arguments a second time through, that would be enough to demonstrate that optimizing and re-using the iterator isn't possible. But especially since iterators having side effects is specifically not supported, I'm not sure if this should be a worry. Especially with how contrived it is, and likely even more contrived to break a "re-use the arguments iterable" optimization.

BenWoodworth / Parameterize

Don't re-calculate independent parameters #3