openfisca / openfisca-core

OpenFisca core engine. See other repositories for countries-specific code & data.
https://openfisca.org
GNU Affero General Public License v3.0
170 stars 75 forks source link

Add parameter values used in computations in tracer output #701

Closed guillett closed 6 years ago

guillett commented 6 years ago

Hi there!

I really enjoy OpenFisca, but here is a feature I would appreciate.

I investigate OpenFisca computations with http://localhost:9000/trace?situationId=588752fc3d4bb9bf7a656922

And I would love to have parameter values, in addition to variables values.

Context

I identify more as a:

sandcha commented 6 years ago

Context

With this simple example.json:

{
    "persons": {
        "Alicia": {
            "income_tax": {
                "2017-01": null
            }
        }
    },
    "households": {
        "_": {
            "parents": ["Alicia"]
        }
    }
}

The /trace on contry-model:

curl http://demo.openfisca.org/api/trace -H 'Content-Type: application/json' -d @example.json

Gives the following, (summarised) answer:

{
  "trace": {
    "income_tax<2017-01>": {
      "dependencies": [
        "salary<2017-01>"
      ], 
      "value": [
        0.0
      ]
    }, 
    "salary<2017-01>": {
      "dependencies": [], 
      "value": [
        0.0
      ]
    }
  }

Format

A list of formats come to mind (however at this point we're not sure what's technically feasible). All use the income_tax variable that depends on the taxes.income_tax_rate parameter.

Scenario 1)

taxes.income_tax_rate parameter in income_tax variable dependencies. The parameter value is set in specific entry "taxes.income_tax_rate<2017-01>". This entry differs from variable entry format: it doesn't contain dependencies.

{
  "trace": {
    "income_tax<2017-01>": {
      "dependencies": [
        "salary<2017-01>",
        "taxes.income_tax_rate<2017-01>"
      ], 
      "value": [
        0.0
      ]
    }, 
    "salary<2017-01>": {
      "dependencies": [], 
      "value": [
        0.0
      ]
    },
    "taxes.income_tax_rate<2017-01>": {
      "value": [
        0.15
      ]
    },
  }

Scenario 2)

taxes.income_tax_rate parameter in income_tax variable dependencies. No value listed for the parameter.

{
  "trace": {
    "income_tax<2017-01>": {
      "dependencies": [
        "salary<2017-01>",
        "taxes.income_tax_rate<2017-01>"
      ],
      "value": [
        0.0
      ]
    },
    "salary<2017-01>": {
      "dependencies": [], 
      "value": [
        0.0
      ]
    }
  }
}

Scenario 3)

taxes.income_tax_rate parameter in income_tax variable dependencies. The value of the parameter is given at the same place.

{
  "trace": {
    "income_tax<2017-01>": {
      "dependencies": [
        "salary<2017-01>",
        "taxes.income_tax_rate<2017-01>= 0.15"
      ], 
      "value": [
        0.0
      ]
    }, 
    "salary<2017-01>": {
      "dependencies": [], 
      "value": [
        0.0
      ]
    }
  }

Scenario 4)

taxes.income_tax_rate parameter in income_tax variable dependencies. The value of the parameter is given at the same place and the variable is updated to same format.

This is a breaking change for existing code on /trace.

{
  "trace": {
    "income_tax<2017-01>": {
      "dependencies": [
        "salary<2017-01> = 0.0",
        "taxes.income_tax_rate<2017-01> = 0.15"
      ], 
      "value": [
        0.0
      ]
    }, 
    "salary<2017-01>": {
      "dependencies": [], 
      "value": [
        0.0
      ]
    }
  }

Scenario 5)

income_tax variable dependencies entry becomes variables and parameters (with or without parameter value).

This is a breaking change for existing code on /trace.

{
  "trace": {
    "income_tax<2017-01>": {
      "dependencies": [
        "variables": [
          "salary<2017-01>"
        ],
        "parameters": [
          "taxes.income_tax_rate<2017-01>"
        ]
      ],
      "value": [
        0.0
      ]
    }, 
    "salary<2017-01>": {
      "variables": [],
      "parameters": [],
      "value": [
        0.0
      ]
    }
  }
}
guillett commented 6 years ago

Thanks a lot @sandcha.

If I had to pick a scenario now I would pick the first 1. However, I don't think picking an alternative now is important. To me, this looks like an implementation detail. I assume most of the work would not change between the different scenarios.

bonjourmauko commented 6 years ago

Hi @sandcha and @guillett.

If I understand correctly the last comment, as long as there is (contraints):

We're ok.

Question: Do we need to reflect them recursively, or just depth 1 is ok?

I'll invite @sandcha @fpagnoux and @Morendil to join into the technical details of a possible POC, although my current understanding is:

  1. If we do an upfront analysis and pickle it (ex: traversing AST or taking a look at the bytecode), we'd still need to inject/calculate periods and values dynamically at run time.

  2. If we do a pure dynamic analysis (ex: decorating functions to establish a call tree, we'll face the problem of parameters being passed on as input.

fpagnoux commented 6 years ago

Thanks @sandcha for the research work on the different possibilities!

I'd rule out Scenario 3 and 4, which would involve some annoying parsing to get the value of the parameter.

2 and 5 are don't contain the parameter values, which I think was requested for this feature.

However, I find 1 a little heavy and confusing between variables or parameters, so I'd be tempted by:

Scenario 5bis)

{
  "trace": {
    "income_tax<2017-01>": {
      "dependencies": {
        "variables": [
          "salary<2017-01>"
        ],
        "parameters": {
          "taxes.income_tax_rate<2017-01>": 0.15
        }
      },
      "value": [
        0.0
      ]
    }, 
    "salary<2017-01>": {
      "variables": [],
      "parameters": [],
      "value": [
        0.0
      ]
    }
  }
}

(The slight difference with 5 is that I use an object instead of an array for parameters, to be able to include the value)

fpagnoux commented 6 years ago

However, I think it's not easy at all to get the information we need

We are able to know which variables is called by a formula because it's a function call. When person('salary', period) is called, we can easily sneak in a call to the tracer. However, parameters(period).benefits.basic_income.amount is not a function call. parameters(period) is dict-like data structure. So we can't just replicate what we have already done for variables.

From there, I can see 3 possible strategies:

  1. Playing with magic methods:

    1. We can use __getattribute__ (magic method called when using the . to get an attribute) or __getattr__ (magic method called when trying to get an attribute that doesn't exist) to be able to sneak in tracer calls. This is doable, but:
    2. We need to be careful about performances, especially with __getattribute__. Functions calls are usually slower than simple attribute access.
    3. If we are basing parameters navigation on __getattr__, we loose autocompletion (in interative mode, for instance while debugging, my terminal cannot know that P.benefits exists). Maybe that's not a big deal, but this is to keep in mind.
  2. Building the graph statically, using @Morendil's work

    1. If we have the "static" dependency tree, and the period, we could infer the parameters used and their values. (EDIT: actually, not really, because we are not sure that the parameter is called for the period the formula is called for)
  3. Changing the way we access parameters.

    1. Several times, legislation writers were puzzled by the big differences in calls to calculate variables and parameters. parameters could be called with something like parameters('benefits.basic_income.amount', period)
    2. This may too big of a change for what we are trying to accomplish though...

~I'd be tempted to go with 2, as it has interesting externalities, but no strong certitude 🙂~ Actually, given the period issue, I'd try to get 1 working without too much perf overhead.

sandcha commented 6 years ago

And in a function, we can access local context with locals(). It might be possible to export them outside the local scope of the function.

guillett commented 6 years ago

@fpagnoux do you have a branch we can play with ? I have diffs between versions and I would not be surprised they come from parameter changes.

sandcha commented 6 years ago

✅ TODO: POC with calculated impact on performance