openfisca / openfisca-core

OpenFisca core engine. See other repositories for countries-specific code & data.
https://openfisca.org
GNU Affero General Public License v3.0

Support resilient indexing #1093

Open nikhilwoodruff opened 2 years ago

nikhilwoodruff commented 2 years ago

Fancy indexing currently breaks if some of the lookup values are not present in the parameter. For example, take the following parameter for a benefit amount defined per US region and number of household members.

CA:
  1:
    2021-01-01: 50
  2:
    2021-01-01: 60
US:
  1:
    2021-01-01: 40
  2:
    2021-01-01: 45

Many US parameters are defined like this: a default for a large region, plus overrides for specific regions. The current solution, defining a specific mapping for each parameter, leads to a lot of inelegant code. Here's an idea for a feature in Core that could completely solve this: possible index variables. E.g. if that same parameter had some extra metadata:

metadata:
  index:
    - state
    - region

... then when indexing, a formula could fetch the value for each entity by first attempting to look up the state value, then the region value if that fails. I'd be happy to have a go at this if it would be useful and in the spirit of Core features, and if we could specify it in a bit more detail. @benjello @MattiSG how does this idea sound, and are there use cases in other countries too?
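The lookup order described above could be sketched as follows. This is only an illustration under stated assumptions: the parameter is held as a plain nested dict for a single date, plain lists stand in for the arrays a formula would receive, and `resilient_lookup` is a hypothetical helper, not an existing openfisca-core function.

```python
# Hypothetical in-memory form of the parameter at 2021-01-01:
# first key is a state or region code, second key is the household size.
BENEFIT = {
    "CA": {1: 50, 2: 60},
    "US": {1: 40, 2: 45},
}


def resilient_lookup(parameter, index_variables, household_size):
    """Return one value per household, trying each index variable in order."""
    results = []
    for i, size in enumerate(household_size):
        for keys in index_variables:  # e.g. state first, then region
            if keys[i] in parameter:
                results.append(parameter[keys[i]][size])
                break
        else:
            # No index variable matched: fail loudly instead of guessing.
            raise KeyError(f"no index variable matched for household {i}")
    return results


state = ["CA", "NY", "CA"]
region = ["US", "US", "US"]
household_size = [1, 2, 2]

amounts = resilient_lookup(BENEFIT, [state, region], household_size)
```

Here the second household has no `NY` entry, so its value comes from the `US` branch, while the two `CA` households use the state-level override.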

MattiSG commented 2 years ago

Thanks @nikhilwoodruff for this suggestion!

I find it elegant, but it definitely seems like a major step away from OpenFisca's current “make everything explicit” mentality. In particular, relying on metadata seems risky since that field has not been normalised yet, and inappropriate since in the given usage it would definitely be data not meta-data :wink:

Naming considerations aside, I'm just extremely wary of re-introducing inference-like systems that were painstakingly removed (see #458). In this case, the mechanism would be explicit, so it is probably not at the same level.

Would a default value for list-type parameters work for you? In the example you gave, something like:

my_parameter:
  values:
    CA:
      1:
        2021-01-01: 50
      2:
        2021-01-01: 60
  default:
    1:
      2021-01-01: 40
    2:
      2021-01-01: 45

It would be very helpful if you could provide us with clearer, real-life examples along with references to the implemented legislation 🙂 We want to document precisely how any OpenFisca feature is immediately useful for modelling existing rules.
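To make the `values` / `default` layout concrete, here is a rough sketch of how a lookup against it might behave. It assumes the parameter has already been resolved to a nested dict for one date; `lookup_with_default` is a hypothetical name, not part of openfisca-core.

```python
# Hypothetical dict form of my_parameter at 2021-01-01.
MY_PARAMETER = {
    "values": {"CA": {1: 50, 2: 60}},
    "default": {1: 40, 2: 45},
}


def lookup_with_default(parameter, state, household_size):
    """Use the state-specific branch if it exists, else the default branch."""
    branch = parameter["values"].get(state, parameter["default"])
    return branch[household_size]


ca_amount = lookup_with_default(MY_PARAMETER, "CA", 2)
ny_amount = lookup_with_default(MY_PARAMETER, "NY", 1)
```

The fallback is spelled out in the parameter file itself, so nothing is inferred from other variables.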

benjello commented 2 years ago

@nikhilwoodruff @MattiSG could we just use a get method for this particular case of indexing? Instead of

my_parameter[index]

we could use

my_parameter.get(index, default = "default")

or even

my_parameter.get(index, default = default_index)

where default_index should be a valid index; otherwise, an error should be returned.
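A minimal sketch of what such a get method could look like, assuming a toy class in place of the real parameter node. `SketchParameter` and its internals are hypothetical and only illustrate the proposed behaviour: fall back to the entry at `default` for missing keys, and fail if `default` is not itself a valid index.

```python
class SketchParameter:
    """Toy stand-in for a parameter node; not the openfisca-core class."""

    def __init__(self, children):
        self._children = children

    def __getitem__(self, index):
        # Current behaviour: any missing key raises KeyError.
        return [self._children[key] for key in index]

    def get(self, index, default):
        # The fallback must itself be a valid index.
        if default not in self._children:
            raise KeyError(f"default index {default!r} is not a valid index")
        fallback = self._children[default]
        return [self._children.get(key, fallback) for key in index]


my_parameter = SketchParameter({"CA": 50, "US": 40})
values = my_parameter.get(["CA", "NY"], default="US")
```

With this shape, `my_parameter[["CA", "NY"]]` still fails on the missing `NY` key, so existing formulas keep their strict behaviour and the fallback is opt-in at each call site.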