iza-institute-of-labor-economics / gettsim

The GErman Taxes and Transfers SIMulator
https://gettsim.readthedocs.io/
GNU Affero General Public License v3.0

BUG: Endogenous creation of `bg_id` #763

Open MImmesberger opened 5 months ago

MImmesberger commented 5 months ago

The current creation of bg_id is wrong because children with eigenbedarf_gedeckt=True will always have a different bg_id than their parents, even if the parents have eigenbedarf_gedeckt=True as well.

Proposed solution

Parents should have the same bg_id as their children if all of them have eigenbedarf_gedeckt=True.
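A minimal sketch of the proposed rule, assuming persons are given as plain dicts with the hypothetical keys `p_id`, `fg_id`, `kind` and `eigenbedarf_gedeckt`. This is not GETTSIM's implementation; it only illustrates that a child leaves the parental BG when it covers its own needs and the fg as a whole does not.

```python
from collections import defaultdict


def assign_bg_id(persons):
    """Return {p_id: bg_id} for a list of person dicts.

    Each dict has the hypothetical keys p_id, fg_id, kind (is a child)
    and eigenbedarf_gedeckt.
    """
    # Collect the needs-covered flags of all members of each fg.
    flags_by_fg = defaultdict(list)
    for p in persons:
        flags_by_fg[p["fg_id"]].append(p["eigenbedarf_gedeckt"])

    ids = {}  # maps a BG key to a consecutive integer id
    bg_id = {}
    for p in persons:
        whole_fg_covered = all(flags_by_fg[p["fg_id"]])
        # A child leaves the parental BG only if it covers its own needs
        # while the fg as a whole does not.
        if p["kind"] and p["eigenbedarf_gedeckt"] and not whole_fg_covered:
            key = ("child", p["p_id"])
        else:
            key = ("fg", p["fg_id"])
        bg_id[p["p_id"]] = ids.setdefault(key, len(ids))
    return bg_id
```

With an fg in which everyone has `eigenbedarf_gedeckt=True`, all members end up with the same id; with parents at `False` and one child at `True`, the child gets an id of its own.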

hmgaudecker commented 5 months ago

Sounds okay as a short-term hack. But we'll have to be careful that

  1. both parents always have the same value of eigenbedarf_gedeckt
  2. we treat children with eigenbedarf_gedeckt == False correctly (thinking of patchwork here).

Ultimately, however, eigenbedarf_gedeckt really needs to go -- it is just a synonym for "separate bg's" and with the suggested change, its name is not even related closely to the content anymore.

I am rather convinced now that we won't be able to solve all of this with one pass of GETTSIM -- unless we find a way of working with temporary/candidate variables in a subgraph.

What I mean is that we'll need to check multiple things to determine the bg_id, which depend on the configurations of bg's within a household. So we'll have to check various candidate configurations (children out? Bürgergeld even relevant? Possible wthh_ids?), which determine the payments. Example from going through budget sets of exemplary households:

I noticed the following problem:

  • At the point where the household switches from "parent on ALG 2 + child on WG" to "entire household on WG", household income drops quite a bit (even though gross income rises).
  • The reason is that, for the parent, ALG 2 for one person is compared with WG for one person. At some point WG becomes more favourable (fairly early on, in fact, because of the Kindergeld transfer). However, Wohngeld is then not paid out for two one-person households (as assumed in the Günstigerprüfung); instead it is calculated for both persons together, which is less and reduces disposable income.

In the end, this is sort of the same as calculating candidate benefits before checking eligibility / priority. We solve that by giving things names like xxx_vor_vorrang_check. But that is ugly and not feasible for this much more complex case. There might be a better way of solving both of these.

My suggestion would be to do a quick hack here as you suggest, do the transition to a hierarchical scheme, and revisit.

MImmesberger commented 5 months ago

For future reference, here is the logic that I'm using in the policy project:

The problem of setting the bg_id boils down to 1) calculating eigenbedarf_gedeckt and then setting bg_id correctly and then 2) making sure that no BG covers their own needs individually but not together (because Wohngeld for the group is lower than the sum of the individual Wohngeld claims).

I used the following procedure:

  1. Set bg_id to be different for each child and the parents (i.e. assuming that each child covers its own needs, e.g. by setting eigenbedarf_gedeckt=True for everyone).
  2. Calculate whether Eigenbedarf is covered given the bg_id from (1) and then set another candidate bg_id based on the results.
  3. Then, calculate for this bg_id from (2) again whether the total Eigenbedarf is covered.
  4. Set the final bg_id based on the following logic:

Potential Outcomes:

To calculate whether Eigenbedarf is covered, I use the wohngeld_kinderzuschl_vorrang_bg target of GETTSIM.

This is still not entirely correct once there are more than two children who are not entirely symmetric (e.g. different age or allowances). I.e. it could still happen that each child (+ parents) individually satisfies their own needs, but in certain combinations they don't.

hmgaudecker commented 5 months ago

Sounds very good! I'd like to suggest a different strategy. I won't describe the algorithm quite as elaborately as you did, but it will be better to draw that on a board anyhow.

  1. Calculate Bürgergeld claim for the whole fg, excluding any child income except for kindergeld.

    1. If there is no claim, set a flag to ignore these observations in all the rest. Everything below can never matter then; better not to have to think about it.
    2. If there is one, continue with the logic in 2., i.e., we focus on cases where someone is eligible for Bürgergeld for sure.
  2. Check whether some children cover their needs and set bg_ids. Do not use wohngeld_kinderzuschl_vorrang_bg for this check but something like what Christian posted here.

    The reason is that we do not want to have Wohngeld in that check. And Kinderzuschlag even less so (can only be paid if living with adults in the same bg). If children are taken out of a bg only because of Kinderwohngeld, doing so is optional and probably quite rare.

    When children cover their needs, set a flag that we do not need to do any Bürgergeld priority checks for them below / they will always be in the wthh which does not receive Bürgergeld. Set one bg_id per child; this means we should have stuff like regelbedarf_m at the individual level (#668).

  3. Calculate Wohngeld. In order to prevent the issue quoted above, we might need to do that twice:

    1. Setting the wthh_id given the current configuration of bg_ids, assuming that the "main" bg_id (parents, plus children who cannot cover their needs) receive Bürgergeld.
    2. Setting the wthh_id assuming that the entire fg does not receive Bürgergeld

    Then check which of the two is more favourable / allowed at all.

    We need to do very careful testing here as this seems like the most fragile step, where weird things might crop up. Say the check in 1. shows a Bürgergeld claim, but then children are taken out because of their income and suddenly the parents don't have a claim anymore for themselves? This might not get rid of all drops in income as noted above -- we should do a lot of budget graphs here. I just find it very hard to think through all of that in an abstract sense.

Anyhow, I hope this prevents the circular logic (split up fg -- do individual checks in which nobody is eligible for Bürgergeld -- merge again -- split up fg -- ...) that plagued us in that issue quoted above because of the up-front check in 1.i. It also gets rid of the issue you mention in the original bug report above.
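The three steps can be sketched as a small control-flow skeleton. Everything here is an assumption made for illustration: persons are plain dicts with the hypothetical keys `p_id` and `kind`, and `fg_has_claim` and `child_covers_needs` are caller-supplied stand-ins for the actual benefit calculations. The sketch only returns the two candidate configurations from step 3; which one is chosen is left open.

```python
def candidate_regimes(fg, fg_has_claim, child_covers_needs):
    """Return the candidate benefit regimes for one fg, or None.

    fg is a list of person dicts with the hypothetical keys p_id and kind.
    fg_has_claim and child_covers_needs are caller-supplied stand-ins for
    the actual benefit calculations.
    """
    # Step 1: Bürgergeld claim of the whole fg (child income other than
    # Kindergeld is assumed to be left out inside fg_has_claim).
    if not fg_has_claim(fg):
        return None  # flag the fg as irrelevant for everything below

    # Step 2: children who cover their own needs leave the parental BG.
    out = {p["p_id"] for p in fg if p["kind"] and child_covers_needs(p)}

    # Step 3: the children in `out` are always in the Wohngeld group; the
    # remaining members either receive Bürgergeld (i) or Wohngeld too (ii).
    with_buergergeld = {
        p["p_id"]: "wohngeld" if p["p_id"] in out else "buergergeld" for p in fg
    }
    all_wohngeld = {p["p_id"]: "wohngeld" for p in fg}
    return [with_buergergeld, all_wohngeld]
```

Returning `None` in step 1 plays the role of the "ignore these observations" flag, so the later steps never see fgs without a claim.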

MImmesberger commented 5 months ago

That's great, it's much easier to think about the problem this way! In my algorithm above, I inadvertently tried to account for Kinderwohngeld which made this super complex.

I think (1.i) doesn't work. The whole fg may not be eligible for ALG2 because children receive high Unterhalt, but in the end, parents may be eligible for it after children were removed from the BG. But your algorithm doesn't rely on this.

The rest should work, but it hinges on the Bürgergeld Regelbedarfe to be a linear function of the children included/excluded in the BG (i.e. sum of Regelbedarf of two children BGs = sum of Regelbedarf of one BG of the same two children) which is true I think but we may have to keep that in mind for the future.
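The additivity assumption can be written down as a simple check. This is only a sketch: `regelbedarf_of_bg` is a hypothetical stand-in that maps a list of children (represented by their ages) to a total Regelbedarf, and the amounts in the two example functions are made up.

```python
import math


def is_additive(regelbedarf_of_bg, children):
    """Check whether splitting children into single-child BGs changes the total."""
    joint = regelbedarf_of_bg(children)
    separate = sum(regelbedarf_of_bg([child]) for child in children)
    return math.isclose(joint, separate)


# Stand-in with made-up amounts that depend only on each child's age.
def per_child_rate(children):
    return sum(300.0 if age < 6 else 400.0 for age in children)


# Stand-in that violates additivity: a flat supplement per BG.
def with_bg_supplement(children):
    return per_child_rate(children) + 50.0
```

Any component that is paid once per BG rather than once per person would make the check fail, which is the case to keep in mind for the future.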

I would slightly adjust the wording of (3): After doing (2) we always end up with exactly two candidate wthh_id configurations: i) the group of children that covered their needs in (2) receives Wohngeld, parents and the other children don't (different wthh_id), ii) the group of children that covered their needs in (2) receives Wohngeld, parents and the other children do as well (same wthh_id).

hmgaudecker commented 5 months ago

Cool!

No time right now, but: How about leaving child income out in 1. ? Likely means we'll have to do 3.ii, but should do the job?

hmgaudecker commented 5 months ago

I edited the above algorithm using that idea. Should be clearer than discussing back and forth here.

hmgaudecker commented 5 months ago

The rest should work, but it hinges on the Bürgergeld Regelbedarfe to be a linear function of the children included/excluded in the BG (i.e. sum of Regelbedarf of two children BGs = sum of Regelbedarf of one BG of the same two children) which is true I think but we may have to keep that in mind for the future.

I don't think so. Each child is its own bg (different from Wohngeld!), but that never matters because by the previous check, these kids do cover their needs.

MImmesberger commented 5 months ago

Say the check in 1. shows a Bürgergeld claim, but then children are taken out because of their income and suddenly the parents don't have a claim anymore for themselves?

I think this is what I meant when I was saying that it works because Regelbedarf is linear in BG members (just didn't realise that Bürgergeld is the relevant object here, not Regelbedarf). This situation can happen since anz_kinder_bis_17_bg determines the relevant SGB II income but I would argue in this situation the whole fg gets Wohngeld?

But of course, you're right. There are probably corner cases that we haven't thought about yet.

hmgaudecker commented 5 months ago

Bürgergeld == Regelbedarf + eligibility check, right? Just need to be careful whether we need the check at some point or not.

I would argue in this situation the whole fg gets Wohngeld?

Hopefully :smile:

MImmesberger commented 5 months ago

Let me elaborate on step 3 a bit more.

When leaving step 2, the bg_id depends on the wthh_id to the extent that we want children to be in the same BG as their parents if they have the same wthh_id (no matter the results from step 2!). If wthh_id is different (i.e. parents receive Bürgergeld, some children receive Wohngeld), the bg_id should be different.

So the crucial step is determining who receives Wohngeld and thereby setting wthh_id. We know that every child who has covered their needs individually must be in the Wohngeld group (Vorrangprüfung). For parents and the other children it's unclear. There are two candidate constellations:

  1. The parents and the children who did not cover their needs individually receive Wohngeld. Hence, they have the same wthh_id as the children who do cover their own needs and they also share a bg_id.
  2. The parents (and the children who did not cover their needs individually) receive Bürgergeld. They have a different wthh_id and bg_id than the other children in the household.

The tricky thing now is to decide which of the two options is the correct one. I propose to do this the following way:

  1. Set the wthh_ids and bg_ids for the two scenarios above. For each of them, do the Vorrangprüfung and calculate transfer income of the fg (ALG2+WoG+KiZ). The Vorrangprüfung computes whether the BG covers its own needs via own income.
  2. If the Vorrangprüfung implies that for candidate (2), needs are covered, the parental BG is not eligible for ALG2 and, hence, we accept candidate (1) as the solution.
  3. If the Vorrangprüfung does not imply that needs are covered, we do a Günstigerprüfung, i.e. we compare transfer income for both candidates and pick the solution that maximizes household/fg income.
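The decision rule in the three steps above can be sketched as a single function. The inputs are assumed to come from two separate calculations, one per candidate; the function name and arguments are hypothetical, and the tie-breaking rule is an assumption since it is not specified above.

```python
def choose_candidate(needs_covered_under_2, transfers_1, transfers_2):
    """Return 1 or 2, the number of the accepted candidate.

    needs_covered_under_2: result of the Vorrangprüfung for the parental BG
        in candidate (2).
    transfers_1, transfers_2: total transfer income of the fg
        (ALG2 + WoG + KiZ) under each candidate.
    """
    # Vorrangprüfung: a BG covering its needs cannot receive ALG2.
    if needs_covered_under_2:
        return 1
    # Günstigerprüfung: otherwise take the candidate with higher transfers.
    # Ties go to candidate (2) here; this tie rule is an assumption.
    return 1 if transfers_1 > transfers_2 else 2
```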

hmgaudecker commented 5 months ago

Thanks! Sounds very exhaustive indeed. I need to work through that very slowly at some point.

Would you be able to come up with some test cases that illustrate these points? That would help clarity tremendously. Without knowing exactly what we are trying to achieve, it is just hard to follow multi-step reasoning.

If it helps, add intermediate results for test cases. If there are multiple possible outcomes for a test (you mention it is unclear in some cases what should happen), leave those open and it will be great to discuss them.

MImmesberger commented 5 months ago

I just realised that the Günstigerprüfung in step 3 of the algorithm that you proposed is what makes this complicated. When we only model the Vorrangprüfung, the algorithm above should yield correct results.

The Günstigerprüfung (=people can get Wohngeld already if the transfer is higher than BüG) is, however, important to get rid of jumps of transfer income at the Eigenbedarf threshold. Then, setting the bg_id and wthh_id gets complicated once we have multiple fgs in one household as we want to set the IDs in a way such that they maximize household income. Then, we would need to compare multiple wthh_id candidates on the household level and we're back with an undetermined number of GETTSIM calls.

One solution would be to do the Günstigerprüfung on the fg level (not on the household level). The downside is that this creates jumps in the disposable income of the household if there are several fgs in the household (because what maximizes income of the fg level does not necessarily maximize income at the hh level; think, for example, about two fgs and for both individually, it's better to receive Wohngeld than BüG, but they're eligible for both. Combining them may make them worse off.).
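A toy example of that downside, with made-up monthly amounts for two fgs in one household. The assumption (as in the example quoted further up) is that Wohngeld is calculated jointly for everyone in the Wohngeld group, so the joint amount is lower than the sum of the individual ones.

```python
from itertools import product

# Made-up monthly amounts for two fgs sharing one household.
BUERGERGELD = {"fg_a": 250.0, "fg_b": 250.0}
# Wohngeld depends on which fgs form the Wohngeld household together.
WOHNGELD = {
    frozenset(): 0.0,
    frozenset({"fg_a"}): 300.0,
    frozenset({"fg_b"}): 300.0,
    frozenset({"fg_a", "fg_b"}): 450.0,
}


def household_transfers(wohngeld_fgs):
    """Total transfers if the fgs in wohngeld_fgs get Wohngeld, the rest BüG."""
    buergergeld = sum(v for fg, v in BUERGERGELD.items() if fg not in wohngeld_fgs)
    return buergergeld + WOHNGELD[frozenset(wohngeld_fgs)]


# fg-level check: each fg compares its own BüG with Wohngeld computed alone.
fg_level = {fg for fg in BUERGERGELD if WOHNGELD[frozenset({fg})] > BUERGERGELD[fg]}

# hh-level check: enumerate every assignment of fgs to Wohngeld or BüG.
fgs = sorted(BUERGERGELD)
options = [
    {fg for fg, wg in zip(fgs, choice) if wg}
    for choice in product([False, True], repeat=len(fgs))
]
hh_level = max(options, key=household_transfers)
```

With these numbers the fg-level check sends both fgs to Wohngeld, while the enumeration finds an assignment with higher household transfers. The enumeration also shows why the household-level check scales badly: the number of assignments doubles with every additional fg.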

MImmesberger commented 4 months ago

[Image: dagitty-model-2]

(... are placeholders for many other functions that come in between which I won't touch)

This is my current plan. The three key targets are wthh_id_endogen, bg_id_endogen and wohngeld_kinderzuschl_statt_arbeitsl_geld_2_endogen.

If the user does not set wthh_id and bg_id themselves, GETTSIM creates the IDs such that children who can cover their needs are not part of the parental BG (this is basically the status quo, just with the adjustment that if the whole fg covers their needs, they form a BG again).

The wohngeld_kinderzuschl_statt_arbeitsl_geld_2 bool must be set by the user, even if wthh_id and bg_id are provided. There are ways to rationalise wohngeld_kinderzuschl_statt_arbeitsl_geld_2 from the inputs if there is only one fg in the household (I think) but it's not possible in general.

The incorrect Günstigerprüfung in the current version drops out without replacement, i.e. if a user wants to compute Wohngeld/ALG2 without guessing wohngeld_kinderzuschl_statt_arbeitsl_geld_2, the user must run GETTSIM twice.
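Running twice could look roughly like the sketch below. `compute` is a caller-supplied stand-in for one GETTSIM run that takes the input data (a plain dict here) and returns a single transfer-income number; it is not GETTSIM's API. Comparing the two runs by transfer income is also an assumption.

```python
def run_both_ways(compute, data):
    """Run a simulation with the flag set to False and True.

    compute is a caller-supplied stand-in for a GETTSIM run: it takes the
    input data (a dict here) and returns the resulting transfer income.
    """
    results = {}
    for flag in (False, True):
        # Build new inputs so the caller's data is not modified.
        inputs = {**data, "wohngeld_kinderzuschl_statt_arbeitsl_geld_2": flag}
        results[flag] = compute(inputs)
    # Keep the setting with the higher transfer income.
    best = max(results, key=results.get)
    return best, results[best]
```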

Feel free to suggest other target names if you think something is unclear!

hmgaudecker commented 4 months ago

Thanks! A couple of reactions:

MImmesberger commented 4 months ago

eigenbedarf_gedeckt looks like an input -- that should not be necessary?

No, that's an output generated via

def eigenbedarf_gedeckt(
    arbeitsl_geld_2_regelbedarf_m: float,
    _arbeitsl_geld_2_nettoeink_ohne_transfers_m: float,
    kindergeld_zur_bedarfsdeckung_m: float,
    kind_unterh_erhalt_m: float,
    unterhaltsvors_m: float,
) -> bool:
    """Check if SGB II needs are covered.
    ...

    """
    return (
        arbeitsl_geld_2_regelbedarf_m
        <= _arbeitsl_geld_2_nettoeink_ohne_transfers_m
        + kindergeld_zur_bedarfsdeckung_m
        + kind_unterh_erhalt_m
        + unterhaltsvors_m
    )

As discussed in the other thread, I am no fan (anymore) of things like xxx_nach_vermögensprüfung. Anything that allows us to make absolute checks we should do upfront and then carry along an anspruchsberechtigt or the like.

Definitely, I should have added that here as well.

hmgaudecker commented 4 months ago

eigenbedarf_gedeckt looks like an input -- that should not be necessary?

No, that's an output generated via

Great, just wasn't obvious from the graph. I think we should make that arbeitsl_geld_2_eigenbedarf_gedeckt

hmgaudecker commented 4 months ago

Though that will be clear very soon, anyhow. So no big deal.

MImmesberger commented 4 months ago

We should offer a check whether wohngeld_kinderzuschl_statt_arbeitsl_geld_2_endogen and wohngeld_kinderzuschl_statt_arbeitsl_geld_2 line up, same for the ids. We should then alert users that results are likely wrong

Just a heads up: This would blow up the modules further. This is because the user influences the calculation of e.g. wohngeld_anspruchshöhe_wthh by setting wthh_id and wohngeld_kinderzuschl_statt_arbeitsl_geld_2. My current plan was to create them endogenously such that they are useful for calculating the endogenous bg_ids. This ensured that we only need, e.g. wohngeld_anspruchshöhe_wthh and wohngeld_anspruchshöhe_fg (and all the upstream _fg and _wthh functions).

But I agree, that this would be important and I think the additional lines are worth it.
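A sketch of what such a check might look like for the bool, using Python's `warnings` module. The function name and arguments are hypothetical. For wthh_id and bg_id a plain element-wise comparison would not be enough, because two id columns can describe the same grouping with different labels.

```python
import warnings


def warn_on_mismatch(user_values, endogenous_values, name):
    """Warn if user-supplied values differ from the endogenously created ones."""
    # Positions at which the two sequences disagree.
    mismatches = [
        i for i, (u, e) in enumerate(zip(user_values, endogenous_values)) if u != e
    ]
    if mismatches:
        warnings.warn(
            f"{name}: {len(mismatches)} value(s) differ from the endogenous "
            "result; results are likely wrong.",
            UserWarning,
            stacklevel=2,
        )
    return mismatches
```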

hmgaudecker commented 4 months ago

My current plan was to create them endogenously

What exactly does "them" refer to in this sentence?

MImmesberger commented 4 months ago

Sorry, I was referring to wthh_id and bg_id. If they are not specified by the user, I currently create them such that children who cover their needs are not in the parental BG. Then I compare this candidate to the situation where the whole FG forms the BG.

(I'm not sure yet whether I’ll do the same with wohngeld_kinderzuschl_statt_arbeitsl_geld_2 or create a new variable for this)

hmgaudecker commented 4 months ago

Sure, but we expect them to be set by the user along with wohngeld_kinderzuschl_statt_arbeitsl_geld_2, right? Maybe I don't quite get the whole structure yet. Maybe it is also not quite possible what we are trying to do and we'll just need to have a function calling compute_taxes_and_transfers multiple times to get at the correct configuration of wthh_ids and bg_ids?

MImmesberger commented 4 months ago

Yes, my point is just that the warning you were suggesting above is not possible with the current structure of #778.

If we want such a warning, we could do this via multiple compute_taxes_and_transfers calls.

hmgaudecker commented 2 months ago

Linking a comment on Unterhaltsvorschuss, which we should double-check here.