Fplan 32 unit tests #34

cmovic commented 5 months ago

A test framework strawman using pytest. The tests are just dummies so comments on the framework are welcome. If positive, I'll flesh out the tests.

Also I wasn't sure if I wanted to test loads of the example files or keep some separate for ease of maintenance. For now there's one of each and some notes about pros and cons.

cmovic commented 5 months ago

These changes are for issue #32 - I don't know how to link them (or if I should).

wscott commented 5 months ago

This looks useful.

So I was thinking of using some focused examples like this one:

returns = 0
inflation = 0
startage = 30
endage = 80

[taxes]
taxrates = [[0, 20]]
stded = 100_000            # standard deduction
cg_tax = 0 # doesn't work

[aftertax]
bal = 500_000
basis = 500_000

[roth]
bal = 9_000_000
contributions = [[30, 9_000_000]]

Where we predict the answer will be exactly 100k spending every year and then verify that happens.

And others where we can verify the tax calculations.

cmovic commented 5 months ago

Thanks for the feedback. I'll flesh out the load_file tests, then try creating a solver test based on the above inputs.

cmovic commented 5 months ago

While building a solve() test I ran into a snag where globals declared in the test code are not visible in the fplan functions being tested. I think the best way forward is to refactor the globals as follows:

Move these constants into the Data class:

    global vper, n0, n1
    vper = 4        # variables per year (savings, ira, roth, ira2roth)
    n1 = 2          # before-retire years start here
    n0 = n1+S.workyr*vper   # post-retirement years start here

Then, pass the configuration data S to solve(), print_ascii, and print_csv.

I'll try to make this refactoring as minimally invasive as possible.
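For context on the snag: a Python global belongs to the module that defines it, so names assigned in a test module are not visible inside functions defined in fplan. Below is a minimal sketch of the refactoring described above. It assumes a Data class that takes workyr directly; the constructor signature and the solve() body are hypothetical stand-ins, not fplan's actual code.

```python
class Data:
    """Holds configuration plus the layout constants that used to be globals."""

    def __init__(self, workyr):
        self.workyr = workyr  # number of working years (assumed input)
        self.vper = 4         # variables per year (savings, ira, roth, ira2roth)
        self.n1 = 2           # before-retire years start here
        # post-retirement years start here
        self.n0 = self.n1 + self.workyr * self.vper


def solve(S):
    """Hypothetical stand-in that reads the layout from S, not module globals."""
    # A real solver would build and solve the LP here; this only shows that
    # the indices now travel with the configuration object.
    return {"first_work_index": S.n1, "first_retire_index": S.n0}


S = Data(workyr=5)
layout = solve(S)
```

With this shape a test can build its own Data instance and hand it to solve() without touching any module-level state.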

cmovic commented 5 months ago

Solver tests added. I'm really curious if others get the same results as I do. My expectation is that we'll see small numerical differences from different versions of scipy.

cmovic commented 5 months ago

Agreed. I’m amazed this works at all. I have noticed that Python is far less sensitive to this than C or C#, but that’s no excuse for doing exact floating point compares.

I’ll search for some off-the-shelf solutions and/or code up a comparison function. FWIW rounding doesn’t work because you can still have 1-bit differences that round up or down. As long as we don’t have any huge numbers (astronomically huge; even $1B is manageable), some flavor of “fabs(a - b) < epsilon” will suffice.

On Saturday, February 3, 2024, Wayne Scott @.***> wrote:

In test/fplan/test_load_file/test_load_file.py https://github.com/wscott/fplan/pull/34#discussion_r1477078611:

    assert cfg.worktax == 1.25
    assert cfg.retireage == 65
    assert cfg.numyr == 35
    assert cfg.aftertax['bal'] == 212000
    assert cfg.aftertax['basis'] == 115000
    assert cfg.IRA['bal'] == 420000
    assert cfg.IRA['maxcontrib'] == 18000
    assert cfg.roth['bal'] == 50000
    assert cfg.roth['maxcontrib'] == 11000
    assert cfg.roth['contributions'] == [[54, 20000], [55, 20000]]
    # TODO Determine if this is fragile on other python versions. FP compares like this give me the willies.
    assert cfg.income == [0, 0, 0, 0, 0, 47802.892452600514, 48806.75319410512, 49831.69501118133, 50878.16060641613, 51946.60197915086, 53037.480620713024, 54151.26771374799, 55288.4443357367, 56449.50166678716, 57634.94120178968, 58845.27496702727, 60081.02574133483, 61342.72728190286, 62630.924554822814, 63946.17397047408, 65289.043623854035, 66660.11353995497, 68059.975924294, 69489.23541870418, 70948.50936249696, 72438.42805910938, 73959.63504835068, 75512.78738436603, 77098.55591943773, 78717.6255937459, 80370.69573121457, 82058.48034157006, 83781.70842874303, 85541.1243057466, 87337.48791616729]

I can't imagine this will work. Even switching compiler knobs on x86 will break this, even with the same Python version. I would suggest rounding all the numbers to the nearest 100 and then comparing.

cmovic commented 5 months ago

Looks like “math.isclose” is our friend here. There’s a numpy version too (numpy.isclose).
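A short sketch of how math.isclose behaves, using only the standard library. The values are made up for illustration. One thing worth noting for the income list above, which contains zeros: the default tolerance is purely relative (rel_tol=1e-09, abs_tol=0.0), so comparisons against 0 need an explicit abs_tol.

```python
import math

a = 0.1 + 0.2
b = 0.3

# Exact comparison fails because of binary floating point representation.
exact_equal = (a == b)

# The default relative tolerance absorbs the tiny difference.
close_default = math.isclose(a, b)

# Relative tolerance scales with magnitude, so it is useless near zero...
near_zero_default = math.isclose(0.0, 1e-12)

# ...unless an absolute tolerance is supplied.
near_zero_abs = math.isclose(0.0, 1e-12, abs_tol=1e-9)
```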

wscott commented 4 months ago

FWIW rounding doesn’t work because you can still have 1 bit differences that round up or down.

That just isn't true. A single-bit error can mean the difference between 3.00000001 & 2.99999999, but both would round to the same number.

The isclose() approach is fine as long as the epsilon is way above the error from the solver. Most of these values are money and just getting within $1 is perfectly fine.
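As a rough illustration of the "within $1" idea, here is a sketch of a list comparison built on math.isclose. The helper name money_lists_close and the sample values are hypothetical, and the $1 default is simply the tolerance suggested above, not a value taken from the fplan tests.

```python
import math


def money_lists_close(actual, expected, tol=1.0):
    """Return True if both lists have equal length and agree within tol dollars."""
    if len(actual) != len(expected):
        return False
    return all(math.isclose(a, e, abs_tol=tol) for a, e in zip(actual, expected))


expected = [0, 0, 47802.892452600514, 48806.75319410512]
within_a_dollar = [0, 0, 47802.89, 48806.75]   # differs only by fractions of a cent
off_by_two = [0, 0, 47804.90, 48806.75]        # first nonzero entry is about $2 off

ok = money_lists_close(within_a_dollar, expected)
bad = money_lists_close(off_by_two, expected)
```

Because abs_tol is given, the zero entries compare cleanly as well.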

Also, it is straightforward to generate an input where spending for some years is unconstrained and the answers can vary a lot. For example, it may be that the years right after retirement or before social security set the max spending, so later years have lots of possible solutions. For these, the solution can move around.

cmovic commented 4 months ago

Hmm. I wasn't done with my changes. I'll create another fork and finish up.

As for rounding, a single bit can still cause differences. Your example of 3.00000001 and 2.99999999 both round to 3.0 at the nearest unit, but 2.500000001 and 2.499999999 round to 3 and 2 respectively. I think an edge case like this exists for any floating point rounding.
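Both positions in this exchange can be checked directly with Python's built-in round(). The sketch below uses made-up values: two numbers that straddle a rounding boundary land on different results, at unit precision and at the nearest 100 alike, while a tolerance-based comparison treats them as equal. Values far from a boundary, like the 3.00000001 example, do round the same way.

```python
import math

# Two values that differ by a tiny amount but straddle the .5 boundary.
lo, hi = 2.499999999, 2.500000001
rounded_lo = round(lo)
rounded_hi = round(hi)

# The same effect at "nearest 100": the boundary just moves to 150.
lo100, hi100 = 149.99999999, 150.00000001
r_lo100 = round(lo100, -2)
r_hi100 = round(hi100, -2)

# A tolerance comparison has no such boundary between the two inputs.
close = math.isclose(lo100, hi100, abs_tol=1.0)

# Away from a boundary, rounding does agree.
same_far_from_boundary = round(3.00000001) == round(2.99999999)
```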
