dgorissen / pycel

A library for compiling excel spreadsheets to python code & visualizing them as a graph
GNU General Public License v3.0
565 stars 147 forks source link

Evaluate Starts Returning None After Successfully Evaluating One Cell in Circular Formula Chain #126

Open JSv4 opened 3 years ago

JSv4 commented 3 years ago

What actually happened

I have a workbook with a circular formula chain. The workbook has iterative calculations disabled when received (I cannot change this). I've used Pycel's iterative calculations API as described further here to override the workbook settings.

When I create an ExcelCompiler instance and tell it to evaluate one of the cells in the circular chain, I get the expected answer. So far so good.

If I create another ExcelCompiler instance and instruct it to evaluate a different cell in the circular chain, that works as expected. Again, so far, so good.

Now, if I create a single ExcelCompiler instance and use the same instance to perform both evaluations, the first evaluate call works fine but the second evaluate call returns None instead of the expected answer (the one Excel calculates just fine and, weirdly, pycel calculate fine). If I reverse the order, I get the same thing, just in the opposite order, i.e. I can calculate E6 correctly first but then get None for E10 or I can calculate E10 properly first and then get None for E6. If I create a new compiler obj for each evaluation, however, both cells calculate properly.

What was expected to happen

I should be able to keep using the same ExcelCompiler object to evaluate any cell in the formula chain successfully or I should get an error message.

Problem description

I haven't had the time to try to trace where this is coming from, but it cannot be desired behavior to need to reload the excel sheet for each cell that needs to be evaluated. According to the example, you shouldn't need to do this. I am currently getting around this issue successfully by creating a new ExcelCompiler for each evaluation call.

Code Sample

I think this is related to the workbook I'm working on (which I cannot share, sadly). The workbook calculates fine in Excel is produced by an automated system that produces many thousands of these without issue, so I don't think there's anything wrong with it. Most workbooks from the same system do not seem to cause this problem in pycel, and, weirdly, like I said, for only a couple examples of this type of workbook, pycel can calculate any one cell fine but then returns None for subsequent evaluate calls using the same obj. If I load the problem spreadsheet in pycel from the python console and create an ExcelCompiler obj with it, I can get a solution for either of two targets cells in the formula chain using that compiler instance, but not both cells:

# I can swap these and get exactly the same behavior, just in reverse order
# i.e. I can calculate E6 correct first and get None for E10 or I can calculate 
# E10 properly first and then get None for E6. If I create a new compiler obj
# for each evaluation, both cells calculate properly

cell_addr = "E6"
cell_two_addr = "E10  

excel= ExcelCompiler(
                filename=filename,
                cycles=True
            )

print(excel.evaluate(
      f"{cell_addr}",
      iterations=1000,
      tolerance=0.001)

# Returns expected value

excel.evaluate(
      f"{cell_two_addr}",
      iterations=1000,
      tolerance=0.001)

# Returns none... BUT value expected!

I am not seeing this behavior with a very simple circular formula I constructed in a separate sheet that I had hoped to be able to share to let you duplicate this behavior, but no dice. I know this isn't a lot to go on, but it looks like this behavior is originating due to something how pycel handles certain circular formulas because the values calculate properly in Excel and, weirdly, in pycel, just only once per compiler instance.

Environment

Pycel version 1.0b27 Windows 10 Anaconda / Python 3.9

stephenrauch commented 3 years ago

@JSv4 Thank you for the report. Like you I am not managing to duplicate.

Can you describe the basic shape of the network of the circular cells? This is likely where any bugs would lie. If you were to take the failing sheet, and simplify the proprietary calculations to just a chain of cell references (=f4 * 0.5), etc, but keep the dependency order of the cells involved, you might be able to duplicate that way.

If that can be done, please serialize the sheet to yaml and post that here. Thanks again.

stephenrauch commented 3 years ago

@JSv4 Additional info that could be useful (ie: provides some more clues)

JSv4 commented 3 years ago

@stephenrauch, I can come up with a sanitized version of the current workbook. Hopefully I can put that together in the next day or two and then will post it here.