ARA-Trans / iAM

iAM - Infrastructure Asset Management
GNU Affero General Public License v3.0

Investigate architectural refactoring of domain area: analysis #632

Open rummelsworth opened 4 years ago

rummelsworth commented 4 years ago

I found a FOSS (MIT) library that checks almost all the boxes for replacing CalculateEvaluate:

https://github.com/sklose/NCalc2

rummelsworth commented 4 years ago

A small update here, especially regarding the previous comment: NCalc was ultimately not appropriate for our use case. However, an ANTLR/LINQ-based rewrite was, and the benchmark so far shows a roughly 400x speedup over the legacy CodeDom-based CalculateEvaluate: about 120 microseconds per expression with the former versus about 50 milliseconds per expression with the latter. Extrapolating from this particular benchmark, a "cold start" analysis with 1 million equations could compile everything from scratch in about 2 minutes (1,000,000 × 120 µs = 120 s) instead of about 14 hours (1,000,000 × 50 ms = 50,000 s). Also, just to be clear, this solution checks all the above checkboxes (type support, restricted grammar, compiled to IL, custom functions, dynamic parameters).
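To illustrate the general shape of "restricted grammar, compile once, evaluate many times with dynamic parameters and custom functions", here is a minimal sketch. It is not the ANTLR/LINQ implementation: it swaps in Python's standard `ast` module and `compile` as a stand-in, and the whitelist, the function table, and the names `compile_expression`, `AREA`, and `AGE` are all assumptions made up for the example.

```python
import ast
import math

# Hypothetical whitelist: the only syntax nodes this restricted grammar accepts.
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Compare, ast.BoolOp, ast.IfExp,
    ast.Call, ast.Name, ast.Load, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow, ast.USub, ast.UAdd,
    ast.Lt, ast.LtE, ast.Gt, ast.GtE, ast.Eq, ast.NotEq, ast.And, ast.Or,
)

# Hypothetical custom functions exposed to expressions.
CUSTOM_FUNCTIONS = {"sqrt": math.sqrt, "min": min, "max": max}


def compile_expression(source):
    """Parse and validate `source` once; return a callable that evaluates it."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
        if isinstance(node, ast.Call):
            # Only direct calls to whitelisted function names are permitted.
            if not (isinstance(node.func, ast.Name)
                    and node.func.id in CUSTOM_FUNCTIONS):
                raise ValueError("only whitelisted functions may be called")
    code = compile(tree, "<expression>", "eval")  # compiled once, reused below

    def evaluate(**parameters):
        # Parameters are supplied per call; builtins are hidden from the expression.
        scope = {"__builtins__": {}, **CUSTOM_FUNCTIONS}
        return eval(code, scope, parameters)

    return evaluate


if __name__ == "__main__":
    deterioration = compile_expression("sqrt(AREA) * 2 + AGE")
    print(deterioration(AREA=16.0, AGE=3))
    print(deterioration(AREA=25.0, AGE=10))
```

The relevant design point is that the parse, validation, and compile cost is paid once per expression, so a per-expression compile time is what dominates a "cold start" with many equations.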

The primary goal of this rewrite was to clean up the CalculateEvaluate API for use within the refactored analysis module. However, in light of the unexpectedly large performance gain, Chad has asked me to spend a few hours investigating the level of effort required to replace the old CalculateEvaluate module with the rewritten one before the new analysis module is integrated.

rummelsworth commented 4 years ago

As of a couple of weeks ago, the refactored analysis code is nominally complete. Per the update given at the last iteration review meeting, I have spent the past couple of weeks on integration testing, specifically writing a configurable test program that draws analysis inputs from an existing sample database, verifies that the implementation runs to completion, and susses out any structural bugs. This is ongoing.

After it's verified to (a) run to completion and (b) produce the expected output structure, there will need to be a review meeting (with Gregg, Chad, and possibly Jake) to answer about a dozen (so far) outstanding questions on how the analysis handles certain edge cases. The answers to these questions will need to be accounted for in the implementation, with subsequent re-verification.

After the answers are accounted for, the new output results will need to be verified against expected results from the legacy analysis implementation.
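As a rough idea of what that last verification step could look like, here is a minimal sketch of a result comparison. Everything in it is an assumption for illustration: the result shape (a mapping from a hypothetical `(section_id, year)` key to a numeric value), the function name `compare_results`, and the relative tolerance are not taken from the actual analysis code.

```python
import math


def compare_results(legacy, refactored, rel_tol=1e-9):
    """Return a list of (key, reason) pairs where the two result sets disagree.

    Both arguments are assumed to map a (section_id, year) key to a number.
    """
    mismatches = []
    for key in sorted(legacy.keys() | refactored.keys()):
        if key not in refactored:
            mismatches.append((key, "missing from refactored output"))
        elif key not in legacy:
            mismatches.append((key, "not present in legacy output"))
        elif not math.isclose(legacy[key], refactored[key], rel_tol=rel_tol):
            mismatches.append(
                (key, f"legacy={legacy[key]!r} refactored={refactored[key]!r}")
            )
    return mismatches


if __name__ == "__main__":
    legacy = {("S1", 2021): 87.5, ("S1", 2022): 84.0, ("S2", 2021): 91.25}
    refactored = {("S1", 2021): 87.5, ("S1", 2022): 83.0, ("S3", 2021): 70.0}
    for key, reason in compare_results(legacy, refactored):
        print(key, reason)
```

A tolerance-based comparison is used here because floating-point results from two different implementations may not match bit for bit; whether any tolerance is acceptable is one of the things the review would need to decide.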

Per a message from @jakedw7 this morning, my work on this project is to fully stop immediately, to resume no earlier than July.