Closed anilyil closed 2 years ago
Do any of you have a simple test we can include for the embedding test @marcomangano @sseraj ? Just a small FFD will be fine, we can manually add a handful of coordinates that would have been problematic with the original implementation
@anilyil I don't have a small example but I am pretty sure I could try and reproduce the previous issue I fixed at the pyGeo level in this PR. There are a few changes in the code that it is probably quicker to discuss offline, do you and @sseraj have time later today or early next week? It looks we'll also need to retrain the tests for both pySpline and pyGeo. This could also be a good opportunity to bring up again regression tests on a full optimization problem, including checking the final result and not just the initialization as we are doing now.
I am fine with adding a catch that changes it to backtracking if quadratic approx fails (like step size less than 0.1)? I am surprised changing the exit condition is not what broke the tests. I was expecting the linesearch to be inactive for most of the cases we are doing. Regardless of the tests getting broken or not, real world applications will change if we change the tolerance check.
I do not know much about this issue and both @marcomangano and @sseraj have experience with it, so I am going to remove my review. Let me know if you end up wanting a third review later on -- I can add myself back.
Merging #47 (16c4eb1) into main (7f0ff6d) will decrease coverage by
7.98%
. The diff coverage is66.66%
.
@@ Coverage Diff @@
## main #47 +/- ##
==========================================
- Coverage 60.51% 52.53% -7.99%
==========================================
Files 5 5
Lines 1616 1616
==========================================
- Hits 978 849 -129
- Misses 638 767 +129
Impacted Files | Coverage Δ | |
---|---|---|
pyspline/pyVolume.py | 40.80% <ø> (-27.28%) |
:arrow_down: |
pyspline/utils.py | 48.36% <66.66%> (ø) |
:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more
This turned out to be a much bigger headache than I thought... The original solver had a bug that is baked into all of our tests. See this termination criteria here: https://github.com/mdolab/pyspline/blob/7f0ff6d993eb7b5c0299cc169e70f6d360aaa68a/src/projections.F90#L565-L569
The code checks if the current iteration has differed by eps from the previous iteration. If the detected change is less than eps, the code keeps the previous point and exits. This was originally done to detect points that are outside the volume and avoid running all the way until max iterations are reached.
However, there are several cases where a situation like this happens: Say the eps value is set to 1e-10. The current point is around 1.1e-10. The next iteration the solver takes puts the point right inside eps; however, because the previous point was already very close, this update is very small as well, more importantly, below eps. So the code thinks this point got stuck and is not improving, and keeps the original point at 1.1e-10 convergence. The correct solution is to take the new point since it satisfies the tolerance, whereas the old one does not. The new algorithm avoids this check. I have actually revised the exit criteria quite a bit; I am also aiming to detect points that are outside the volume and terminate if we are not making progress towards the point by checking the projected gradient.
The downside of this fix is that we now have 2 tests that fail in ADflow. The mismatch is tiny; the test test_adjoint.py:TestAdjoint_3_rans_tut_wing.test_adjoint
results in:
Mismatched elements: 1 / 72 (1.39%)
Max absolute difference: 2.48547281e-08
Max relative difference: 1.37676067e-10
and the test test_adjoint.py:TestAdjoint_1_euler_scalar_JST_tut_wing.test_adjoint
results in:
Mismatched elements: 2 / 72 (2.78%)
Max absolute difference: 1.72756245e-08
Max relative difference: 3.53686769e-10
These mismatches might be due to me running this on an ARM machine, but the tests do pass when I have pyspline/main compiled. The solution to this will be re-training these two adflow tests.
Another big change that was needed here was the volume test in pyspline. Previously, it was using a least squares fit based volume using data points. The data points when plotted in 3D looks like this:
However, the fit volume results in this hot mess:
Our projection and evaluation tests were trained with this, which was essentially just making sure the code fails exactly the same way and any change would cause them to fail.
To address this issue, I have modified the test to use the "data" array as control points while creating the b-spline volume. I still loop over a range of b-spline orders in 3 directions, but the volume itself is created directly by passing control points. Then the test evaluates a range of points inside the volume and runs the projection tests with these. I also test points that are outside the domain to check if the solver can still find the nearest point in the domain.
Now that I modified the test, some methods related to getting information from the underlying data used to fit a volume is not tested. I have created an issue #49 about the volume interpolation and ideally, we need to add a new test that actually does interpolate correctly to address this. I don't have time to fix this now. I do not know if anyone is using this feature.
Finally, I am adding a challenging FFD test to pyGeo in https://github.com/mdolab/pygeo/pull/157. The FFD is based on @sseraj's cases. The points added to the FFD does not embed properly without the solver fixes in this PR, and they do embed correctly with the fixes here.
To dos:
val_diff
checkThis is ready for the second round. The bounds check works as expected for the tests I have tried so far.
Purpose
The projection routine used to embed points in volumes had 2 shortcomings:
The changes has the potential to slightly change all cases that use FFDs. How this would show up is that due to the line search change, the code might take a slightly different convergence path, or due to the tolerance change, the code might do an extra iteration. The changes should be on the order 1e-10 or less, but it is still important to note. Within a case, it does not affect derivative accuracy, but comparing against old cases, it will affect the recorded functions (e.g. our regression tests).
Expected time until merged: About a week or two.
Type of change
Testing
Checklist
flake8
andblack
to make sure the Python code adheres to PEP-8 and is consistently formattedfprettify
or C/C++ code withclang-format
as applicable