Closed agzimmerman closed 2 years ago
Firedrake changed their plotting API (I think much for the better) and unfortunately broke backward compatibility.
Unfortunately, older Docker images don't appear to be archived, so my Travis CI process, which pulls the Docker image from Firedrake's Docker Hub, is currently broken.
I asked on Firedrake's Slack team about whether the Docker images are archived anywhere.
It's probably not going to be too difficult to upgrade Sapphire to the latest version of Firedrake, but it could be a larger distraction than I want right now. Anyway, I will give it a quick try, starting with installing the latest Firedrake locally and running all of the tests.
For the brine plume regression test, the nonlinear solver behavior is unchanged until after solver t = 0.023375.
Then in the new Firedrake version, a solution is found where before a solution wasn't found. In the old Firedrake version, the time step size was reduced. In the new version, the solution has a negative porosity value and therefore the simulation is halted.
Note that these are both running the current branch of this PR. Only the Firedrake environments are different.
I asked for help from Firedrake's Slack team:
Alright, I have one failing regression test, and it fails in a way that I think will be very hard for me to find the root cause of. Given the same time-dependent nonlinear problem with the same solver parameters, many time steps proceed exactly the same, until at some time the nonlinear solver's convergence behavior changes. https://github.com/geo-fluid-dynamics/sapphire/pull/75
The regression test passes locally with Firedrake that I built on 2019-05-17 and it fails in the same way that the Travis test fails when I locally run with Firedrake that I built yesterday. That's a large time window (resulting from not prioritizing this PR).
If anyone has tips on how to narrow down what might have changed about PETSc's Newton LS in the past ten months, do let me know! Maybe Patrick Farrell has some ideas.
On the other hand, it's only a regression test that's failing. All of the tests for my verified and validated simulations pass. Only my latest R&D, which isn't yet verified or validated, is broken now. So I might just have to accept the change in behavior and come up with a different test.
It's already on the salt water roadmap to also model the solidus and apply the same regularization procedure (Gaussian convolution), after which negative porosity values will no longer be mathematically possible. This means that the failing regression test will become obsolete. So I won't get too hung up on fixing this.
Matt Knepley on Firedrake's Slack said he guessed that the line search changed, and to look into this with solver options -snes_view and -snes_linesearch_monitor.
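For reference, Firedrake usually takes PETSc options through a `solver_parameters` dictionary, where the leading dash is dropped and flag-style options take the value `None`. Below is a minimal sketch of adding the two suggested options. The helper name `with_linesearch_diagnostics` and the base parameter set are hypothetical and are not Sapphire's actual settings.

```python
def with_linesearch_diagnostics(solver_parameters):
    """Return a copy of the parameters with the two diagnostic flags added.

    The input dictionary is left unmodified so the regular test
    configuration stays as it was.
    """
    diagnostics = {
        "snes_view": None,                # corresponds to -snes_view
        "snes_linesearch_monitor": None,  # corresponds to -snes_linesearch_monitor
    }
    merged = dict(solver_parameters)
    merged.update(diagnostics)
    return merged


# Hypothetical base parameters, for illustration only.
base_parameters = {
    "snes_type": "newtonls",
    "snes_monitor": None,
    "ksp_type": "preonly",
    "pc_type": "lu",
    "pc_factor_mat_solver_type": "mumps",
}

debug_parameters = with_linesearch_diagnostics(base_parameters)
```

The resulting dictionary would then be passed to the solve call, for example `solve(F == 0, w, solver_parameters=debug_parameters)`, with `F` and `w` standing in for the actual residual and solution.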
I ran the test again with both versions, using the new options. Both logs are attached here. log-firedrake20190517.txt log-firedrake20200313.txt
Here is my analysis that I shared with Prof. Knepley on Slack:
As the test simulation proceeds through solving each time step, a time comes when some minor differences in the SNES function norms and the line search reports appear. After some more time steps I see the first major change in the nonlinear solver behavior, with a different number of nonlinear iterations needed for convergence. Not long after that comes the first time in the simulation where the same time steps cannot be solved, so the two versions of the simulation begin branching. The catastrophic failure for the new version comes yet later.
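Locating the first point where the two logs disagree can be automated. The sketch below assumes the logs contain monitor lines of the form `  1 SNES Function norm 2.5e-05` and compares the printed values as strings. The sample log text is synthetic, and the function names are hypothetical.

```python
import re

# Matches lines such as "  1 SNES Function norm 2.500000000000e-05".
NORM_PATTERN = re.compile(r"^\s*(\d+) SNES Function norm (\S+)")


def snes_norms(log_text):
    """Extract (iteration, printed norm) pairs in order of appearance."""
    norms = []
    for line in log_text.splitlines():
        match = NORM_PATTERN.match(line)
        if match:
            norms.append((int(match.group(1)), match.group(2)))
    return norms


def first_divergence(old_log, new_log):
    """Return (index, old_entry, new_entry) at the first mismatch, else None.

    If one log is a strict prefix of the other, the missing entry is None.
    """
    old, new = snes_norms(old_log), snes_norms(new_log)
    for index, (old_entry, new_entry) in enumerate(zip(old, new)):
        if old_entry != new_entry:
            return index, old_entry, new_entry
    if len(old) != len(new):
        index = min(len(old), len(new))
        old_entry = old[index] if index < len(old) else None
        new_entry = new[index] if index < len(new) else None
        return index, old_entry, new_entry
    return None


# Synthetic log excerpts, for illustration only.
old_log = """\
  0 SNES Function norm 1.000000000000e-02
  1 SNES Function norm 2.500000000000e-05
  2 SNES Function norm 3.100000000000e-10
"""
new_log = """\
  0 SNES Function norm 1.000000000000e-02
  1 SNES Function norm 2.500000000001e-05
  2 SNES Function norm 3.200000000000e-10
"""

divergence = first_divergence(old_log, new_log)
```

For the real logs, the two strings would be read from the attached files (for example with `open("log-firedrake20190517.txt").read()`), and the returned index can be mapped back to a time step by counting the entries with iteration number 0 that precede it.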
The SNESLineSearchObject reports look identical, meaning the parameters are all the same. In the PC Object report, particularly the MUMPS run parameters, I see some new statements about BLR. Also now there is an "estimated compression rate of LU factors", so maybe BLR is changing something.
A bit of Googling leads me to believe that indeed something related to BLR has been changed in MUMPS between when I built these two versions of Firedrake, so maybe that's it. But the logs from both versions say "ICNTL(35) (activate BLR based factorization): 0 ", which makes me think that BLR shouldn't be doing anything.
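One way to check whether any MUMPS control parameter differs between the two builds is to pull the `ICNTL(n)` lines out of each `-snes_view` report and compare them. The sketch below assumes each parameter is printed in the form quoted above, `ICNTL(35) (description): value`. Apart from that quoted line, the sample text is synthetic, and the function names are hypothetical.

```python
import re

# Matches "ICNTL(35) (activate BLR based factorization):   0" and captures
# the parameter index and its integer value.
ICNTL_PATTERN = re.compile(r"ICNTL\((\d+)\)[^\n:]*:\s*(-?\d+)")


def icntl_settings(log_text):
    """Map each ICNTL index to the first value reported for it in the log."""
    settings = {}
    for match in ICNTL_PATTERN.finditer(log_text):
        # The view is repeated every solve, so keep only the first report.
        settings.setdefault(int(match.group(1)), int(match.group(2)))
    return settings


def changed_settings(old_log, new_log):
    """Return {index: (old_value, new_value)} for parameters that differ.

    A value of None means the parameter was not reported in that log.
    """
    old, new = icntl_settings(old_log), icntl_settings(new_log)
    return {
        index: (old.get(index), new.get(index))
        for index in sorted(set(old) | set(new))
        if old.get(index) != new.get(index)
    }


# Synthetic excerpts, for illustration only.
old_view = """\
ICNTL(14) (percentage increase in the estimated working space): 20
ICNTL(35) (activate BLR based factorization):           0
"""
new_view = """\
ICNTL(14) (percentage increase in the estimated working space): 20
ICNTL(35) (activate BLR based factorization):           0
ICNTL(36) (synthetic example of a newly reported parameter): 0
"""

differences = changed_settings(old_view, new_view)
```

An empty result for the parameters common to both logs would support the observation that the reported MUMPS settings are the same, although it would not rule out a change in MUMPS defaults or internals that the view does not print.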
So yes the behavior of the line search changed, but I do not yet see why it should have, unless it's really the BLR change.
In any case, we don't want one of our test simulations to be this fragile. I am more interested in re-doing the brine plume simulation with a regularized solidus implemented.
@agzimmerman Can you edit this comment to remove my name (pfarrell)? I am a different person than the one referenced in the pasted text from Firedrake's Slack comment and it seems GitHub autolinked me. Because of this, this PR continues to show up in my "mentioned" tab :).
Yes I changed it now :)
Also now, pfarrell, you are considered a participant, I think because of your comment. I will delete the comment (that is quoted above).
I found problems with the binary alloy formulation used on this branch and had issues with the stability of the brine plume simulation. I have a new branch with a new formulation that is still in progress. That may eventually become a new PR.
Also adds validation test for diffusive solidification of salt water