Closed cloudy-bot closed 5 years ago
Diff of SAVE DR output between single precision and alldouble run Attachment: diff.png
@peter-van-hoof-noaccount commented:
The botch was fixed in r12055.
The problem was not due to ionization convergence failure, but due to two other problems in zone 0 for iteration >=2:
- the database species were remaining trimmed as the temperature recovered from that at the rear face of the calculation (fixed at r12134)
- radius_next() needs to be called in zone 0 to update normalization values for later zones (fixed at r12135)
These changes address the problem as reported. However, quite a lot of the logic in the iteration loop in cloudy.cpp, the zone stepping in radius_first and radius_next and level trimming could do with significant further rationalization and tuning.
Milestone renamed
reported by: peter
Starting from r11933 the alldouble versions of pdr_xdr started botching on the number of zones on the trunk. Analysis showed that there is a bimodal distribution of the number of zones with the alldouble runs centered around 867 zones and the single precision runs around 896 zones with a spread of only a few zones. This difference doesn't seem to affect the physical results of this sim, which is why the botch will be fixed by placing the monitor in the middle.
However, deeper analysis did show some unwanted behavior that could point to the fact the hydrogen ionization fraction is not converged properly at the start of the last iteration and it takes up to two zones to converge this properly (despite the fact that no convergence failures were reported). This problem may or may not affect physical results in other sims, but at the very least it will waste some CPU time.
A detailed analysis was shown in emails from Peter van Hoof to cloudtests dated 2018-03-27 12:53, 2018-03-28 02:03, and 2018-03-28 15:37 (all times are CEST). This is the summary of the problem.
The sim needs to adjust the H ionization fraction at the inner face at the start of the last iteration. The single precision run needs 2 zones to stabilize that, while the alldouble run can do it in a single zone. It should have been converged already before the first zone was entered. This has knock-on effects on the stepsize and eventually leads to a dichotomy in the total number of zones. This problem does not affect physical results as Marios already showed. But it does pessimize CPU time somewhat by wasting zones. It could also point to a problem in the ionization solvers declaring convergence when it is not really reached yet, but it could also be that the solvers are being duped by the rest of the code. This may be an old problem.
We used to have in some places code looking like
if( nzone > some_small_number ) do full physics
or
if( nzone < some_small_number ) relax constraints
or something other along those lines (here "some_small_number" is a hardcoded small number, typically 2, 3, or 4). So in older versions of the code, "converging" the initial solution typically took several zones. This may still be a remnant of that, but I have not checked. We have made efforts to remove this kind of code, but I am not sure whether it all has gone by now...
I am convinced that the botch is not the result of recent changes. It looks like it is not changing results in this sim, but that is not a guarantee that it will never cause problems in other sims. We are likely wasting some amount of CPU time doing things this way. We could put the monitor in the middle (881 zones) and close the issue. I don't think I have the time to dig deeper into this and I guess nobody else has either.
Attached is a diff of the SAVE DR output of a single precision and alldouble run of pdr_xdr.in.
Migrated from https://www.nublado.org/ticket/413