mabarnes / moment_kinetics


Merge master branch into merge_fkpl_collisions #162

Closed: johnomotani closed this pull request 10 months ago

johnomotani commented 10 months ago

@mrhardman I guess the things you might want to review are:

The rest of this PR is just the commits from the latest version of master.

johnomotani commented 10 months ago

I've fixed a couple of small (and not too significant) shared-memory errors (c6b081b); added some fixes for the debug routines; and improved the support for vperp.bc settings, so that the existing 2V tests (that don't use "gausslegendre") can still run (bf8eda6).

Hope this is ready to merge now, if the CI tests pass.

mrhardman commented 10 months ago

I'll test this and report back. I think I am happy apart from the vperp bc option. My experience was that d F / d vperp = 0 has to be imposed if there is any diffusion in vperp, numerical (d^2 / d vperp^2) or otherwise. We don't have an advection term yet so we can't test that, but I would want to: 1) force the regularity condition if there is any vperp diffusion (a vperp_diffusion flag?); 2) give a warning if the regularity condition is not used; 3) tell the user not to use a Chebyshev grid for vperp, unless vperp is a constant of the motion, until d F / d vperp = 0 can be imposed with a Chebyshev grid.
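In sketch form, the logic I have in mind is something like the following (select_vperp_bc, the vperp_diffusion flag and the "zero-gradient" and discretization strings are all illustrative placeholders, not existing moment_kinetics options):

    # Illustrative sketch only; the names and option strings here are made up.
    function select_vperp_bc(requested_bc::AbstractString, vperp_diffusion::Bool,
                             vperp_discretization::AbstractString)
        # 3) d F / d vperp = 0 cannot yet be imposed on a Chebyshev vperp grid
        if vperp_diffusion && vperp_discretization == "chebyshev_pseudospectral"
            error("do not use a Chebyshev grid for vperp when there is vperp diffusion")
        end
        # 1) force the regularity condition whenever there is any vperp diffusion
        if vperp_diffusion
            return "zero-gradient"
        end
        # 2) warn if the regularity condition is not being used
        if requested_bc != "zero-gradient"
            @warn "regularity condition d F / d vperp = 0 is not being imposed on vperp"
        end
        return requested_bc
    end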

mrhardman commented 10 months ago

Now happy with the suggested changes for the vperp BC. Does this PR have all the latest commits from #149? If so, I will do some relaxation and sheath tests to make sure everything works and we can finish the merge.

johnomotani commented 10 months ago

Does this PR have all the latest commits from #149?

Yes, should do.

johnomotani commented 10 months ago

Ah ****, but I've introduced a bug in the input checking. Will have a look at that in a little bit...

mrhardman commented 10 months ago

Does this PR have all the latest commits from #149?

Yes, should do.

I cannot see https://github.com/mabarnes/moment_kinetics/pull/149/commits/6045a81c648a06b674faba11bb77a5da4bb95ba2. (EDIT: Perhaps because I am looking at the commit history online instead of the grep-able git log. EDIT 2: Indeed, using git log locally I can find the commit!)

mrhardman commented 10 months ago

From the way the test is failing I suspect that the boundary conditions are not being applied properly in the latest commits. The right options seem to be chosen in moment_kinetics_input.jl.

johnomotani commented 10 months ago

I think the test failure might be because there was a bug before, which affected the expected data. Before bf8eda6, we expected the vperp.bc setting not to be used at all. However, reconcile_element_boundaries_centered!() is called from derivative_elements_to_full_grid!(), from derivative!(), etc., and reconcile_element_boundaries_centered!() treats the end points differently depending on whether coord.bc is "periodic" or something else. Previously vperp.bc had been set to "periodic", and I think this was incorrect.

PS I found this by setting vperp.bc = missing (which required monkeying around with a couple of struct definitions), which helps find the problem because "string" == missing returns missing, and that causes an error wherever a Bool was expected. Nice Julia feature!
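For reference, a quick REPL demonstration of the behaviour being exploited (with an arbitrary string standing in for the bc setting):

    julia> "zero" == missing
    missing

    julia> if "zero" == missing
               println("would be treated as periodic")
           end
    ERROR: TypeError: non-boolean (Missing) used in boolean context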

johnomotani commented 10 months ago

I guess the bug shouldn't matter much in well-resolved examples: the derivative at both vperp=0 and vperp=L should be (approximately?) zero, so replacing each with the average of the two should change very little?
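To make that concrete, here is a minimal sketch of the end-point coupling a "periodic" setting introduces (reconcile_domain_ends! is a made-up name for illustration, not the actual reconcile_element_boundaries_centered!() implementation):

    # Illustration only: with a "periodic" coordinate the derivative values at the two
    # ends of the grid are coupled, i.e. each is replaced by the average of the two;
    # for any other bc setting they are left untouched.
    function reconcile_domain_ends!(df::AbstractVector, bc::AbstractString)
        if bc == "periodic"
            avg = 0.5 * (df[1] + df[end])   # couples the vperp = 0 and vperp = L points
            df[1] = avg
            df[end] = avg
        end
        return df
    end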

johnomotani commented 10 months ago

@mrhardman should I just update the expected data for the fokker_planck_time_evolution_tests.jl case?

mrhardman commented 10 months ago

@mrhardman should I just update the expected data for the fokker_planck_time_evolution_tests.jl case?

I don't think so yet. To fully convince ourselves that all is well we should run the example case fokker_planck_relaxation.toml to a time t = 1000 / nuii and see that the Maxwellian is stable.
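For a rough sense of the cost, a back-of-the-envelope step count (the nuii and dt values below are placeholders, not the settings in the example input file):

    # Illustrative step-count estimate for running to t = 1000 / nuii.
    nuii  = 1.0                      # normalised collision frequency (placeholder)
    dt    = 1.0e-3                   # timestep (placeholder)
    t_end = 1000 / nuii
    nstep = ceil(Int, t_end / dt)    # 1_000_000 steps for these placeholder values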

However, reconcile_element_boundaries_centered!() is called from derivative_elements_to_full_grid!(), from derivative!(), etc., and reconcile_element_boundaries_centered!() treats the end points differently depending on whether coord.bc is "periodic" or something else. Previously vperp.bc had been set to "periodic", and I think this was incorrect.

The only place these routines were used for the vperp dimension would have been in the calculation of the boundary data in the Rosenbluth potentials. It is conceivable that the mistake could have made a small difference. The only way to check is to change the boundary condition in my PR #149 and see if this breaks the test. I suspect it will not. Let me report back.

johnomotani commented 10 months ago

Ah sorry, should have added: using the enforce_vperp_boundary_condition!() implementation from before bf8eda6 (which ignores vperp.bc) and setting vperp.bc = "periodic", the fokker_planck_time_evolution_tests.jl test passes with the code otherwise at the latest version of this PR (7d5d1c5).

mrhardman commented 10 months ago

Ah sorry, should have added: using the enforce_vperp_boundary_condition!() implementation from before bf8eda6 (which ignores vperp.bc) and setting vperp.bc = "periodic", the fokker_planck_time_evolution_tests.jl test passes with the code otherwise at the latest version of this PR (7d5d1c5).

Then I am confused: I thought you were arguing that the boundary conditions were imposed correctly with vperp.bc = "zero" in the current revision, and that the error came from the application of the derivative! function? It sounds like you have localised the error to the enforce_vperp_boundary_condition!() function?

johnomotani commented 10 months ago

No, I'm saying that the updated vperp boundary condition is fine (I only reverted to the old one because the new one would throw an error saying vperp.bc = "periodic" is not supported when vperp.bc = "periodic" is set), and that setting vperp.bc = "periodic" 'passes' the test because it reproduces the behaviour of the code that the expected data was generated with. But the old behaviour when vperp.bc = "periodic" is incorrect. I think if you set vperp.bc = "zero" (or anything except "periodic") on the #149 branch, then fokker_planck_time_evolution_tests.jl would fail [EDIT: indeed it does fail, with the exact same errors (at least I checked the phi error) as on this branch].

mrhardman commented 10 months ago

Trying to make this investigation, I hit a post processing error:

julia --project -O3 run_post_processing.jl runs/fokker-planck-relaxation
  Activating project at ~/excalibur/moment_kinetics_merge_collisions
┌ Warning: No working GUI backend found for matplotlib
└ @ PyPlot ~/.julia/packages/PyPlot/2MlrT/src/init.jl:153
ERROR: LoadError: Unsupported number of dimensions: 4
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] read_distributed_zwallr_data!(var::Array{Float64, 4}, var_name::String, run_names::Tuple{String}, file_key::String, nblocks::Tuple{Int64}, nr_local::Int64, iskip::Int64, wallopt::String)
    @ moment_kinetics.post_processing ~/excalibur/moment_kinetics_merge_collisions/src/post_processing.jl:257
  [3] #10
    @ ./none:0 [inlined]
  [4] iterate
    @ ./generator.jl:47 [inlined]
  [5] collect
    @ ./array.jl:782 [inlined]
  [6] _totuple
    @ ./tuple.jl:401 [inlined]
  [7] Tuple(itr::Base.Generator{Tuple{Tuple{Array{Float64, 4}, String, Tuple{String}, String, Tuple{Int64}, Int64, Int64, String}}, moment_kinetics.post_processing.var"#10#20"{typeof(moment_kinetics.post_processing.read_distributed_zwallr_data!)}})
    @ Base ./tuple.jl:369
  [8] get_tuple_of_return_values(::Function, ::Tuple{Array{Float64, 4}}, ::Vararg{Any})
    @ moment_kinetics.post_processing ~/excalibur/moment_kinetics_merge_collisions/src/post_processing.jl:435
  [9] analyze_and_plot_data(prefix::String; run_index::Nothing)
    @ moment_kinetics.post_processing ~/excalibur/moment_kinetics_merge_collisions/src/post_processing.jl:747
 [10] analyze_and_plot_data(prefix::String)
    @ moment_kinetics.post_processing ~/excalibur/moment_kinetics_merge_collisions/src/post_processing.jl:470
 [11] top-level scope
    @ ~/excalibur/moment_kinetics_merge_collisions/run_post_processing.jl:7

johnomotani commented 10 months ago

Trying to make this investigation, I hit a post processing error:

Fixed now.

mrhardman commented 10 months ago

@mrhardman should I just update the expected data for the fokker_planck_time_evolution_tests.jl case?

I have done the checks described above and I am happy for you to revise the test with the new vperp.bc = "zero" default boundary condition. I think this should permit the merge with PR #149.

johnomotani commented 10 months ago

I've added a case with the collision operator to the automated debug checks. Hopefully this should help catch any shared-memory errors that we introduce in future!

mrhardman commented 10 months ago

I've added a case with the collision operator to the automated debug checks. Hopefully this should help catch any shared-memory errors that we introduce in future!

Nice work! This would be super useful for looking into improving the parallelisation.

It looks like this test simulates a "sheath scenario". Do we want to develop a cheap check-in test along these lines (there's an example in the discussion of PR #149)?

My experiments running the sheath simulations suggest that the "low hanging fruit" of parallelising over z r s in the collision operator loop (https://github.com/mabarnes/moment_kinetics/issues/140) would give a speedup of up to z.ngrid * r.ngrid compared to a distributed memory run with z_nelement_local = 1 and r_nelement_local = 1, depending on the number of cores available.
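To illustrate the idea with plain Base.Threads (not the moment_kinetics shared-memory looping machinery): the (is, ir, iz) points are independent in the collision operator, so they can be spread over however many workers are available. With z_nelement_local = 1 and r_nelement_local = 1 there are z.ngrid * r.ngrid spatial points per block, which is where the figure above comes from. The function below, the kernel! placeholder and the assumed (vpa, vperp, z, r, species) index ordering are all just for this sketch:

    using Base.Threads

    # Generic illustration only: spread an independent per-(is, ir, iz) update over
    # threads. kernel! is any in-place function acting on the 2V slice at one spatial
    # point; pdf arrays are assumed to be indexed (vpa, vperp, z, r, species).
    function apply_over_srz!(kernel!, pdf_out, pdf_in, n_species, nr_local, nz_local)
        @threads for idx in 0:(n_species * nr_local * nz_local - 1)
            # unflatten the combined index into (is, ir, iz)
            iz = idx % nz_local + 1
            ir = (idx ÷ nz_local) % nr_local + 1
            is = idx ÷ (nz_local * nr_local) + 1
            kernel!(view(pdf_out, :, :, iz, ir, is), view(pdf_in, :, :, iz, ir, is))
        end
        return nothing
    end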

johnomotani commented 10 months ago

It looks like this test simulates a "sheath scenario". Do we want to develop a cheap check-in test along these lines (there's an example in the discussion of PR #149)?

The input was a copy of examples/fokker-planck/fokker-planck-relaxation.toml with the resolution cut way down.

I'm always in favour of more tests; is the 1D2V simulation quick enough to run as an automated test?

mrhardman commented 10 months ago

I'm always in favour of more tests; is the 1D2V simulation quick enough to run as an automated test?

I think yes, if we can cut down the velocity resolution and still get a physical-looking solution. If we can use the same velocity resolutions as the current relaxation test, but just add a z domain, the runtime will scale roughly with the number of z points compared to the runtime of the current test. I'll run an experiment and post the plots here.
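As a back-of-the-envelope check of what that scaling would imply (both numbers below are placeholders, not measured timings):

    # Rough estimate under the "runtime scales with the number of z points" assumption.
    runtime_0d2v_test = 60.0        # seconds for the current relaxation test (placeholder)
    nz_points = 9                   # z grid points in the proposed 1D2V test (placeholder)
    estimated_1d2v_runtime = runtime_0d2v_test * nz_points   # 540 seconds if the scaling holds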

EDIT: It looks like using very reduced resolutions, comparable to those in the relaxation test, in sheath simulations leads to unstable simulations at the present commit (and probably earlier ones too). Further work required.

johnomotani commented 10 months ago

The debug checks are taking a super long time now. I've got a 'fix' for that (it at least gets them back down to about 2 hrs...), but I'll add that in a separate PR. I think this might as well merge into PR #149 now. @mrhardman if there are any last tweaks to add, we can put them on that branch.