loganpknudsen / BottomBoundaryLayer


Some miscellaneous feedback #3

Open wenegrat opened 6 months ago

wenegrat commented 6 months ago

https://github.com/loganpknudsen/BottomBoundaryLayer/blob/bc0f5945d47c88e71a5ee7d95afd7557c79e332e/BBL_with_oscillations_code_GPU_check.jl#L57

This is a large velocity. You could very easily reduce this by a factor of 5 (i.e., make it 10 cm/s) and still have a realistic DWBC-type flow.

https://github.com/loganpknudsen/BottomBoundaryLayer/blob/bc0f5945d47c88e71a5ee7d95afd7557c79e332e/BBL_with_oscillations_code_GPU_check.jl#L59

Likewise, this is a very strong stratification (I assume you chose this because your velocity was so large). Even an order of magnitude smaller would be representative of a strong pycnocline, whereas a deep-ocean value would be 3 orders of magnitude smaller ($10^{-7}$).
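For concreteness, a hedged sketch of what those two suggestions might look like as parameter definitions (variable names here are hypothetical, not the script's actual names, and the values are the illustrative ones from the comments above):

```julia
# Illustrative values only, per the comments above.
const V∞ = 0.10    # interior velocity [m s⁻¹]: a realistic DWBC-type flow
const N² = 1e-7    # deep-ocean stratification [s⁻²]; a strong pycnocline would be larger
```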

https://github.com/loganpknudsen/BottomBoundaryLayer/blob/bc0f5945d47c88e71a5ee7d95afd7557c79e332e/BBL_with_oscillations_code_GPU_check.jl#L121

I would look at the output and make sure that after the first handful of timesteps the model is using the max $\Delta t$. My expectation is that it should be (since I think everything should be laminar in this setup).
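One way to check this is a progress callback that prints the current $\Delta t$ every few iterations. A hedged sketch, assuming a `simulation` already built with a `TimeStepWizard` (the function name and interval are illustrative):

```julia
using Printf
using Oceananigans

# Print iteration, model time, and the current time step so you can
# confirm Δt climbs to max_Δt after the first handful of steps.
progress(sim) = @printf("iter: %d, t: %s, Δt: %s\n",
                        sim.model.clock.iteration,
                        prettytime(sim.model.clock.time),
                        prettytime(sim.Δt))

simulation.callbacks[:progress] = Callback(progress, IterationInterval(20))
```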

https://github.com/loganpknudsen/BottomBoundaryLayer/blob/bc0f5945d47c88e71a5ee7d95afd7557c79e332e/BBL_with_oscillations_code_GPU_check.jl#L73

This line (along with the two that follow, involving sin and cos) might be a culprit for the slowdown, as I believe I recall Tomas mentioning that the GPU doesn't like trig functions.

The way I would probably debug this (making sure it runs fast at each step) is:

  1. Turn off the background fields except for linear stratification and run.
  2. Turn on a barotropic flow.
  3. Turn on the arrested Ekman solution.
  4. Turn on the oscillation fields.
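A hedged sketch of step 1, keeping only a linear-stratification background field (`grid` and `N²` are assumed to be defined, with `N²` declared `const` so the closure is GPU-safe; the other background fields stay commented out until steps 2–4):

```julia
# Step 1: linear stratification only (N² must be a const global for GPU kernels).
B_background = BackgroundField((x, y, z, t) -> N² * z)

# Steps 2–4: re-enable these one at a time, rerunning after each change.
# U_background = BackgroundField(u_interior)   # e.g. barotropic flow (hypothetical)

model = NonhydrostaticModel(; grid,
                            tracers = :b,
                            buoyancy = BuoyancyTracer(),
                            background_fields = (; b = B_background))
```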

Let me know what you find!

wenegrat commented 6 months ago

OK, probably not the sin/cos functions, since Victoria and I both use those as well.

Take a look at this link and note that you want to use const whenever possible (it doesn't appear you are doing this currently): https://github.com/CliMA/Oceananigans.jl/blob/main/docs/src/simulation_tips.md#global-variables-that-need-to-be-used-in-gpu-computations-need-to-be-defined-as-constants-or-passed-as-parameters
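A hedged sketch of the two options described at that link (the parameter names and the forcing function here are hypothetical):

```julia
# Option 1: declare globals used inside GPU kernels as const.
const f₀ = 1e-4                            # Coriolis parameter [s⁻¹] (illustrative)
u_forcing(x, y, z, t) = f₀ * sin(f₀ * t)   # closes over a const: GPU-safe

# Option 2: pass values explicitly as parameters instead of capturing globals.
u_forcing_param(x, y, z, t, p) = p.f₀ * sin(p.f₀ * t)
forcing = Forcing(u_forcing_param, parameters = (; f₀ = 1e-4))
```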

tomchor commented 6 months ago

> https://github.com/loganpknudsen/BottomBoundaryLayer/blob/bc0f5945d47c88e71a5ee7d95afd7557c79e332e/BBL_with_oscillations_code_GPU_check.jl#L121
>
> I would look at the output and make sure that after the first handful of timesteps the model is using the max Δt. My expectation is that it should be (since I think everything should be laminar in this setup).

Just a quick question here: where does the 10-second max_Δt come from? I think the max timestep should be connected to some physical time scale of the flow, no? Like 1/N (N being the stratification frequency). In which case it shouldn't be hardcoded.
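If it should indeed scale with the stratification, a hedged sketch of tying max_Δt to 1/N (the N² value and the 0.1 safety fraction are assumptions, not values from the script):

```julia
const N² = 1e-6                # stratification [s⁻²] (illustrative value)
N = sqrt(N²)                   # buoyancy frequency [s⁻¹]

# Cap Δt at a fraction of the buoyancy time scale 1/N rather than hardcoding 10 s.
wizard = TimeStepWizard(cfl = 0.7, max_Δt = 0.1 / N)
simulation.callbacks[:wizard] = Callback(wizard, IterationInterval(10))
```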

wenegrat commented 6 months ago

It probably comes from a suggestion of mine that my simulations (in the laminar phase) take timesteps of O(10 seconds). I believe @loganpknudsen has tracked this down to the use of very large interior velocities and the fact that the CFL condition now includes those in the calculation. Actually perhaps they shouldn't do so in the case of a background flow in a 'flat' dimension, but not a big deal.

tomchor commented 6 months ago

> Actually perhaps they shouldn't do so in the case of a background flow in a 'flat' dimension, but not a big deal.

Yeah I agree they shouldn't. I remember seeing some discussion about the CFL calculation a few months ago which is probably where they changed it. It might be worth raising an issue at some point since it's an easy fix.

tomchor commented 6 months ago

Also, just reinforcing what I had already mentioned to @loganpknudsen before: I think the strategy of progressively turning off things from the code to see where the slowdown is coming from is probably the way to go if you haven't yet managed to speed up the code.

tomchor commented 6 months ago

Also², @loganpknudsen: last we talked, you were going to time how long each time step was taking on average (which you can do either using Oceanostics or just letting the whole thing run and dividing by the number of timesteps). Did you manage to do it?
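The "let it run and divide" version can be sketched as follows (assuming `simulation` is already set up; this measures total wall time, so it slightly overcounts per-step cost by including setup and output):

```julia
wall_start = time_ns()
run!(simulation)
wall_seconds = (time_ns() - wall_start) / 1e9

# Average wall-clock time per time step over the whole run.
avg = wall_seconds / simulation.model.clock.iteration
@info "Average seconds per time step: $avg"
```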

For reference, my very complex headland simulations running on 100 million grid points take about 1.7 seconds per time step on an A100 GPU and about 2.5 seconds on V100s.

Related to that: like @wenegrat said, your velocity was pretty large, likely contributing to a small Δt, in which case the slowdown is "physical". What's the average Δt in your simulations? (Or I guess a better question: how does it evolve?)

loganpknudsen commented 3 months ago

I will get some answers on this once I get things running again; sorry, I just saw this. I can try switching to the A100 GPU.