CliMA / Oceananigans.jl

🌊 Julia software for fast, friendly, flexible, ocean-flavored fluid dynamics on CPUs and GPUs
https://clima.github.io/OceananigansDocumentation/stable
MIT License

Error only while debugging in VSCode #3171

Closed sophia-wright-blue closed 1 year ago

sophia-wright-blue commented 1 year ago

hello - I'm able to run the basic example in the docs under Quick start, but when I try to run the same code in VSCode by setting breakpoints in debug mode, I get the following error:

the line: model = NonhydrostaticModel(; grid, advection=WENO())

leads to the function update_state!() in the script update_nonhydrostatic_model_state.jl

https://github.com/CliMA/Oceananigans.jl/blob/main/src/Models/NonhydrostaticModels/update_nonhydrostatic_model_state.jl

where it gets to Line 21 - foreach(mask_immersed_field!, model.tracers) after which I get the following error:

the debugger opens up this file:

[screenshot: jl_err_1]

which then throws this error:

    @ Oceananigans.Models.NonhydrostaticModels ~/Oceananigans.jl/src/Models/NonhydrostaticModels/update_nonhydrostatic_model_state.jl:21
 [20] NonhydrostaticModel(; grid::RectilinearGrid{Float64, Periodic, Periodic, Flat, Float64, Float64, Float64, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CPU}, clock::Clock{Float64}, advection::WENO{3, Float64, Nothing, Nothing, Nothing, true, Nothing, WENO{2, Float64, Nothing, Nothing, Nothing, true, Nothing, UpwindBiased{1, Float64, Nothing, Nothing, Nothing, Nothing, Centered{1, Float64, Nothing, Nothing, Nothing, Nothing}}, Centered{1, Float64, Nothing, Nothing, Nothing, Nothing}}, Centered{2, Float64, Nothing, Nothing, Nothing, Centered{1, Float64, Nothing, Nothing, Nothing, Nothing}}}, buoyancy::Nothing, coriolis::Nothing, stokes_drift::Nothing, forcing::NamedTuple{(), Tuple{}}, closure::Nothing, boundary_conditions::NamedTuple{(), Tuple{}}, tracers::Tuple{}, timestepper::Symbol, background_fields::NamedTuple{(), Tuple{}}, particles::Nothing, biogeochemistry::Nothing, velocities::Nothing, pressures::Nothing, diffusivity_fields::Nothing, pressure_solver::Nothing, immersed_boundary::Nothing, auxiliary_fields::NamedTuple{(), Tuple{}})
    @ Oceananigans.Models.NonhydrostaticModels ~/Oceananigans.jl/src/Models/NonhydrostaticModels/nonhydrostatic_model.jl:198

 [21] (::Core.var"#Type##kw")(::NamedTuple{(:grid, :advection), Tuple{RectilinearGrid{Float64, Periodic, Periodic, Flat, Float64, Float64, Float64, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CPU}, WENO{3, Float64, Nothing, Nothing, Nothing, true, Nothing, WENO{2, Float64, Nothing, Nothing, Nothing, true, Nothing, UpwindBiased{1, Float64, Nothing, Nothing, Nothing, Nothing, Centered{1, Float64, Nothing, Nothing, Nothing, Nothing}}, Centered{1, Float64, Nothing, Nothing, Nothing, Nothing}}, Centered{2, Float64, Nothing, Nothing, Nothing, Centered{1, Float64, Nothing, Nothing, Nothing, Nothing}}}}}, ::Type{NonhydrostaticModel})
    @ Oceananigans.Models.NonhydrostaticModels ~/Oceananigans.jl/src/Models/NonhydrostaticModels/nonhydrostatic_model.jl:107
 [22] top-level scope
    @ ~/ocean_ex_1.jl:10

I'm trying to step through the code to understand the code better, but I'm not able to figure out why I only get this error when I insert breakpoints.

I'd greatly appreciate any help understanding this better, thank you

navidcy commented 1 year ago

Hm… I use VS Code all the time … Can you post the output of

using Pkg; Pkg.status()

?

sophia-wright-blue commented 1 year ago

thank you for replying, here is the output from the environment that I run the code in:

julia> Pkg.status()
Status `~/.julia/environments/oc/Project.toml`
  [6e4b80f9] BenchmarkTools v1.3.2
  [336ed68f] CSV v0.10.11
⌅ [052768ef] CUDA v4.3.2
  [13f3f980] CairoMakie v0.10.6
  [8f4d0f93] Conda v1.9.0
  [f68482b8] Cthulhu v2.9.1
  [d58978e5] Dagger v0.17.0
  [a93c6f00] DataFrames v1.5.0
⌃ [2b5f629d] DiffEqBase v6.125.1
  [0c46a032] DifferentialEquations v7.8.0
⌃ [31c24e10] Distributions v0.25.96
  [587475ba] Flux v0.13.17
⌃ [cd3eb016] HTTP v1.9.6
  [0f8b85d8] JSON3 v1.13.1
  [63c18a36] KernelAbstractions v0.9.6
  [da04e1cc] MPI v0.20.11
  [eff96d63] Measurements v2.9.0
  [9e8cae18] Oceananigans v0.84.0 `~/Oceananigans.jl`
  [1dea7af3] OrdinaryDiffEq v6.53.2
  [98572fba] Parquet2 v0.2.17
  [91a5bcdd] Plots v1.38.16
⌃ [438e738f] PyCall v1.95.2
  [3646fa90] ScikitLearn v0.7.0
⌃ [9e226e20] SpeedyWeather v0.2.1
  [90137ffa] StaticArrays v1.5.26
Info Packages marked with ⌃ and ⌅ have new versions available, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated`

I'm able to set breakpoints and debug the code for all of the other Julia code that I run - I'm also able to run the quick start example when I run the julia script

I only get the error I shared above while setting breakpoints for the quick start example

I have installed Oceananigans in editable/dev mode - the debugger does refer to the installed package - I get the same error even when I install Oceananigans via Pkg.add()

thank you for helping me!

navidcy commented 1 year ago

And you said outside VS Code things seem fine? Can you post the same output there?

sophia-wright-blue commented 1 year ago

it runs as expected even when I run the script in VSCode, the output is:

[ Info: Initializing simulation...
[ Info:     ... simulation initialization complete (96.214 ms)
[ Info: Executing initial time step...
[ Info:     ... initial time step complete (4.980 seconds).
[ Info: Simulation is stopping after running for 6.189 seconds.
[ Info: Model iteration 100 equals or exceeds stop iteration 100.

I only get the error when I insert breakpoints in the quick start example and try to step through the code:

using Oceananigans

grid = RectilinearGrid(size=(128, 128), x=(0, 2π), y=(0, 2π), topology=(Periodic, Periodic, Flat))
model = NonhydrostaticModel(; grid, advection=WENO())

ϵ(x, y, z) = 2rand() - 1
set!(model, u=ϵ, v=ϵ)

simulation = Simulation(model; Δt=0.01, stop_iteration=100)
run!(simulation)

I've shared the exact line that throws the error in my original question

greatly appreciate your help!

navidcy commented 1 year ago

Oh I see. Just to make sure I understand, you are saying:

  1. outside VS code runs OK
  2. inside VS code but no debugger, also runs OK
  3. inside VS code + debugging -> problems

Right?

I have never used the debugger of VS Code I must admit... Just want to ensure that 2 is OK but 3 is not.

sophia-wright-blue commented 1 year ago

that's exactly right - 2 is ok, 3 is not

also, 3 is ok for any non-Oceananigans Julia code

using the debugger with VScode on the quick start example should reproduce the error

thank you again!

navidcy commented 1 year ago

hm... OK, I see. Let me try playing around then. Where can I find some info about VS Code's debugging mode?

But also I wanted to point out that you have a lot of unrelated packages installed in the same environment as Oceananigans (e.g., Conda, Dagger, ScikitLearn, DifferentialEquations, ...). Just to ensure that this is not some sort of incompatibility between the versions of various dependencies, can you create an empty environment, add only Oceananigans to it, try 3 again, and confirm that you get the same issue?
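
For anyone repeating this check, a minimal sketch of setting up such an environment follows. The directory name `oceananigans-debug-env` is a made-up placeholder, and the sketch assumes network access to the package registry:

```julia
using Pkg

# Create and activate a fresh project in a new directory
# ("oceananigans-debug-env" is a hypothetical name, pick any empty folder)
Pkg.activate("oceananigans-debug-env")

# Install only Oceananigans (plus whatever it depends on)
Pkg.add("Oceananigans")

# Confirm that Oceananigans is the only direct dependency listed
Pkg.status()
```

The quick start script can then be debugged with this project selected as the active Julia environment in VS Code.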

sophia-wright-blue commented 1 year ago

this source has detailed info on using the VSCode debugger with Julia:

https://www.julia-vscode.org/docs/stable/userguide/debugging/

if you use VSCode to run Julia, everything should already be set up

I'll create a new environment and check and get back

thank you again!

navidcy commented 1 year ago

> thank you again!

no worries!

sophia-wright-blue commented 1 year ago

I can confirm that I'm getting the same error in a new environment with just Oceananigans installed

looks like this error occurs on the lines of code where a mutable function is called that returns nothing

such as foreach(mask_immersed_field!, model.tracers) ,

fill_halo_regions!(merge(model.velocities, model.tracers), model.clock, fields(model))

here:

https://github.com/CliMA/Oceananigans.jl/blob/main/src/Models/NonhydrostaticModels/update_nonhydrostatic_model_state.jl

navidcy commented 1 year ago

By "mutable" function you mean one that modifies its args? But don't they all return nothing?

Seems you are getting to the bottom of it yourself... Perhaps try writing some simple code of your own then and running the debugger there? E.g.

function my_own_function!(a)
    a = 2a
    return nothing
end

a = 17

my_own_function!(a)

b = 10a

or something like that and see if you get the same error?
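
One caveat about the snippet above: `a = 2a` only rebinds the local name inside the function, so the caller's `a` is still 17 afterwards. A variant that really modifies its argument in place and returns `nothing`, which is closer to the pattern under discussion, might look like the sketch below. The name `scale_by_two!` is made up for illustration and is not part of Oceananigans:

```julia
# Hypothetical example of a function that mutates its argument
# and returns nothing, to step through with the debugger.
function scale_by_two!(a)
    a .*= 2          # in-place broadcast: modifies the caller's array
    return nothing
end

a = [17]

result = scale_by_two!(a)   # result is nothing, a is now [34]

b = 10 .* a                 # uses the mutated array
```

Setting a breakpoint on the `a .*= 2` line and stepping over the `return nothing` exercises the same "mutate, then return nothing" path in plain Julia, without KernelAbstractions involved.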

sophia-wright-blue commented 1 year ago

the debugger works fine on the code snippet you shared - it also works fine on all other Julia libraries where I step through the code by inserting breakpoints - today is the first time I've gotten this error while trying to step through the Oceananigans codebase by using the quick start example

navidcy commented 1 year ago

OK, I'll try to reproduce it myself then. Not sure if I'll find time to do that today -- I'll let you know!

Anyone else with ideas feel free to chime in.

sophia-wright-blue commented 1 year ago

just as a follow-up in case it's helpful - I spent some time on this - it's the boundary condition "nothing" that seems to make the debugger think it has encountered an exception - the function fill_halo_regions! is a good example of this - I don't know how to resolve this issue though :(

glwagner commented 1 year ago

> the debugger works fine on the code snippet you shared - it also works fine on all other Julia libraries where I step through the code by inserting breakpoints - today is the first time I've gotten this error while trying to step through the Oceananigans codebase by using the quick start example

Just a wild guess, but could it have to do with KernelAbstractions? Have you tried the debugger on other packages that use KernelAbstractions?

This smells like a problem with the debugger rather than an Oceananigans-specific issue. But maybe there is something we can change in the source code to help.

sophia-wright-blue commented 1 year ago

thank you for replying - I've never tried the debugger on any package that uses KernelAbstractions

I've opened an issue on the julia-vscode extension github repo (linked above) - I'll open an issue on the KernelAbstractions github repo to get their feedback

thank you for helping me - I'll spend some more time on this

glwagner commented 1 year ago

I guess you're hitting a problem with this function

https://github.com/CliMA/Oceananigans.jl/blob/92791a962c9096746301cdb888a3050e32c4a58b/src/BoundaryConditions/fill_halo_regions.jl#L220-L230

which calls

https://github.com/CliMA/Oceananigans.jl/blob/main/src/BoundaryConditions/fill_halo_regions_nothing.jl

there's a nothing boundary condition because the quick start grid is Flat in the vertical:

grid = RectilinearGrid(size=(128, 128), x=(0, 2π), y=(0, 2π), topology=(Periodic, Periodic, Flat))

@sophia-wright-blue you could try changing the above line to

grid = RectilinearGrid(size=(128, 128, 1), x=(0, 2π), y=(0, 2π), z=(0, 1), topology=(Periodic, Periodic, Bounded))

if you want to specifically check the nothing boundary condition situation.

sophia-wright-blue commented 1 year ago

thank you so much for sharing that @glwagner

after making the change you suggested, I'm getting the exact same error

please let me know if there's any other change I could test - thank you again

sophia-wright-blue commented 1 year ago

I ran through the code step by step after changing the grid as per your suggestion, this time, it is this line of code that threw the boundary_condition_nothing error:

https://github.com/CliMA/Oceananigans.jl/blob/92791a962c9096746301cdb888a3050e32c4a58b/src/Fields/field_tuples.jl#L73

leads to this line:

https://github.com/CliMA/Oceananigans.jl/blob/main/src/Utils/kernel_launching.jl#L97

which then throws the error

glwagner commented 1 year ago

True! You could also try calling fill_halo_regions!(model.velocities) on its own. Or you can create a tuple yourself and try:

c1 = CenterField(grid)
c2 = CenterField(grid)
tracers = (; c1, c2)
fill_halo_regions!(tracers)

and see if you get an error.

PS if you use the permalinks then the code will be displayed inline in the github issue (like in my comment above --- I changed one of yours to illustrate)

sophia-wright-blue commented 1 year ago

thank you for sharing the code snippet - I get the exact same error even when I run just the snippet you shared

please let me know if there is anything else I can try - thank you again

sophia-wright-blue commented 1 year ago

I'm not sure if this is helpful, but I manually executed this line:

https://github.com/CliMA/Oceananigans.jl/blob/92791a962c9096746301cdb888a3050e32c4a58b/src/Utils/kernel_launching.jl#L97

and then this line:

https://github.com/CliMA/Oceananigans.jl/blob/92791a962c9096746301cdb888a3050e32c4a58b/src/Simulations/run.jl#L122

throws the following error, related to KernelAbstractions , but not related to the value of boundary_conditions:

[screenshot: oc_err_2]

looks like the value nothing with functions that use KernelAbstractions seems to trip up the debugger

glwagner commented 1 year ago

Ok, can you reproduce the error by trying to loop over a simple kernel that returns nothing using KernelAbstractions?

sophia-wright-blue commented 1 year ago

sure - I was able to repro the error by using the basic example on the KernelAbstractions quick start page:

https://juliagpu.github.io/KernelAbstractions.jl/dev/quickstart/


using KernelAbstractions

@kernel function mul2_kernel(A)
    I = @index(Global)
    A[I] = 2 * A[I]
end

dev = CPU()
A = ones(1024, 1024)
ev = mul2_kernel(dev, 64)(A, ndrange=size(A))
synchronize(dev)
all(A .== 2.0)

this is the error

[screenshot: ka_err]

the author of the KernelAbstractions library has responded to this issue here and thinks it's the fault of the debugger:

https://github.com/JuliaGPU/KernelAbstractions.jl/issues/405

I'm not sure if that's helpful, but I have only experienced this issue with the KernelAbstractions library and not any other library

I've also opened an issue on the julia-vscode extension repo, but have not received a response yet:

https://github.com/julia-vscode/julia-vscode/issues/3349

please let me know if there's anything else I can try - thank you again for helping me with this issue

glwagner commented 1 year ago

Yes, it seems there is some bad interaction between KernelAbstractions and the debugger. It seems likely that fixing it will require some development of the debugger (but possibly also KernelAbstractions). Once that's resolved, I think it's likely the debugger will work with Oceananigans.

Hopefully you can still have a productive development workflow in the meantime! Let us know if there is any way we can help. I think the bright side of this situation is that Julia has a lot of powerful built-in features for code introspection and interactive usage, a lot of which overlap with the features provided by a traditional debugger (as I understand it). Are you planning to contribute to Oceananigans development or just to interact with Oceananigans as a user?
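
As a rough illustration of the kind of built-in introspection mentioned above (a sketch only, not a substitute for a working debugger), the standard-library `InteractiveUtils` macros can be applied to any function. Here `double!` is a made-up stand-in; the same calls can be pointed at Oceananigans functions from the REPL:

```julia
using InteractiveUtils  # loaded automatically in the REPL; needed in scripts

# Hypothetical stand-in for a mutating function that returns nothing
function double!(a)
    a .*= 2
    return nothing
end

x = [1.0, 2.0]

# Which method is dispatched to for these argument types, and where is it defined?
m = @which double!(x)
println(m)

# List every method defined for the function
println(methods(double!))

# Look at the lowered code without stepping through it
lowered = @code_lowered double!(x)
println(lowered)

# Print intermediate values while running the normal compiled code
double!(x)
@show x
```

Because these tools run the compiled code rather than interpreting it, they avoid the interpreter path that the debugger relies on.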

sophia-wright-blue commented 1 year ago

my eventual objective is to understand enough about the models used in Oceananigans and the package to be able to contribute to Oceananigans development :)

glwagner commented 1 year ago

Great! You're already well on your way!

navidcy commented 1 year ago

@sophia-wright-blue is this still an outstanding issue?

sophia-wright-blue commented 1 year ago

Hi @navidcy - it is still outstanding, but there's been some progress lately:

https://github.com/JuliaDebug/JuliaInterpreter.jl/issues/574#issuecomment-1674729615

https://github.com/julia-vscode/julia-vscode/issues/3349#event-10200480814

I will update this thread as soon as it's resolved - hope that sounds good - thank you

navidcy commented 1 year ago

Sounds great! I was just wondering -- no rush!

sophia-wright-blue commented 1 year ago

Hi @navidcy - you have great timing - it looks like the most recent release of the julia-vscode extension fixed the issue - I have tested it on a basic Oceananigans script and the debugger seems to be working now

the core issue here was with JuliaInterpreter.jl which was fixed and updated in the julia-vscode extension, which then fixed the issue

thank you so much for your help and patience!

sophia-wright-blue commented 1 year ago

I'm closing this issue with this note in case it helps anyone:

using the julia-vscode debugger is pretty slow - I'm not sure if this can be resolved in the near term

someone was kind enough to release this repo to debug Pluto notebooks with vscode in case it helps:

https://github.com/disberd/PlutoVSCodeDebugger.jl