NOAA-EMC / fv3atm

Other
29 stars 154 forks source link

Bug in GFS_typedefs.F90? Should Model%nto and Model%nto2 always be initialized? #870

Open climbfuji opened 6 days ago

climbfuji commented 6 days ago

See https://github.com/NOAA-EMC/fv3atm/blob/a9364591091c836984a40107729720705847c195/ccpp/data/GFS_typedefs.F90#L5181

Model%nto and Model%nto2 are only initialized if the CPP directive MULTI_GASES is set. But these variables always exist (see lines 1490-1491) and therefore remain uninitialized.

In lines 5398-5399, they are used to set up the label_dtend_tracer array if ldiag3d is true. Since ldiag3d can be true without MULTI_GASES being set, it looks like uninitialized values of nto and nto2 are passed to that routine. Unless I am not seeing it. Depending on the system you are on, this can have bad consequences. Example: on my laptop and on Nautilus (a DoD Penguin Linux system), they both always have value zero. That results in double free memory corruption errors later in the add_dtend calls. On Narwhal (a DoD Cray system), they have very large but different integer values that result in out of bounds errors. This is with NEPTUNE and not with the UFS, but if I read the code in GFS_typedefs correctly, the bug seems to be there, too?

climbfuji commented 6 days ago

@DusanJovic-NOAA @SamuelTrahanNOAA I'd appreciate if you could take a look at this. If what I am saying above is correct, then I don't understand why you are not getting these double free or out of bounds errors. Thanks!

SamuelTrahanNOAA commented 6 days ago

I've seen other problems in the past with uninitialized indices from not calling get_tracer_index.

The nto and nto2 should always be initialized to something. I'd say zero is a good value if no tracers of that type exist.

EDIT: This is incorrect. See comments that follow.

climbfuji commented 6 days ago

If you call get_tracer_index for them (as I think you should), then they get value NO_TRACERS = -99, which then, together with your label_dtend_tracer logic, means they are skipped (index + 100 --> if <2 ignore) ?

SamuelTrahanNOAA commented 5 days ago

Yes, my apologies, you are right. The correct fix is to always call get_tracer_index for those variables.