Open JanStreffing opened 3 months ago
To me this looks doable. The merge conflicts so far are trivial, the programming seems clear, with most of the code in the three new files. I set the target as 2.6 for now. We can always change that later.
I push the last code pieces into refac_recom repo. REcoM is ready for a merge. Tuning of BGC is on our todo list and will start soon.
Nice job. I tried fixing the merge conflicts. Can you check out and give this a try?
This gcc compiler that we use to the tests is quite strict and it found these:
/__w/fesom2/fesom2/src/int_recom/recom/eos.F90:803:15:
803 | CvT(i) = SGLE(gsw_ct_from_t (DBLE(SA(i)), DBLE(T(i)), DBLE(P(i))))
| 1
Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8)
/__w/fesom2/fesom2/src/int_recom/recom/eos.F90:398:20:
398 | SA(i) = SGLE(gsw_sa_from_sp(DBLE(SP(i)), DBLE(P(i)), DBLE(lon(i)), DBLE(lat(i))))
| 1
Error: Type mismatch in argument 'sp' at (1); passed REAL(16) to REAL(8)
/__w/fesom2/fesom2/src/int_recom/recom/eos.F90:402:20:
402 | SA(i) = SGLE(gsw_sa_from_sp(DBLE(SP(i)), 0.0d0, DBLE(lon(i)), DBLE(lat(i))))
| 1
Error: Type mismatch in argument 'sp' at (1); passed REAL(16) to REAL(8)
/__w/fesom2/fesom2/src/int_recom/recom/eos.F90:410:20:
410 | SA(i) = SGLE(gsw_sa_from_sp(DBLE(SP(i)), DBLE(P(i)), def_lon, def_lat))
| 1
Error: Type mismatch in argument 'sp' at (1); passed REAL(16) to REAL(8)
/__w/fesom2/fesom2/src/int_recom/recom/eos.F90:414:20:
414 | SA(i) = SGLE(gsw_sa_from_sp(DBLE(SP(i)), 0.0d0, def_lon, def_lat))
| 1
Error: Type mismatch in argument 'sp' at (1); passed REAL(16) to REAL(8)
/__w/fesom2/fesom2/src/int_recom/recom/eos.F90:701:15:
701 | CvT(i) = SGLE(gsw_ct_from_t (DBLE(SA(i)), DBLE(T(i)), DBLE(P(i))))
| 1
Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8)
/__w/fesom2/fesom2/src/int_recom/recom/eos.F90:602:13:
602 | T(i) = SGLE(gsw_t_from_ct (DBLE(SA(i)), DBLE(CvT(i)), DBLE(P(i))))
| 1
Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8)
/__w/fesom2/fesom2/src/int_recom/recom/eos.F90:197:20:
197 | SP(i) = SGLE(gsw_sp_from_sa(DBLE(SA(i)),DBLE(P(i)),DBLE(lon(i)),DBLE(lat(i))))
| 1
Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8)
/__w/fesom2/fesom2/src/int_recom/recom/eos.F90:201:20:
201 | SP(i) = SGLE(gsw_sp_from_sa(DBLE(SA(i)), 0.0d0, DBLE(lon(i)),DBLE(lat(i))))
| 1
Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8)
/__w/fesom2/fesom2/src/int_recom/recom/eos.F90:209:20:
209 | SP(i) = SGLE(gsw_sp_from_sa(DBLE(SA(i)), DBLE(P(i)), def_lon, def_lat))
| 1
Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8)
/__w/fesom2/fesom2/src/int_recom/recom/eos.F90:213:20:
213 | SP(i) = SGLE(gsw_sp_from_sa(DBLE(SA(i)), 0.0d0, def_lon, def_lat))
| 1
Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8)
/__w/fesom2/fesom2/src/int_recom/recom/eos.F90:512:13:
512 | T(i) = SGLE(gsw_t_from_ct (DBLE(SA(i)), DBLE(CvT(i)), DBLE(P(i))))
| 1
Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8)
The intel compiler would probably gloss over these with default settings. Could you have a look at these, @ogurses?
Adding @chrisdane
This is from a third party code. Let me check if there are any updates available.
This gcc compiler that we use to the tests is quite strict and it found these:
/__w/fesom2/fesom2/src/int_recom/recom/eos.F90:803:15: 803 | CvT(i) = SGLE(gsw_ct_from_t (DBLE(SA(i)), DBLE(T(i)), DBLE(P(i)))) | 1 Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8) /__w/fesom2/fesom2/src/int_recom/recom/eos.F90:398:20: 398 | SA(i) = SGLE(gsw_sa_from_sp(DBLE(SP(i)), DBLE(P(i)), DBLE(lon(i)), DBLE(lat(i)))) | 1 Error: Type mismatch in argument 'sp' at (1); passed REAL(16) to REAL(8) /__w/fesom2/fesom2/src/int_recom/recom/eos.F90:402:20: 402 | SA(i) = SGLE(gsw_sa_from_sp(DBLE(SP(i)), 0.0d0, DBLE(lon(i)), DBLE(lat(i)))) | 1 Error: Type mismatch in argument 'sp' at (1); passed REAL(16) to REAL(8) /__w/fesom2/fesom2/src/int_recom/recom/eos.F90:410:20: 410 | SA(i) = SGLE(gsw_sa_from_sp(DBLE(SP(i)), DBLE(P(i)), def_lon, def_lat)) | 1 Error: Type mismatch in argument 'sp' at (1); passed REAL(16) to REAL(8) /__w/fesom2/fesom2/src/int_recom/recom/eos.F90:414:20: 414 | SA(i) = SGLE(gsw_sa_from_sp(DBLE(SP(i)), 0.0d0, def_lon, def_lat)) | 1 Error: Type mismatch in argument 'sp' at (1); passed REAL(16) to REAL(8) /__w/fesom2/fesom2/src/int_recom/recom/eos.F90:701:15: 701 | CvT(i) = SGLE(gsw_ct_from_t (DBLE(SA(i)), DBLE(T(i)), DBLE(P(i)))) | 1 Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8) /__w/fesom2/fesom2/src/int_recom/recom/eos.F90:602:13: 602 | T(i) = SGLE(gsw_t_from_ct (DBLE(SA(i)), DBLE(CvT(i)), DBLE(P(i)))) | 1 Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8) /__w/fesom2/fesom2/src/int_recom/recom/eos.F90:197:20: 197 | SP(i) = SGLE(gsw_sp_from_sa(DBLE(SA(i)),DBLE(P(i)),DBLE(lon(i)),DBLE(lat(i)))) | 1 Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8) /__w/fesom2/fesom2/src/int_recom/recom/eos.F90:201:20: 201 | SP(i) = SGLE(gsw_sp_from_sa(DBLE(SA(i)), 0.0d0, DBLE(lon(i)),DBLE(lat(i)))) | 1 Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8) /__w/fesom2/fesom2/src/int_recom/recom/eos.F90:209:20: 209 | SP(i) = SGLE(gsw_sp_from_sa(DBLE(SA(i)), DBLE(P(i)), def_lon, def_lat)) | 1 Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8) /__w/fesom2/fesom2/src/int_recom/recom/eos.F90:213:20: 213 | SP(i) = SGLE(gsw_sp_from_sa(DBLE(SA(i)), 0.0d0, def_lon, def_lat)) | 1 Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8) /__w/fesom2/fesom2/src/int_recom/recom/eos.F90:512:13: 512 | T(i) = SGLE(gsw_t_from_ct (DBLE(SA(i)), DBLE(CvT(i)), DBLE(P(i)))) | 1 Error: Type mismatch in argument 'sa' at (1); passed REAL(16) to REAL(8)
The intel compiler would probably gloss over these with default settings. Could you have a look at these, @ogurses?
Is there any way to download merged code? @JanStreffing @patrickscholz @dsidoren
Why gcc is unhappy with the size of the variables? "singledouble.F90, gsm_mod_kinds.F90" are the files where the byte sizes are prescribed. Those are from 2 different model developments. I assume FESOM uses intrinsic ISO_FORTRAN_ENV.
From my understanding, you are trying to feed in quadruple precision variables into routines that expect their input to come as double precision. Strangely, the solution seems to be there already, you use DBLE
. So now it's complaining that your input to the DBLE
routine is not double, which does not make sense. The DBLE
routine should have interfaces to deal with all sorts of incoming data types.
@ogurses which gcc version do you have? You could try the wonderful option -fallow-argument-mismatch
Ah...sorry, I didn't read all of the preliminary comments. For downloading merged code, since it isn't merged yet, just download refac_recom
that @JanStreffing already updated with his conflict resolutions.
If it's a third-party library, you probably need to write some intermediate interface for it. Something like:
module <third_party_lib_name>_interface
implicit none
interface third_party_sgle
module procedure sgle_quad_precission
end interface
contains
....
subroutine sgle_quad_precission
.... something with type conversion ....
end subroutine
end module <third_party_lib_name>_interface
My FORTRAN is a bit rusty, but something like that. The Chatbot will be able to help 😉
@ogurses which gcc version do you have? You could try the wonderful option
-fallow-argument-mismatch
@koldunov: Can we include this flag into our automatic build tests?
@JanStreffing, see https://github.com/FESOM/fesom2/pull/583/commits/dc4d4635cb995b56ee329b941aa1163267d993b3. I think that should include it at the proper place, but I'm not 100% sure...
Mmmmpf. Well, the flag shows up, but the compiler doesn't like it. I'll look into it a bit more carefully after the lunch break.
@ogurses Would you be willing and able to come by and have a look at this with @patrickscholz and me sometime e.g. this week? Then we can figure out if this can be fixed in the code, or whether we shall change the testcase compile flag.
@JanStreffing I can come over to building H on Friday. I am still under medication and have to stay in bed due to my pneumonia at least 2 more days
The problem seems to be with the GCC version we use to do check runs with. 9.4 is late enough to check for this argument mismatch, but early enough that the flag to turn these into a warning was not yet introduced.
I wonder if we can check with a more modern version.
I believe the change has to be made in this docker container: koldunovn/fesom2_test:fesom2.1
Or the container has to be replaced with one that contains a newer gcc compiler version. I'll see if I can manage that, or if I need @koldunovn to help me out.
Changing the Docker recipe is easy, but it looks like it just builds:
It might be worthwhile to update the base image to a newer ubuntu than 20.04
I can look into both of those issues and automate building the test container with some regularity (i.e. via a GitHub actions cronjob). At the moment the container is hosted on Dockerhub on @koldunovn's account, but I had at some point figured out how to host them on GitHub as well. I'll make a peep here once it's ready.
OK, the new container now uses GCC 13:
-- The C compiler identification is GNU 13.2.0
-- The CXX compiler identification is GNU 13.2.0
Unfortunately there is still the mismatch error, even if flag mentioned above is showing up...
cd /__w/fesom2/fesom2/build/src && /usr/bin/mpifort -DMETISRANDOMSEED=35243 -DMETIS_VERSION=5 -DPARMS -DPART_WEIGHTED -DUSE_PRECISION=2 -D__3Zoo2Det -D__async_icebergs -D__coccos -D__recom -I/usr/include -I/__w/fesom2/fesom2/lib/parms/src/../include -I/__w/fesom2/fesom2/src/async_threads_cpp -I/__w/fesom2/fesom2/build/src/async_threads_cpp -g -O2 -g -ffloat-store -finit-local-zero -finline-functions -fimplicit-none -fdefault-real-8 -ffree-line-length-none -fallow-argument-mismatch -c /__w/fesom2/fesom2/src/int_recom/recom/eos.F90 -o CMakeFiles/fesom.dir/int_recom/recom/eos.F90.o
How strange. We give it the -fallow-argument-mismatch
flag but it seems to be ignored. Compiler bug?
Possibly. I'm trying to get the chatbot to write me a minimal example to narrow it down, but it's failing miserably.
@JanStreffing and I narrowed the error down to a type conversion being optimised away. I pulled apart a lot of the code in eos.f90
responsible for this, but someone should please carefully go over the diffs to make sure I did not have a typo somewhere.
We are getting closer, but now get compiler errors further along in recom:
cd /__w/fesom2/fesom2/build/src && /usr/bin/mpifort -DMETISRANDOMSEED=35243 -DMETIS_VERSION=5 -DPARMS -DPART_WEIGHTED -DUSE_PRECISION=2 -D__3Zoo2Det -D__async_icebergs -D__coccos -D__recom -I/usr/include -I/__w/fesom2/fesom2/lib/parms/src/../include -I/__w/fesom2/fesom2/src/async_threads_cpp -I/__w/fesom2/fesom2/build/src/async_threads_cpp -g -O0 -g -ffloat-store -finit-local-zero -finline-functions -fimplicit-none -fdefault-real-8 -ffree-line-length-none -fallow-argument-mismatch -c /__w/fesom2/fesom2/src/int_recom/recom/rho.F90 -o CMakeFiles/fesom.dir/int_recom/recom/rho.F90.o
/__w/fesom2/fesom2/src/int_recom/recom/rho.F90:145:21:
145 | rhow = 999.842594d0 + 6.793952e-2_r8*X &
| 1
Error: Unexpected derived-type entities in binary intrinsic numeric operator '+' at (1)
/__w/fesom2/fesom2/src/int_recom/recom/rho.F90:155:41:
155 | rho0 = rhow + A*S + B*S*SQRT(S) + C*S**2.0d0
| 1
Error: Unexpected derived-type entities in binary intrinsic numeric operator '**' at (1)
/__w/fesom2/fesom2/src/int_recom/recom/rho.F90:160:49:
160 | Ksbmw = 19652.21d0 + 148.4206d0*X - 2.327105d0*X*X &
| 1
Error: Unexpected derived-type entities in binary intrinsic numeric operator '*' at (1)
/__w/fesom2/fesom2/src/int_recom/recom/rho.F90:160:34:
160 | Ksbmw = 19652.21d0 + 148.4206d0*X - 2.327105d0*X*X &
| 1
Error: Unexpected derived-type entities in binary intrinsic numeric operator '*' at (1)
/__w/fesom2/fesom2/src/int_recom/recom/rho.F90:164:45:
164 | Ksbm0 = Ksbmw + S*( 54.6746d0 - 0.603459d0*X + 1.09987e-2_r8*X**2 &
| 1
Error: Unexpected derived-type entities in binary intrinsic numeric operator '*' at (1)
/__w/fesom2/fesom2/src/int_recom/recom/rho.F90:170:22:
170 | + P*(3.239908d0 + 1.43713e-3_r8*X + 1.16092e-4_r8*X**2 - 5.77905e-7_r8*X**3) &
| 1
Error: Unexpected derived-type entities in binary intrinsic numeric operator '+' at (1)
/__w/fesom2/fesom2/src/int_recom/recom/rho.F90:177:19:
177 | rho_DNAD = rho0/(1.0d0 - P/Ksbm)
| 1
Error: Unexpected derived-type entities in binary intrinsic numeric operator '-' at (1)
Caught it finally :-) -fdefault-real-8
implies promoting double real to 16 bytes unless you also say -fdefault-double-8
. We are left with one more error that someone from ReCOM needs to take care of. Maybe @ogurses or @chrisdane can look into it:
/__w/fesom2/fesom2/src/int_recom/recom_sinking.F90:190:9:
190 | use diff_ver_recom_expl_interface
| 1
Error: 'diff_ver_recom_expl' of module 'diff_ver_recom_expl_interface', imported at (1), is also the name of the current program unit
/__w/fesom2/fesom2/src/int_recom/recom_sinking.F90:289:9:
289 | use ver_sinking_recom_interface
| 1
Error: 'ver_sinking_recom' of module 'ver_sinking_recom_interface', imported at (1), is also the name of the current program unit
/__w/fesom2/fesom2/src/int_recom/recom_sinking.F90:446:4:
446 | if (1) then ! 3rd Order DST Sceheme with flux limiting. This code comes from old recom
| 1
Error: IF clause at (1) requires a scalar LOGICAL expression
/__w/fesom2/fesom2/src/int_recom/recom_sinking.F90:484:4:
484 | if (0) then ! simple upwind
| 1
Error: IF clause at (1) requires a scalar LOGICAL expression
/__w/fesom2/fesom2/src/int_recom/recom_sinking.F90:734:34:
734 | subroutine get_seawater_viscosity(tr_num, tracers, partit, mesh)
| 1
Warning: Name 'diff_ver_recom_expl' at (1) is an ambiguous reference to 'diff_ver_recom_expl' from current program unit
/__w/fesom2/fesom2/src/int_recom/recom_sinking.F90:734:34:
734 | subroutine get_seawater_viscosity(tr_num, tracers, partit, mesh)
| 1
Warning: Name 'ver_sinking_recom' at (1) is an ambiguous reference to 'ver_sinking_recom' from current program unit
/__w/fesom2/fesom2/src/int_recom/recom_sinking.F90:734:34:
734 | subroutine get_seawater_viscosity(tr_num, tracers, partit, mesh)
| 1
Warning: Name 'ver_sinking_recom_benthos' at (1) is an ambiguous reference to 'ver_sinking_recom_benthos' from current program unit
Great work @pgierz
OK, last pieces are finished, but there are few questions to check everything. I changed two internal recom things, and since I do not know recom from the inside, I would be happy for a double check:
There was a circular dependency in the interface part of recom_sinking
Interface missing for bio_fluxes
Additionally, wiso
code had added a case switch to tracers. There was a twice-defined case for CASE 101. I don't know what exactly happens in the tracer code, but if somewhere these code ID numbers are hard-wired to variables, I may have induced some errors. Someone with better knowledge of the insides should check this.
For now, this is just to get an overview how large the changes for RECOM are, and how far refactoring has moved from the branch to point. I would like to see if RECOM is realistic for a 2.6 merge, or will need to be a 2.7 feature. Note, that from @ogurses side, this branch is not finished or merge ready yet. This is only a draft!