Open msl3v opened 1 year ago
Hi @msl3v @RolfSander @JimmieLin: Am wondering if the best thing is to pass a vector of logicals where the prod/loss species would be true and everything else false. We could initialize that outside of the loop over grid boxes.
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
KPP_REAL FUNCTION ros_ErrorNorm ( Y, Ynew, Yerr, IsDummySpc, &
AbsTol, RelTol, VectorTol )
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
!~~~> Computes the "scaled norm" of the error vector Yerr
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IMPLICIT NONE
! Input arguments
KPP_REAL, INTENT(IN) :: Y(N), Ynew(N), &
Yerr(N), IsDummySpc(N), AbsTol(N), RelTol(N)
LOGICAL, INTENT(IN) :: VectorTol
! Local variables
KPP_REAL :: Err, Scale, Ymax
INTEGER :: i
KPP_REAL, PARAMETER :: ZERO = 0.0_dp
Err = ZERO
DO i=1,N
IF ( isDummySpc(N) ) CYCLE
Ymax = MAX(ABS(Y(i)),ABS(Ynew(i)))
IF (VectorTol) THEN
Scale = AbsTol(i)+RelTol(i)*Ymax
ELSE
Scale = AbsTol(1)+RelTol(1)*Ymax
END IF
Err = Err+(Yerr(i)/Scale)**2
END DO
Err = SQRT(Err/N)
ros_ErrorNorm = MAX(Err,1.0d-10)
END FUNCTION ros_ErrorNorm
I think we'd need to preprocess an index of actual species. Note at the end of the ros_ErrorNorm() function, that Err is divided by N
Thanks @msl3v. I think the number of prod/loss species are given by NFam, right? Then it would be divided by N-Nfam
@yantosca I think you're correct! Alright. This should be an easy fix. Where should it go? Dev? Or would it be better to create a new branch and create a PR?
I think a new branch would be best. Then you have time to test it independently while other things may happen in the dev branch.
Maybe you can use a Fortran90 mask and a WHERE block for this. I haven't tested it, but maybe it could look like this:
LOGICAL :: notfam(N)
KPP_REAL :: Err(N)
WHERE (notfam)
Err = (Yerr/Scale)**2
END WHERE
Err = SUM(Err, notfam) / (N-Nfam)
Here, the mask notfam
is the vector of logicals where the prod/loss
species would be false.
So, whether it's a logical vector or integer vector, it makes sense to pre-process it into the code rather than have KPP build it each call. My inclination is to pre-process a separate vector of species index values and have the function loop over that. It will avoid unnecessary branching. There would be no need for an IF() statement. While I'm not really sure that this would be more efficient, ErrorNorm gets called every internal time step. e.g. declare it it INDX(N-NFam)
DO i=1,N-NFam
Ymax = MAX(ABS(Y(INDX(i))),ABS(Ynew(INDX(i))))
IF (VectorTol) THEN
Scale = AbsTol(INDX(i))+RelTol(INDX(i))*Ymax
ELSE
Scale = AbsTol(1)+RelTol(1)*Ymax
END IF
Err = Err+(Yerr(INDX(i))/Scale)**2
END DO
Or is this unnecessary complexity for marginal improvement? Thoughts appreciated.
Created a new branch, FamilyIndexTesting. As usual, bad with naming things. Made a commit, 7d9c53787ea33f228d5e7f81a20aaa8995af63a9 What's completed is the generation of the vector of non-prod/loss species index values, its declaration and inclusion in *_Parameters.F90
Please have a look @yantosca @RolfSander @jimmielin
I tested this by manually changing the ros_ErrorNorm() function in a boxmodel I use, and the number of internal steps decreased from 19 to 17 for the given GEOS-Chem scenario I tested. So, an improvement. Some of the species (mainly the super-stiff iodine radicals) changed some, but that's not surprising. Nothing glaring. Still converged. Results still made sense.
There's an alternative solution to this issue here: that family terms are weighing down the error norm.
We were looking at the error in some detail, and realized that the reason the relative error for the family terms is large is because they are initialized to zero.
A kludgy fix would/could simply be to initialize all P/L terms to a large value, and just use finite difference to get their true value after the integration is complete.
The benefit would be that we minimize additional complexity within KPP. <-- I feel this is not unimportant. As KPP is a community project now, additional complexity could hinder development and/or increase the likelihood of bugs.
It also sets a precedent that says "keep piling things on" when the benefit may be marginal and even localized to only a few users' cases.
I also don't like making changes that touch all of the integrator template files. Though in the case here, this could be resolved if, say, the ErrorNorm() functions were put in a pre-processed module file like Util. or LinearAlgebra.
This is more of a philosophical comment. But I hope for some feedback.
I'm not very familiar^ with the family code, so I restrict myself to a philosophical comment.
I'm a big fan of piling lots of new features onto KPP, as long as they are not switched on by default.
Often, it is fine to add a few IF (newfeature) THEN
blocks to the
code. If there are too many IF
blocks that slow down the code, it may
be necessary to deactivate the new features via CPP preprocessor
directives. Then the new code can be excluded from the compiled
executable.
^ no pun intended :-)
Using CPP to toggle the error norm calculation makes sense.
OK. If piling-it-on is the path, here's three approaches to this issue: 1) Do as done already. Create a new vector and modify the error norm calcs in the various integrator files. Possibly bounded by a CPP tag. The vector could be logical or integer. 2) Create a "#PAD" toggle, that pads any prod/loss or dummy species by a sufficiently large value to minimize the relative error for that species. 3) Simply set a high ABSTOL for the prod/loss or dummy species that minimizes the relative error for that species, and use vector tolerances. Similar to (2), uses a toggle in the *.kpp file.
Hi all. I think it may be cleaner to use the toggle in the *.kpp file, as then you wouldn't have to burden users with having to compile in a C-preprocessor switch setting.
Wrote a simple program to test IF() statement overhead vs. the cost of FLOPs in computation of the error norm (see below). It executes two do-loops: in the first, a conditional is applied. If .true., then perform the error norm calc. Otherwise, nothing. The second loop simply computes the full vector regardless.
The IF() loop tests a random vector between 0 and 1.
IF (arr(j) .gt. X) then ... compute.
With gfortran (no optimization, i.e. no "-O" flag), it suggests that above X = 0.6, there is a speed advantage in using an IF() statement. Otherwise, it is considerably faster to just compute all terms.
With the "-O" flag, there is no difference at any of the range of tested X values (0.05 thru 0.95). So compiler optimization is really pretty effective.
Anyway - carry on.
implicit none
integer :: itr, i, j, nitr
integer, parameter :: arrsize = 500
real :: tstart, tend, iftime, str8time
real :: arr(arrsize), atol(arrsize), rtol(arrsize), c(arrsize), cnew(arrsize)
call RANDOM_NUMBER(arr)
call RANDOM_NUMBER(atol)
call RANDOM_NUMBER(rtol)
call RANDOM_NUMBER(c)
! Number of iterations to perform. Larger value increases the runtime
nitr = 100000
! Go ahead and run a loop just to prime the memory
do i=1,nitr
do j=1,size(arr)
if (arr(j) .gt. 0.25) then
cnew(j) = atol(j)+rtol(j)*c(j)
endif
enddo
enddo
! Run the loop with the IF statement
call cpu_time(tstart)
do i=1,nitr
do j=1,size(arr)
if (arr(j) .gt. 0.5) then
cnew(j) = atol(j)+rtol(j)*c(j)
endif
enddo
enddo
call cpu_time(tend)
iftime = tend-tstart ! Time spent in the IF loop
write(*,*) 'IF time = ', iftime
! Run the loop just str8 throught the vector ops
call cpu_time(tstart)
do i=1,nitr
do j=1,size(arr)
cnew(j) = atol(j)+rtol(j)*c(j)
enddo
enddo
call cpu_time(tend)
str8time = tend-tstart ! Time spent just plowing straight through the vectors
write(*,*) 'STR8 time = ', str8time
write(*,*) 'STR8/IF = ', str8time/iftime
end program main
Hello Mike,
Thanks for your code development activities! I'm sorry that I'm currently busy with other projects and haven't been able to contribute much here. However, if you have any questions specifically for me, let me know!
It's nice to see that IF
statements don't slow down the optimized code.
That's quite reassuring, especially since we have added a couple of
IF
statements deep inside the loops for Do_Update*
IF ( Do_Update_SUN ) CALL Update_SUN()
IF ( Do_Update_RCONST ) CALL Update_RCONST()
@RolfSander @msl3v IF statements are pretty cheap, if they don't have an ELSE. For this reason I've started refactoring IF/ELSE blocks in my other projects to
A = .FALSE.
IF ( B ) A =.TRUE.
as the assignment an IF are likely cheaper than an IF/ELSE. The issue is that the code has to "guess" which branch is being taken (the IF or the ELSE) at the chip level and if it guesses wrong it has to dump and reload cache instructions and data.
Interesting, thanks for the explanation, @yantosca!
Every day is a school day...
@msl3v @RolfSander @obin1: Are there any updates on this feature? Would this be ready to include in the dev branch?
I only developed it as far as a successful test of concept. Not sure the best way to proceed. Using the auto-reduce boxmodel wouldn't work since the C() and RCONST() vals are hard-coded by reaction index.
Is this something you/GCST/GCSC want to see done?
On 4/14/23 10:29, Bob Yantosca wrote:
@msl3v https://github.com/msl3v @RolfSander https://github.com/RolfSander @obin1 https://github.com/obin1: Are there any updates on this feature? Would this be ready to include in the dev branch?
— Reply to this email directly, view it on GitHub https://github.com/KineticPreProcessor/KPP/issues/66#issuecomment-1508651975, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABVE237TTEPSXPCZNHJG7ELXBFNOLANCNFSM6AAAAAASE5AO3Y. You are receiving this because you were mentioned.Message ID: @.***>
@msl3v: only as much as we don't want family species messing around with the integration. Not really a priority though.
Here's a visualization from a project @msl3v and I've been working on: giving prod/loss species arbitrarily large absolute tolerances of 1e25 (effectively removing them from the error norm) reduces the number of integration steps. This can also slightly alter the concentrations of species whose tolerances we don't modify, by changing the internal timesteps.
Interesting find @obin1! That might be the best workaround. When I have time later I might open a feature request on the GEOS-Chem repo so that we initialize P/L species to 1e25.
I just noticed that https://github.com/geoschem/geos-chem/pull/1334 is put on pause because of the P/L species being considered in the error norm. I'm linking to it here so we can see these issues are related on GitHub.
@obin1 @RolfSander @jimmielin @msl3v This has been identified as a high-priority item to fix at the IGC11 meeting.
Not the most elegant fix for KPP in general but the quickest fix for GEOS-Chem regarding this issue could be in the scope of GeosCore/fullchem_mod.F90
, e.g. this commit from December 2023 in an older dev version of GEOS-CF. We were thinking about just making the ATOL of this small set of P/L species arbitrarily large, e.g.
LOGICAL :: relax_rtol, relax_atol_PL ! relax KPP tols
...
! Absolute tolerance
ATOL = 1e-2_dp
relax_atol_PL = .TRUE.
IF (relax_atol_PL) THEN
IF (Input_Opt%amIRoot) write(*,*) "Setting ATOL of P/L dummy species to 1e25"
ATOL(ind_LBRO2H) = 1e25_dp
ATOL(ind_LBRO2N) = 1e25_dp
ATOL(ind_LCH4) = 1e25_dp
ATOL(ind_LCO) = 1e25_dp
ATOL(ind_LISOPNO3) = 1e25_dp
ATOL(ind_LISOPOH) = 1e25_dp
ATOL(ind_LNRO2H) = 1e25_dp
ATOL(ind_LNRO2N) = 1e25_dp
ATOL(ind_LOx) = 1e25_dp
ATOL(ind_LTRO2H) = 1e25_dp
ATOL(ind_LTRO2N) = 1e25_dp
ATOL(ind_LXRO2H) = 1e25_dp
ATOL(ind_LXRO2N) = 1e25_dp
ATOL(ind_PCO) = 1e25_dp
ATOL(ind_PH2O2) = 1e25_dp
ATOL(ind_POx) = 1e25_dp
ATOL(ind_PSO4) = 1e25_dp
END IF
This is trivial to resolve. - Set the tols arbitrarily high. - use a logical vectorI like option (1) for simplicity. Option (2) adds a new data structure. Tho maybe all the overhead of (2) could be done at preprocessing. Add a tag to a species that says keep it out of the error norm. Store the info in the stack. We should discuss the best option. I’m on vacation thru June 23 but will monitor email as I can.
Thanks @obin1 @msl3v. Another suggestion is to add entries for ATOL
to the GEOS-Chem species_database.yml
file, which could be read into the State_Chm%SpcData
object (and then have a default ATOL if one is not specified). That would also be easy to implement. I will work on this when I get back. I am also on vacation until June 24th.
I agree, it would be nice to have a data structure internal to KPP that would flag if it could be added to the error norm or not. That will take some more thought but might be worth it in the long run. Open to your thoughts.
Also see related GEOS-Chem issue: https://github.com/geoschem/geos-chem/issues/1753
Update: This issue has been fixed in GEOS-Chem 14.5.0 by externally setting the ATOL values. But we can leave this issue open as it would be nice to eventually have an internal fix for this in KPP. I think we'd need to have some kind of mask array set to 1's or 0's that we can multiply the rate constants by in order to exclude the dummy species (I think @msl3v suggested that before).
We realized today that when putting Prod/Loss terms for families into the mechanism, KPP uses the errors of these terms in the error norm calculation. In one example, the CH4 loss term, LCH4, has a high relative error, suggesting that it imparts a significant impact on KPP's behavior (step acceptance/rejection, internal step size). The prod/loss terms should be entirely passive. They should not be considered in the convergence evaluation. So I would like to implement a fix that allows ros_ErrorNorm() to ignore them in computing Err.
Can y'all think of a best way to do this?