wacl-york / mcm-web

Code for the MCM web application
1 stars 2 forks source link

KPP export working #177

Closed stulacy closed 8 months ago

stulacy commented 12 months ago

This is a high priority feature and will be worked on once the new release is live. This in theory should just require a few modifications to the code that runs on the old site.

stulacy commented 10 months ago

A basic KPP export was added in #210 , although there are several outstanding issues:

RolfSander commented 10 months ago

Hello Stuart,

Thanks for all your work of implementing the KPP export! I have made a quick test on the new MCM web page, and it works fine.

My next plans are:

I think we can finish these tasks until early next year, maybe January.

RolfSander commented 10 months ago

I just noticed that the MCM home page needs to be updated as well. It currently says:

(currently only FACSIMILE is supported, although KPP will be added in an upcoming release).

stulacy commented 10 months ago

Good spot Rolf, I've updated the home page accordingly.

Your plan sounds sensible, I look forward to working with you to polish off this functionality. Please update this issue when you have an idea for the next development phase.

RolfSander commented 9 months ago

Hello Stuart,

I have now created a minimum working example (MWE) which shows how to use the MCM-generated files for KPP. You can find it in our KPP repository in the mcm branch:

https://github.com/KineticPreProcessor/KPP/tree/mcm/examples/mcm

We have also discussed the most flexible file structure for external input. We now suggest to move all external input out of mcm_isoprene.eqn and into a separate file called constants_mcm.f90. The new file constants_mcm.f90 is a static file that is the same for all extracted mechanisms. It only needs to be adjusted when there are any changes in the MCM Generic Rate Parameters (https://mcm.york.ac.uk/MCM/rates/generic, https://mcm.york.ac.uk/MCM/rates/complex, or https://mcm.york.ac.uk/MCM/rates/photolysis). Ideally, mcm_isoprene.eqn and constants_mcm.f90 should look like this:

https://github.com/KineticPreProcessor/KPP/blob/mcm/examples/mcm/mcm_isoprene.eqn

https://github.com/KineticPreProcessor/KPP/blob/mcm/examples/mcm/constants_mcm.f90

Would you be able to generate mcm_isoprene.eqn automatically in this format with your export function?

With respect to our open questions, the following approaches have been implemented in the MWE:

1) Although previously USE constants and CALL mcm_constants were dysfunctional, the idea behind them is probably still be the best approach. Minor name changes were necessary, and we now only have these two lines inside the F90_RCONST block in mcm_isoprene.eqn:

  #INLINE F90_RCONST
    USE constants_mcm
    CALL define_constants_mcm
  #ENDINLINE {above lines go into the SUBROUTINES UPDATE_RCONST and UPDATE_PHOTO}

The F90_GLOBAL block can be deleted completely from mcm_isoprene.eqn.

The variables M, N2, O2, RO2, H2O and zenith are now all declared in constants_mcm.f90. It is up to the user to assign meaningful values to them (except for RO2 which is already defined).

2) H2O and PROD:

Indeed, we can never be sure if a user has H2O as a chemical species in their mechanism or not. Even though reaction <19> produces H2O, the user may not export these inorganic reactions. Therefore, it is probably best to leave H2O in the rate constants unchanged, i.e., do not convert it to C(ind_H2O). Instead, it is up to the user to define H2O properly. Either they set H2O = C(ind_H2O), or they define it in another way, e.g., H2O = myfunc(humidity).

Reactions <2> and <19> should simply produce PROD.

3) Mapping of photolysis rate constants (j-values)

All solutions discussed so far are problematic. Even if the current MCM numbering remains unchanged in the future, the mapping will be different for different mechanisms. Let's say that a user downloads the mechanism for CH4 first, but later they decide that they want to have isoprene as well. When adding the isoprene reactions, they will have the same index in the J array being used for different photolysis reactions.

We need a solution that works out-of-the-box for the MWE but can be adjusted for improved efficiency, e.g., for global atmospheric chemistry models. I think we can achieve this with named constants in a static file that contains all j values. I have now assigned a name to each j values, you can see the names in constants_mcm.f90. What do you think about this approach?

4) Don't worry about #INCLUDE atoms.kpp. For backward compatibility, I have created a link from atoms to atoms.kpp in the KPP repository.

5) My last point is pretty unimportant, it's just my personal preference: The header section of mcm_isoprene.eqn contains 48 lines which are written as a KPP comment, using the curly braces syntax with {...}. When looking at individual lines, it may not be obvious that we're inside a comment block. Note that KPP can also define comments in preprocessor style: All lines starting with // are treated as comments. Maybe that's a better way to define a large number of comment lines?

stulacy commented 9 months ago

Hi @RolfSander

Thanks for getting back to me with these suggestions, they largely look straight forward to implement and I'm happy to change the block comment notation to the preprocessor style for the header.

Having the complete set of photolysis rates makes sense, but would you mind if I changed the constant names to be the MCM J numbers? I.e. J_56 = 34 rather than J_NOA = 34. This would be much easier to automate on our side, and we can assure that those J numbers don't change in the future.

RolfSander commented 9 months ago

Hello @stulacy,

Thanks for getting back to me with these suggestions, they largely look straight forward to implement and I'm happy to change the block comment notation to the preprocessor style for the header.

Thanks!

Having the complete set of photolysis rates makes sense, but would you mind if I changed the constant names to be the MCM J numbers? I.e. J_56 = 34 rather than J_NOA = 34. This would be much easier to automate on our side, and we can assure that those J numbers don't change in the future.

If you prefer J_56 over J_NOA, that's perfectly fine. Technically (memory usage for the J array etc.) this should make no difference. My idea behind the named constants was that chemists could understand the meaning of J_NOA directly, while looking up J_56 in a table is more of an effort.

Anyway, the only really important point is that the J numbers don't change in the future.

stulacy commented 9 months ago

We can go with the human readable J rates, that's fine.

I've just noticed something else that might be a potential issue. In the constants_mcm.f90 file, the RO2 sum is using the peroxy radicals that are in the isoprene mechanism. However, these peroxies are specific to the submechanism being exported, so should they be included within the global constants file, or within the exported .eqn?

If it was kept within the constants then it should comprise all peroxies within the full MCM, but (from my limited understanding of KPP), that will cause problems when some of the C(ind_x) values aren't in the mechanism. Or am I misunderstanding?

RolfSander commented 9 months ago

We can go with the human readable J rates, that's fine.

Thanks!

In the constants_mcm.f90 file, the RO2 sum is using the peroxy radicals that are in the isoprene mechanism. However, these peroxies are specific to the submechanism being exported, so should they be included within the global constants file, or within the exported .eqn? If it was kept within the constants then it should comprise all peroxies within the full MCM, but (from my limited understanding of KPP), that will cause problems when some of the C(ind_x) values aren't in the mechanism. Or am I misunderstanding?

Good point, thanks for spotting this! Yes, you're right, having unused C(ind_x) would cause problems. The definition of RO2 has to be moved back into the exported .eqn file.

The best place is just before CALL define_constants_mcm inside the F90_RCONST block, i.e.:

#INLINE F90_RCONST
  USE constants_mcm
  RO2 = C(ind_CH3O2) + C(ind_C51O2) + C(ind_CH3COCH2O2) + C(ind_HOCH2CO3) + &
    C(ind_CO2C3CO3) + C(ind_IPRHOCO3) + C(ind_BIACETO2) + C(ind_CH3CO3) + &
    [...]
  CALL define_constants_mcm
#ENDINLINE {above lines go into the SUBROUTINES UPDATE_RCONST and UPDATE_PHOTO}
stulacy commented 9 months ago

I've updated the web-app to generate both mcm_export.eqn and constants_mcm.f90 automatically. Do they look suitable to you @RolfSander ?

RolfSander commented 9 months ago

Hello @stulacy,

Thanks for the updated implementation!

I had to make a few changes to the files, and then I was able to reproduce exactly the same results with my MWE as before. I think all of the issues should be very easy to fix:

1) The letter R is missing in SUBOUTINE:

   END SUBOUTINE define_constants_mcm

2) You have moved the comment about peroxy radicals into the F90_RCONST block. This is good because that's where it should be. However, please note that this block is transfered verbatim into the KPP-generated Fortran files. Therefore, comments must follow Fortran syntax, not KPP syntax. Could you change the lines to:

  ! Peroxy radicals
  ! WARNING: The following species do not have SMILES strings in the database. 
  !          If any of these are peroxy radicals the RO2 sum will be wrong! 
  ! O O3 NO NO2 NO3 O1D N2O5 OH HO2 H2 CO H2O2 HONO HNO3 HO2NO2 SO2 SO3 HSO3 NA
  ! SA CL

3) In constants_mcm.f90, there are now duplicate declarations of KRO2HO2, KRO2NO and KRO2NO3. These need to be removed.

4) You have added the new complex rate constant K16ISOM which depends on HO2, NO and NO3. I think we can treat these species in the same way as H2O. We declare the variables, and if a user needs it, they can assign suitable values themselves. Thus, the line

  REAL(dp) :: M, N2, O2, RO2, H2O, zenith

needs to be changed to:

  REAL(dp) :: M, N2, O2, RO2, H2O, HO2, NO, NO3, zenith

5) Finally, I encountered a problem with the preprocessor style of KPP comments. After some time, I figured out that KPP needs a space after the //, otherwise KPP produces an error. This seems to be a bug in KPP which we need to fix. For now, it would help us if you could add a space after the two slashes in the comments.

stulacy commented 8 months ago

Cheers Rolf. The syntax/typos are straight forward, but issues 3&4 worry me a bit as this is auto-generated from the database and should be correct, which suggests that there's a flaw in my logic somewhere that I'll need to fix. I'm busy today and am only at work for 2 days next week so it might not be until after Christmas that I fix this.

RolfSander commented 8 months ago

No problem, take your time. Let's continue next year...

Have a nice Christmas time!

stulacy commented 8 months ago

Merry Christmas to you too!

stulacy commented 8 months ago

Hi Rolf, I found some time to look at this this morning and the complex rate issue was far simpler than I'd expected (I'd forgotten to take into account the website now hosts both the MCM and the CRI). Have a look at the updated exported files and let me know if they look OK now. constants_mcm.f90 mcm_export.eqn

RolfSander commented 8 months ago

Great, thanks!

The files are (almost) perfect now.

The only remaining issue is O2 in reactions <2> and <19>. We've discussed this earlier, but I forgot to mention it again in my recent posts. The easiest (and my preferred) solution would be to change the products in these reactions to PROD.

However, if you would like to keep the products, the line

O2 = IGNORE ;

would be necessary in the #DEFVAR section. We would then also have to check carefully what we do with the product H2O in reaction <19>.

stulacy commented 8 months ago

Oops I missed the PROD part of your original comment. I've made the change here so eqns 2 & 19 use PROD: mcm_export.eqn

I'm a bit confused with how the species are handled and the difference between defining a species in the mechanism file (with = IGNORE) and defining it as a variable in the Fortran constants_mcm.f90 file. Currently H2O is defined in the mechanism file with H2O = IGNORE ; and used in some reaction rates (e.g. 13), but as H2O rather than C(ind_H2O). It's also initialised in the Fortran code as a REAL and used in a complex rate definition (e.g. KMT06). O2 isn't defined in the mechanism file but is used in reaction rates (e.g. 1), and is also initialised in the constants file and used in complex rates (e.g. KMT18).

Is that correct / as expected?

RolfSander commented 8 months ago

Oops I missed the PROD part of your original comment. I've made the change here so eqns 2 & 19 use PROD: mcm_export.eqn

Thanks! Now it is (completely) perfect :-)

I'm a bit confused with how the species are handled and the difference between defining a species in the mechanism file (with = IGNORE) and defining it as a variable in the Fortran constants_mcm.f90 file. Currently H2O is defined in the mechanism file with H2O = IGNORE ; and used in some reaction rates (e.g. 13), but as H2O rather than C(ind_H2O). It's also initialised in the Fortran code as a REAL and used in a complex rate definition (e.g. KMT06). O2 isn't defined in the mechanism file but is used in reaction rates (e.g. 1), and is also initialised in the constants file and used in complex rates (e.g. KMT18). Is that correct / as expected?

Yes, this can be confusing. It's important to note that KPP species and Fortran variables are completely different things. For most KPP species, there is no Fortran variable with the same name. To access the chemical XYZ in Fortran, you have to use the corresponding element C(ind_XYZ) from the C array.

Only for N2, O2 and H2O, the MCM uses Fortran variables with the same name. They of course refer to the same chemical. However, from a programming point of view, they are not related at all. If a user wants to use the water from KPP as input for humidity-dependent rate constants, they have to define in their Fortran code:

H2O = C(ind_H2O)

Alternatively, they could adopt the humidity from the meteorological part of their model, e.g.:

H2O = myfunc(humidity)

Part of the confusion may result from the fact that a reaction in the *.eqn file contains two different syntaxes: It starts with KPP syntax until the colon, and then it switches to Fortran syntax until the semicolon. For example, O3 in reaction <1> is a KPP species, whereas O2 in the same line is a Fortran variable.

HTH :-)

stulacy commented 8 months ago

Brilliant, I've made those changes live. Feel free to reopen this ticket if you spot anything else!

Also thank you for that explainer, that's helped to solidify my mental model of how the 2 files interact.