wacl-york / mcm-web

Code for the MCM web application

Exporting to MUSICA configuration #337

Open K20shores opened 1 month ago

K20shores commented 1 month ago

Hi @stulacy

I'm a software developer in NCAR's ACOM lab on the MUSICA project. We are very interested in running the MCM in our box model, called music box.

We read JSON files that define the mechanism used to run our box model. Rather than providing equations compiled into code, our format defines the data needed to run the mechanism. For instance, an Arrhenius reaction would be defined like this:

                {
                    "type": "ARRHENIUS",
                    "A": 8.018e-17,
                    "reactants": {
                        "O": {},
                        "O2": {}
                    },
                    "products": {
                        "O3": {}
                    },
                    "MUSICA name": "R2"
                },

I see that you have a database file which seems to define all of your reactions. It seems that the rates are defined as equations (e.g., 2.20D-13*KMT06*EXP(600/TEMP)) rather than as data.

Is there possibly a separate database which contains only the data, without the equations written out? If not, would you happen to have any documentation which explains your database schema? We are interested in creating a valid MUSICA mechanism that would allow the MCM to be run in music box, and having the reaction data or understanding your schema would greatly help us do so.

stulacy commented 1 month ago

Hi @K20shores , this looks like a really interesting project and we'd be happy to help integrate the MCM.

We unfortunately don't currently have any publicly accessible documentation regarding the database schema, but if you don't mind getting your hands dirty a little you can run .schema in the SQLite REPL to view the full schema. Or you can run .schema <table> for a specific table.
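For anyone who would rather inspect the schema from a script than from the REPL, here is a minimal sketch using Python's standard sqlite3 module. The helper name dump_schema is hypothetical, and the sketch assumes only that the database is an ordinary SQLite file; it reads the CREATE statements that SQLite stores in its sqlite_master table, which is roughly what .schema prints.

```python
import sqlite3


def dump_schema(conn, table=None):
    """Return the stored CREATE statements, similar to `.schema [table]`.

    `dump_schema` is an illustrative helper, not part of mcm-web.
    """
    # sqlite_master holds one row per table, index, view and trigger.
    # Auto-generated indexes have a NULL `sql` column, so skip those.
    query = "SELECT sql FROM sqlite_master WHERE sql IS NOT NULL"
    params = ()
    if table is not None:
        # tbl_name also matches indexes and triggers attached to the table.
        query += " AND tbl_name = ? COLLATE NOCASE"
        params = (table,)
    return [row[0] for row in conn.execute(query, params)]
```

Usage would look something like `dump_schema(sqlite3.connect("path/to/database"), "Reactions")`, where the path is a placeholder for wherever the database file lives locally.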

As an example then, the main table is Reactions, which stores all the reactions in the MCM (and the CRI). This has a field called Rate, where yes our rates are stored as plain text rather than as discrete parameters. This is to allow easy integration with modelling software that can directly parse these rates. This field is an FK to the Rates table.

There are 17,224 reactions in the MCM.

SELECT COUNT(DISTINCT ReactionID) FROM Reactions WHERE Mechanism = 'MCM';
17224

which correspond to 2,955 unique rate constants.

SELECT COUNT(DISTINCT RATE) FROM Reactions INNER JOIN Rates USING(Rate) WHERE Mechanism = 'MCM';
2955

Looking at the schema for Rates shows that it has an FK, RateType, to the RateTypes table. The vast majority of rates don't have a rate type (we could probably have put a default value here for explicitness' sake), while the other 2 types are Tokenized and Photolysis.

SELECT RateType, COUNT(*) FROM Rates GROUP BY RateType;
RateType    COUNT(*)
----------  --------
NULL        2705    
Photolysis  92      
Tokenized   462     

The rates that don't have a rate type are a mixture of what would be Arrhenius in your taxonomy (e.g. 3.10D-12*EXP(340/TEMP)*0.2), constant first-order loss rates (e.g. 2.23D-13), and possibly others.
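To make the mapping from untyped rate strings to data concrete, here is a rough sketch that recognises the two shapes mentioned above (a product of numeric constants, optionally with EXP(c/TEMP) factors) and folds them into Arrhenius-style parameters. The function name to_arrhenius is hypothetical, and the output key "C" for the temperature coefficient in k = A*exp(C/T) is an assumption about the MUSICA format that should be checked against its documentation; only "type" and "A" appear in the JSON example earlier in this thread. Anything else, such as a rate referencing KMT06, is rejected rather than guessed at.

```python
import re

# A plain number with an optional Fortran/FACSIMILE-style exponent (3.10D-12).
_NUMBER = r"(?:\d+\.?\d*|\.\d+)(?:[DdEe][+-]?\d+)?"
_EXP_FACTOR = re.compile(r"EXP\((-?" + _NUMBER + r")/TEMP\)")


def _to_float(text):
    """Convert '3.10D-12' style literals to a Python float."""
    return float(text.replace("D", "E").replace("d", "e"))


def to_arrhenius(rate):
    """Return Arrhenius-style parameters, or None if the rate is not that simple.

    Hypothetical helper: the "C" key is an assumed name for the coefficient
    in k = A * exp(C / T).
    """
    a, c = 1.0, 0.0
    for factor in rate.replace(" ", "").split("*"):
        if re.fullmatch(_NUMBER, factor):
            a *= _to_float(factor)  # constant factors multiply into A
            continue
        match = _EXP_FACTOR.fullmatch(factor)
        if match is None:
            return None  # e.g. KMT06, species names, '@' powers
        # exp(x/T) * exp(y/T) == exp((x + y)/T), so coefficients add.
        c += _to_float(match.group(1))
    return {"type": "ARRHENIUS", "A": a, "C": c}
```

With this sketch, 3.10D-12*EXP(340/TEMP)*0.2 collapses to a single pre-exponential factor of about 6.2e-13 with a coefficient of 340, while 2.20D-13*KMT06*EXP(600/TEMP) returns None and would need separate handling.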

The Photolysis rates are parameterised as described on the website, with these parameters defined in PhotolysisRates and PhotolysisParameter tables.

The Tokenized rates are what we refer to on the website as simple/generic rates and complex rates. These are rates that are defined hierarchically, and I think encompass the ternary and troe groups that you use.

Let me know if you need anything explaining in more detail.

K20shores commented 1 month ago

Hi @stulacy, thank you for that. Several of the reactions contain the @ symbol, like this one: 5.6D-34*N2*(TEMP/300)@(-2.6)*O2

Is @ used to indicate a power here so that the rate would be this?

$$ 5.6\cdot10^{-34} \cdot [\mathbf{N2}] (\frac{\mathbf{TEMP}}{300})^{-2.6} \cdot [\mathbf{O2}] $$

This code leads me to believe so, but the comment seems to suggest that there is another case, in which the parentheses contain more than just a number:

https://github.com/wacl-york/mcm-web/blob/5c9b29445672ea5e92f226118ba928edc9311b4a/helpers/helpers.rb#L28-L43

Also, just out of curiosity, is there a rhyme or reason for the numbering of the J values? For example, J<9> doesn't appear to exist.

K20shores commented 1 month ago

Ah, I see further down about the expression inside of an @ expression.

stulacy commented 1 month ago

Exactly that regarding the @. The rates are written in a FACSIMILE-compatible syntax; the Technical Reference has full details.

There was a reason for the photolysis rates being non-sequential, but it was before my time, so I'm afraid I can only chalk it up to 'historical reasons'.
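As a quick way to sanity-check that reading of @, here is a small sketch that rewrites a rate string into a Python expression and evaluates it. The function names are hypothetical, and it handles only the two notations discussed in this thread (D exponents and @ as power); it assumes the operands of @ are parenthesised as in the example above, since Python's ** may not share FACSIMILE's precedence rules otherwise. It uses eval, so it is only suitable for trusted strings such as those in the MCM database.

```python
import math
import re

# A numeric literal with a D exponent that is not part of a longer
# identifier or number (so the 2 in "N2" is left alone).
_D_EXPONENT = re.compile(r"(?<![\w.])(\d+\.?\d*|\.\d+)[Dd]([+-]?\d+)")


def facsimile_to_python(rate):
    """Rewrite D exponents as e exponents and '@' as '**'."""
    expr = _D_EXPONENT.sub(r"\1e\2", rate)
    return expr.replace("@", "**")


def evaluate_rate(rate, **variables):
    """Evaluate a rate given values for TEMP, N2, O2, etc.

    Illustrative only: eval() must not be used on untrusted input.
    """
    names = {"EXP": math.exp, "LOG10": math.log10, **variables}
    return eval(facsimile_to_python(rate), {"__builtins__": {}}, names)
```

For example, `facsimile_to_python("5.6D-34*N2*(TEMP/300)@(-2.6)*O2")` gives `5.6e-34*N2*(TEMP/300)**(-2.6)*O2`, which matches the formula proposed above; `evaluate_rate` can then be called with keyword arguments such as `TEMP=298.15` plus number densities for N2 and O2.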