ESMCI / cime

Common Infrastructure for Modeling the Earth
http://esmci.github.io/cime

Increase standard_name length in entry_id_pg.xsd #4625

Closed jtruesdal closed 4 months ago

jtruesdal commented 4 months ago

I'm running into a build-namelist error for my new CCPP routine, dadadj, which contains a namelist entry whose standard name is longer than the current 63-character limit. Existing CCPP standard names can be much longer than the 63-character limit imposed by the schema file, and when searching the CCPP documentation I did not find a length limit for standard names. If there is no maximum length for standard names, I would recommend removing the 63-character limit. The one-line modification to remove the length check would be:

12 - 12 +

gold2718 commented 4 months ago

This limit was imposed because that is the maximum length of a Fortran symbol and at the time there was (or was contemplated) some use of standard names as Fortran symbols. Did you try the suggested change and do a successful and complete model build?

jtruesdal commented 4 months ago

Thanks for the background; I was wondering where 63 came from. With the suggested fix, the CAM-SIMA model builds and runs to completion, passing the simple BFB (bit-for-bit) test that is implemented using snapshot files. Just FYI, the units_type pattern match defined in entry_id_pg.xsd also has no length limit, so this change would be consistent with another character-based type.

gold2718 commented 4 months ago

Given this, I think this should be noted as a restriction that CAM-SIMA must follow.

jtruesdal commented 4 months ago

Thanks for making me take a look at this. My proposed fix is incorrect. The previous restriction was a string that began with a single letter (a-z, uppercase or lowercase) followed by 0 to 63 characters from [A-Za-z0-9_]. I just want to remove the limit and allow a string of 1 or more characters. (Note that a standard_name string can currently be as long as 64 characters, since an initial letter is required and up to 63 more characters are allowed; that exceeds the 63-character limit of a Fortran symbol.)

To remove the limit and stay consistent with the previous pattern match, I need to use a * to allow 0 or more [A-Za-z0-9_] characters after the initial [A-Za-z] character. So the fix should be:

12 +

And the noted restriction on standard_name would be: a string beginning with a letter (a-z, uppercase or lowercase) followed by 0 or more characters, each of which is a letter (a-z, uppercase or lowercase), a digit (0-9), or an underscore.

Does that sound OK?
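
The difference between the bounded and unbounded rules described above can be checked with a short Python sketch. The two regular expressions are reconstructed from the prose description in this comment and are assumptions; they are not copied from entry_id_pg.xsd, and the helper name `is_valid` is hypothetical. `re.fullmatch` is used because an XSD pattern must match the entire value.

```python
import re

# Assumed reconstructions of the two rules described in the comment above;
# the actual text of entry_id_pg.xsd is not reproduced in this thread.
OLD_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,63}")  # at most 64 characters in total
NEW_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")       # 1 or more characters, no upper bound


def is_valid(pattern: re.Pattern, name: str) -> bool:
    """Return True if the whole string matches, mirroring XSD's implicit anchoring."""
    return pattern.fullmatch(name) is not None


name_64 = "a" * 64  # one leading letter plus 63 more characters
name_65 = "a" * 65  # one character beyond what the old rule allows

print(is_valid(OLD_PATTERN, name_64))  # True: the old rule accepts 64 characters, not 63
print(is_valid(OLD_PATTERN, name_65))  # False
print(is_valid(NEW_PATTERN, name_65))  # True: no length limit
print(is_valid(NEW_PATTERN, "9_bad"))  # False: must start with a letter
print(is_valid(NEW_PATTERN, ""))       # False: at least one character is required
```

The first result illustrates the off-by-one noted above: a required initial letter plus up to 63 further characters admits names one character longer than a 63-character Fortran symbol.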

gold2718 commented 4 months ago

Yes, this looks correct (sorry, missed the plus sign last time). Note that this rule follows the Guidelines for Construction of CF Standard Names which is adopted in the CCPP Standard Name Rules (rule 2).