atmtools / arts

The Atmospheric Radiative Transfer Simulator
https://www.radiativetransfer.org/
Other
63 stars 28 forks source link

CO2 absorption calculation fails because partition function coeffs are 0 for CO2-737 #350

Closed olemke closed 3 years ago

olemke commented 3 years ago

Describe the bug Since the new isotopologue CO2-737 was added, absorption calculations for CO2 contain NaNs because the partition function coefficients are set to 0.

To Reproduce Steps to reproduce the behavior:

  1. Set abs_species to "CO2"
  2. Make an absorption calculation

Red areas mark NaNs for a calculation with a 750e9 Hz cutoff. Without cutoff, y contains only NaNs. co2nans

The calculation succeeds if CO2-737 is excluded.

If applicable, error output:

Run-time error in controlfile: co2nans.arts
Run-time error in method: yCalc
User Error: failed
Error is found at:
        /Users/olemke/Hacking/uhh/arts/src/m_rte.cc:1556
Please follow these instructions to correct your error:
User Error: std::isnan(y[ii])
Error is found at:
        /Users/olemke/Hacking/uhh/arts/src/rte.cc:2356
Please follow these instructions to correct your error:
One or several NaNs found in *y*.
Stopping ARTS execution.
Goodbye.

Expected behavior Absorption calculation should succeed. The current solution for a full CO2 calculation is excluding CO2-737 explicitly by setting abs_species to "CO2-626, CO2-627, CO2-628, CO2-636, CO2-637, CO2-638, CO2-727, CO2-728, CO2-828, CO2-837, CO2-838". Which is very cumbersome. We should either provide the partition function coefficients for all species or exclude isotopologues with 0 coeffs automatically from the calculation or at least throw an error.

Additional context Same applies for other recently added isotopologues: NO2-656, OCS-634, CS2-*, C2N2-*, COCl2-*

riclarsson commented 3 years ago

Is this with the flag for bad partition functions active? If so, then this is acceptable behavior. Otherwise, it must clearly throw in the relevant check function.

Edit: I am very much against any kind of automatic exclusion based on what we actively consider mostly bad built-in data.

olemke commented 3 years ago

Is this with the flag for bad partition functions active? If so, then this is acceptable behavior. Otherwise, it must clearly throw in the relevant check function.

The flag for bad partition functions was not active.

Edit: I am very much against any kind of automatic exclusion based on what we actively consider mostly bad built-in data.

But ARTS not being able to do a simple CO2 absorption calculation out of the box cannot be the user experience we're aiming for.

How would I go about to set up a "proper" CO2 calculation with "correct" partition functions?

riclarsson commented 3 years ago

The flag for bad partition functions was not active.

OK, the test should definitely catch this. That's a bug here.

How would I go about to set up a "proper" CO2 calculation with "correct" partition functions?

Compile TIPS. All of it, including the species not supported by ARTS or HITRAN. Map those species back to ARTS automatically or throw if the species does not exist in the compiled TIPS data when it is accessed.

The limitations are great

1) This compilation has to be done from one or more data files. Otherwise we cannot reproduce old results reliably. So you have to distribute the data. The data is MIT licensed.

2) ARTS only has runtime species. You would effectively need two separate species lists in ARTS to do this, one compile time TIPS species list and another runtime ARTS species list. Or you would have to fix the species in ARTS. I only think you should consider the second option because I cannot see it being maintainable to go on with this weird runtime thing that breaks over standard changes if there's also a compile time list on the side.

3) You still need to support the legacy way that are the builtin functions of today.

Most of this could be compile time options in cmake. "Select version of TIPS" is -DSELECT_TIPS=PATH/TO/DATA and "use the old code" is -DENABLE_LEGACY_PARTITION_FUNCTIONS=1. The TIPS-style library would have to throw a user-error if there's no partition functions for the species they select.

Edit: Or update tips.xml in arts-xml-data manually. Keep a copy of the old though, since it might have been used in publications so it is not allowed to change. Keep doing this every time you find a species you need to add to ARTS.

riclarsson commented 3 years ago

@olemke I now have a basic prototype for how to do partition functions from data files to compile-time workings. It is a bit ugly currently, but I think these aesthetics can be fixed. I will describe how you get to a point where the partition functions work from compile-time below before listing what I think would need to change in addition to this for it to be useful in ARTS.

You use the prototype as follows: 1) The python file goes into its own directory with the QTpy/ folder of data from hitran.org's TIPS tab. It just fills an arts-data/ folder with TIPS data in the format I require. (This is just to generate data I can understand.) Run the file and fill up arts-data/ with partition function data per species. 2) Build the gen_auto_partfun.cc file. Put first all the C++ files in ARTS src/. Add the CMake-diff text to the src/ CMake file. (You now have to use the layout I have for pugixml in https://github.com/riclarsson/radctrl/tree/master/3rdparty/pugixml, including the "add_subdirectory(pugixml)" in ARTS 3rdparty/ because I could not figure out how to use ARTS XML IO before artscore is built. This should be fixed ahead of shipping so the same XML IO is used consistently.) Run the generated binary with the path to the arts-data/ folder from the python file and output the standard stream output into an auto_partfun.h file in build/src/. 3) Include partfun.h in any ARTS file that requires the partition functions, it includes auto_partfun.h. Use with any IsotopeRecord you create. It will give you a compile time value or throw at runtime, depending on how you use it.

File with files for inside ARTS: partfun.zip

File with file for outside ARTS (A simple modification of the HITRAN file): TIPS_2017_v1p0_mod.zip

For this prototype to be useful though, It requires some things to change in ARTS: 1) Distribute the valid data of partition_function_data.cc as XML-files together with ARTS. By default, build link to this. If there's an ARTS_XML_DATA path defined, use the TIPS data from there instead. (I think the script above generates some errors since ARTS is not using the same naming convention as HITRAN, but it is basically the same data.) 2) All species data should be replaced with the new IsotopeRecord-format. Mapping to this format should happen via functions that understands this format somehow. 3) All classes that deals with species somehow (AbsorptionLines, SpeciesTag, QuantumIdentifier, and so on) should use either IsotopeRecord or Species::Species.

I think it is a fairly deep change for the ARTS core, but that not much should change for users as the internal species representation is never supposed to be seen by anyone.

Maintainability is also a lot cleaner, as adding a new IsotopeRecord changes nothing in running/compiling code, and adding a new Species can be ensured to work with either the existing compile time warnings or by a simple static_assert.

Also, today adding a new species means modifying an already difficult file. Like this, it would mean just adding one more file to a tree of others. Not only is this more consistent with how we do line-absorption, it's just a lot cleaner. New stuff is new, not modified old stuff.

olemke commented 3 years ago

Thanks for putting this together. I'll try to set it up according to your instructions and have a look at the code. Regarding the separation of the XML routines, it's (almost) impossible to separate them out completely from artscore. The XML routines interact with all the classes defined all over the sources of artscore. Even when putting them into a separate library, it would require a link time dependency on artscore to use them. Only a subset of types independent from the artscore such as the ones from matpack and maybe GriddedField could be provided in a standalone library.

We should discuss this substantial refactoring of the partition data in the next ARTS meeting.

riclarsson commented 3 years ago

About XML-routines. I think if only Vector and Matrix were available, that would be enough here.

Den ons 19 maj 2021 kl 15:09 skrev Oliver Lemke @.***>:

Thanks for putting this together. I'll try to set it up according to your instructions and have a look at the code. Regarding the separation of the XML routines, it's (almost) impossible to separate them out completely from artscore. The XML routines interact with all the classes defined all over the sources of artscore. Even when putting them into a separate library, it would require a link time dependency on artscore to use them. Only a subset of types independent from the artscore such as the ones from matpack and maybe GriddedField could be provided in a standalone library.

We should discuss this substantial refactoring of the partition data in the next ARTS meeting.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/atmtools/arts/issues/350#issuecomment-844090860, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA2XQOAJR35VNYNQMENHQITTOO2BTANCNFSM44RIOYPA .

riclarsson commented 3 years ago

354 Closes this