GRIPS-code / pyLBL

Python line-by-line radiative transfer model
GNU Lesser General Public License v2.1

Use HAPI 2 for HITRAN database management. #5

Closed. menzel-gfdl closed this issue 1 year ago

menzel-gfdl commented 3 years ago

With HAPI 2 now in its alpha state, I have been playing around with implementing it to manage the line-by-line input parameters.

For example, a local copy of the database can be created as follows:

from hapi2 import fetch_isotopologues, fetch_molecules, fetch_parameter_metas, \
                  fetch_transitions, init, Molecule, Transition

init()

def create_local_database(numin=0, numax=60000, line_list="line-list"):
    """Downloads HITRAN line-by-line data and creates a local SQL database.

    Args:
        numin: Wavenumber lower bound [cm-1].
        numax: Wavenumber upper bound [cm-1].
        line_list: Name for temporary files.
    """
    fetch_parameter_metas()
    molecules = fetch_molecules()
    for molecule in molecules:
        # Chlorine Nitrate is skipped explicitly.
        if str(molecule) in ["Chlorine Nitrate"]:
            continue
        try:
            isotopologues = fetch_isotopologues(molecule)
            # The return value is not needed here.
            fetch_transitions(isotopologues, numin, numax, line_list)
        except Exception as e:
            # Ignore only the "failed to retrieve" error; re-raise anything else.
            if str(e) != "Failed to retrieve data for given parameters.":
                raise

After the database is created, the line parameters can be queried with:

def load_line_parameters(formula, numin=0, numax=60000):
    """Reads the HITRAN molecular line parameters from a local SQL database.

    Args:
        formula: String chemical formula (e.g., H2O).
        numin: Wavenumber lower bound [cm-1].
        numax: Wavenumber upper bound [cm-1].

    Returns:
        A list of Transition objects.
    """
    return Molecule(formula).transitions.filter(Transition.nu>=numin).filter(Transition.nu<=numax)

@RobertPincus I think it would be a good idea for us to decide what format the line-by-line parameters should be returned in, so the database management can be cleanly separated from the part of the code responsible for computing the absorption coefficients. It's also worth discussing how we control which parameters are read from the database. I believe the above code example only gives the user access to the typical 160-character HITRAN ".par" parameters.
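
On controlling which parameters are read, here is a minimal sketch of one option: the caller names the parameters it wants, and only those columns are selected. It uses Python's built-in sqlite3 with a hypothetical `transitions` table and hypothetical column names. This is not HAPI 2's actual schema, just an illustration of the idea.

```python
import sqlite3

# Hypothetical column names for illustration; not HAPI 2's actual schema.
ALLOWED_COLUMNS = ("nu", "sw", "elower", "gamma_air", "gamma_self", "n_air", "delta_air")


def read_parameters(connection, columns, numin=0, numax=60000):
    """Reads only the requested line parameters in a wavenumber range.

    Args:
        connection: Open sqlite3 connection containing a "transitions" table.
        columns: Names of the parameters to read.
        numin: Wavenumber lower bound [cm-1].
        numax: Wavenumber upper bound [cm-1].

    Returns:
        A list of tuples, one per line, ordered by wavenumber.
    """
    unknown = [c for c in columns if c not in ALLOWED_COLUMNS]
    if unknown or not columns:
        raise ValueError(f"invalid parameter selection: {list(columns)}")
    # Column names are validated above, so they are safe to interpolate;
    # the wavenumber bounds are passed as bound parameters.
    query = (f"SELECT {', '.join(columns)} FROM transitions "
             "WHERE nu >= ? AND nu <= ? ORDER BY nu")
    return connection.execute(query, (numin, numax)).fetchall()


def make_example_database():
    """Builds a tiny in-memory database with made-up values."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE transitions (nu REAL, sw REAL, elower REAL, gamma_air REAL,"
        " gamma_self REAL, n_air REAL, delta_air REAL)")
    rows = [
        (100.0, 1.0e-22, 10.0, 0.07, 0.35, 0.7, -0.001),
        (500.0, 2.0e-22, 20.0, 0.08, 0.40, 0.6, -0.002),
        (900.0, 3.0e-22, 30.0, 0.09, 0.45, 0.5, -0.003),
    ]
    connection.executemany("INSERT INTO transitions VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    return connection
```

For example, `read_parameters(make_example_database(), ["nu", "sw"], numin=200, numax=1000)` returns only the line centers and strengths of the two lines in that range.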

RobertPincus commented 3 years ago

@menzel-gfdl This is terrific progress, thanks a lot. From this and your exchanges with the HITRAN people, do I understand correctly that you think HAPI 2 might be a plausible approach to the spectroscopic database?

Can you amplify what you mean by "what format the line-by-line parameters should be returned in"? What choices are available?

I'd encourage you to include @olemke and @riclarsson in the discussion - they will have more to say than I.

menzel-gfdl commented 3 years ago

@RobertPincus Yes, I think HAPI 2 could work for us, but in its current form I think we are limited to the basic 160-character ".par" HITRAN line parameters (and therefore only Doppler, Lorentz, and Voigt line shapes). I believe there is ongoing work which will allow users to get the rest of the parameters, including first-order line-mixing coefficients where HITRAN data exists for them.
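
For reference, the two limiting line shapes can be written down in a few lines. This is a minimal sketch using only the standard library; the function names are mine, and the half-widths are assumed to already be evaluated at the target pressure and temperature. The Voigt shape is the convolution of the two and is omitted here.

```python
import math

BOLTZMANN = 1.380649e-23  # [J K-1]
SPEED_OF_LIGHT = 2.99792458e8  # [m s-1]
AVOGADRO = 6.02214076e23  # [mol-1]


def lorentz(v, v0, gamma):
    """Lorentz line shape [cm], normalized to unit area.

    Args:
        v: Wavenumber [cm-1].
        v0: Line center [cm-1].
        gamma: Pressure-broadened half-width at half-maximum [cm-1].
    """
    return gamma / (math.pi * ((v - v0)**2 + gamma**2))


def doppler_halfwidth(v0, temperature, molar_mass):
    """Doppler half-width at half-maximum [cm-1].

    Args:
        v0: Line center [cm-1].
        temperature: Temperature [K].
        molar_mass: Molar mass [g mol-1].
    """
    mass = molar_mass * 1.e-3 / AVOGADRO  # Mass of one molecule [kg].
    thermal = math.sqrt(2. * math.log(2.) * BOLTZMANN * temperature / mass)  # [m s-1]
    return v0 * thermal / SPEED_OF_LIGHT


def doppler(v, v0, alpha):
    """Doppler (Gaussian) line shape [cm], normalized to unit area.

    Args:
        v: Wavenumber [cm-1].
        v0: Line center [cm-1].
        alpha: Doppler half-width at half-maximum [cm-1].
    """
    norm = math.sqrt(math.log(2.) / math.pi) / alpha
    return norm * math.exp(-math.log(2.) * ((v - v0) / alpha)**2)
```

Both shapes fall to half of their peak value one half-width away from the line center, which is a quick sanity check for any implementation.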

Correct me if I'm wrong, but I believe last time we spoke as a group we agreed to try to separate the database management from the actual computation of the absorption coefficients. I think this is the correct approach and will make supporting multiple absorption coefficient back-ends (e.g., ARTS, GRTcode, and possibly others) a lot more streamlined. By "what format the line-by-line parameters should be returned in", I'm referring to the interface between the database management part of the application and the computational part. For example, the load_line_parameters routine above returns a list of HAPI 2 Transition objects, where each object in the list corresponds to a single spectral line found in the database. We could use this as our interface and require each absorption coefficient back-end library to convert these objects into whatever format they require (thus making HAPI 2 a dependency for all the back-end libraries). Or as another approach, we could instead construct dask/numpy arrays of line parameters ourselves and then pass those to the back-end libraries. @olemke and @riclarsson, I'm curious what you think would work best for ARTS. My hope is that we can come to a consensus. If we can, that should make switching back-ends very easy for the users.
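
To make the second option concrete, here is a minimal sketch of converting per-line objects into one array per parameter. `Line` is a hypothetical stand-in, not the HAPI 2 Transition class, and the standard-library `array` module stands in for numpy/dask so the example has no dependencies.

```python
from array import array
from dataclasses import dataclass, fields


@dataclass
class Line:
    """Hypothetical stand-in for one spectral line (not a HAPI 2 Transition)."""
    nu: float  # Line center [cm-1].
    sw: float  # Line strength.
    elower: float  # Lower state energy [cm-1].
    gamma_air: float  # Air-broadened half-width [cm-1 atm-1].
    n_air: float  # Temperature dependence exponent of gamma_air.


def to_columns(lines):
    """Converts a list of Line objects into one array per parameter.

    Args:
        lines: Iterable of Line objects.

    Returns:
        A dictionary mapping each parameter name to an array of doubles.
    """
    columns = {field.name: array("d") for field in fields(Line)}
    for line in lines:
        for name, column in columns.items():
            column.append(getattr(line, name))
    return columns
```

With a layout like this, the back-ends would only need to agree on the parameter names and array types, and would not need HAPI 2 installed.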

olemke commented 3 years ago

From a technical perspective this looks like a good approach. If the HAPI 2 Transition objects contain at least the same parameters as the HITRAN .par format, PyARTS can then provide a function that transforms them into an ArrayOfAbsorptionLines for the ARTS calculation. It would basically be analogous to the ReadHitran workspace method currently in ARTS, but taking the input from the Transition objects instead of reading from a file. I don't think it is necessary to convert the parameters into dask/numpy arrays as it seems like an additional, unnecessary layer of abstraction.

riclarsson commented 3 years ago

I agree with @olemke, with one reservation from a computational-speed standpoint: it would be very bad if we only get the .par information. ARTS needs the band information for sanity checks. We need a way to fold all lines that are in the array Oliver describes into a single band object (AbsorptionLines). How well is the band information kept with these Transition objects?

menzel-gfdl commented 3 years ago

@olemke and @riclarsson, the basic HITRAN parameters can be extracted from the Transition objects as follows:

lines = load_line_parameters("H2O", numin=0, numax=60000)
for line in lines:
    a = line.parse.a
    d_air = line.parse.delta_air
    gp = line.parse.gp
    gpp = line.parse.gpp
    en = line.elower
    gamma_air = line.parse.gamma_air
    gamma_self = line.parse.gamma_self
    iso = line.parse.local_iso_id
    n_air = line.parse.n_air
    Qp = line.parse.Qp
    qp = line.parse.qp
    Qpp = line.parse.Qpp
    qpp = line.parse.qpp
    s = line.sw
    v = line.nu

@riclarsson, could you please further explain what you mean by "band information"? Is this information you currently get from HITRAN? If so, where do you get it from?
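
To make the question concrete, one possible reading (an assumption on my part, nothing in this thread confirms it) is that a band is the set of lines sharing an isotopologue and the same upper and lower global quanta. Under that assumption, the grouping could be sketched as follows, using plain tuples with hypothetical field names rather than Transition objects:

```python
from collections import defaultdict
from typing import NamedTuple


class Line(NamedTuple):
    """Hypothetical stand-in for one spectral line (not a HAPI 2 Transition)."""
    iso: int  # Local isotopologue id.
    global_upper: str  # Upper-state global quanta.
    global_lower: str  # Lower-state global quanta.
    nu: float  # Line center [cm-1].


def group_into_bands(lines):
    """Groups lines by (isotopologue, upper global quanta, lower global quanta).

    Args:
        lines: Iterable of Line objects.

    Returns:
        A dictionary mapping each band key to its lines, sorted by wavenumber.
    """
    bands = defaultdict(list)
    for line in lines:
        bands[(line.iso, line.global_upper, line.global_lower)].append(line)
    return {key: sorted(value, key=lambda line: line.nu)
            for key, value in bands.items()}
```

If that matches what ARTS means by a band, then the ".par" quanta fields may already be enough to rebuild it; if not, it would help to know which extra fields are needed.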

erwanp commented 2 years ago

@menzel-gfdl For information, how did you get access to HAPI 2?

menzel-gfdl commented 2 years ago

@erwanp It is a private repository, so you will have to email the HITRAN group and ask for access.

RobertPincus commented 1 year ago

@menzel-gfdl Do we want to close this issue given that we've adopted our own database management?

menzel-gfdl commented 1 year ago

@RobertPincus Yes, I am closing it now.

erwanp commented 1 year ago

@RobertPincus @menzel-gfdl Just FYI, we have adopted a common database management API in https://github.com/radis/radis and https://github.com/HajimeKawahara/exojax, which covers HITRAN, HITEMP, and ExoMol, with more to come. Depending on your needs, it might be interesting for you too!