hitranonline / hapi

HITRAN Application Programming Interface (HAPI)

Bug in loading CO2 isotopes labelled as A&B (11&12) #9

Open yukiitohand opened 5 years ago

yukiitohand commented 5 years ago

I think the current version fails to load absorption line data for CO2 isotopes labelled A and B (11 and 12) into the cache. I came up with a simple patch for this that modifies a few lines in the function getRowObjectFromString. My suggested patch touches two locations: [image] and [image]

I uploaded this fix to my forked repo: https://github.com/yukiitohand/hapi/ I am using Python 3.
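The decoding rule behind this patch can be sketched in isolation. This is a minimal illustration, not HAPI code: the helper name `decode_local_iso_id` is hypothetical, and it assumes the one-character isotope field uses digits 1-9 for themselves, '0' for 10, and 'A', 'B' for 11, 12.

```python
def decode_local_iso_id(field):
    """Decode a one-character isotope field into an integer ID.

    Hypothetical helper, not part of HAPI. Assumed mapping:
    '1'..'9' -> 1..9, '0' -> 10, 'A' -> 11, 'B' -> 12, and so on.
    """
    text = field.strip()
    if text.isdecimal():
        value = int(text)
        # '0' is assumed to denote the tenth isotope
        return 10 if value == 0 else value
    if len(text) == 1 and 'A' <= text.upper() <= 'Z':
        # Letters continue the numbering after 10
        return 11 + ord(text.upper()) - ord('A')
    raise ValueError('Unrecognized isotope field: %r' % field)


iso_ids = [decode_local_iso_id(c) for c in ['7', '0', 'A', 'B']]
```

Raising on an unrecognized field, rather than silently returning 0, is a design choice of this sketch; it makes a malformed line visible instead of mapping it to a non-existent isotope.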

yukiitohand commented 5 years ago

This problem seems to be caused by an unexpected operation: fetching extra parameters for CO2 absorption line data. If I understand correctly, no extra parameters for CO2 are stored in the current version of the HITRAN database, so fetching them is unnecessary. When I fetch CO2 absorption line data with extra parameters, say,

fetch_by_ids('CO2_2000t12500',[7,8,9,10,11,12,13,14,121,15,120,122],2000,12500,
             ParameterGroups=['LineMixing','Voigt_self','Voigt_CO2','SDVoigt'])

then it still creates an 'extra' component in the header file, which leads to a call to getRowObjectFromString when the data is loaded into the cache. I confirmed that this problem does not happen (that case is already fixed) when I fetch only the standard parameters specified by 'par_line'. The modification in my previous comment turned out to be insufficient, so I am posting another patch here, although it may not be necessary...

def getRowObjectFromString(input_string,TableName):
    # restore RowObject from string, get formats and names in TableName
    #print 'getRowObjectFromString:'
    pos = 0
    RowObject = []
    for par_name in LOCAL_TABLE_CACHE[TableName]['header']['order']:
        par_format = LOCAL_TABLE_CACHE[TableName]['header']['format'][par_name]
        regex = r'^%([0-9]+)\.?[0-9]*([dfs])$' # legacy pattern, overridden on the next line
        regex = FORMAT_PYTHON_REGEX
        (lng,trail,lngpnt,ty) = re.search(regex,par_format).groups()
        lng = int(lng)
        par_value = input_string[pos:(pos+lng)]
        if ty=='d': # integer value
           # modified fix: only affects 'local_iso_id'
           ###########################################
           if par_name.lower() == 'local_iso_id':
               if par_value.strip().isnumeric():
                   par_value = int(par_value)
                   if par_value == 0:
                       par_value = 10
               else:
                   par_value = 11+ord(par_value)-ord('A')
           else:
               try:
                   par_value = int(par_value)
               except ValueError: # non-integer field, fall back to 0
                   par_value = 0
           ###########################################
        elif ty.lower() in set(['e','f']): # float value
           par_value = float(par_value)
        elif ty=='s': # string value
           pass # don't strip string value
        else:
           print('err1')
           raise Exception('Format \"%s\" is unknown' % par_format)
        RowObject.append((par_name,par_value,par_format))
        pos += lng
    # Do the same but now for extra (comma-separated) parameters
    if 'extra' in set(LOCAL_TABLE_CACHE[TableName]['header']):
        csv_chunks = input_string.split(LOCAL_TABLE_CACHE[TableName]['header'].\
                                        get('extra_separator',','))
        # Disregard the first "column-fixed" container if it presents:
        if LOCAL_TABLE_CACHE[TableName]['header'].get('order',[]):
            pos = 1
        else:
            pos = 0
        for par_name in LOCAL_TABLE_CACHE[TableName]['header']['extra']:
            par_format = LOCAL_TABLE_CACHE[TableName]['header']['extra_format'][par_name]
            regex = r'^%([0-9]+)\.?[0-9]*([dfs])$' # legacy pattern, overridden on the next line
            regex = FORMAT_PYTHON_REGEX
            (lng,trail,lngpnt,ty) = re.search(regex,par_format).groups()
            lng = int(lng)
            par_value = csv_chunks[pos]
            if ty=='d': # integer value
                try:
                    # modified fix: I am not sure this is necessary
                    ###########################################
                    if par_name.lower() == 'local_iso_id':
                       if par_value.strip().isnumeric():
                           par_value = int(par_value)
                           if par_value == 0:
                               par_value = 10
                       else:
                           par_value = 11+ord(par_value)-ord('A')
                    else:
                       par_value = int(par_value)
                    ###########################################
                except (ValueError, TypeError): # int() or ord() failed on this chunk
                    par_value = 0
            elif ty.lower() in set(['e','f']): # float value
                try:
                    par_value = float(par_value)
                except ValueError: # empty or non-numeric chunk
                    par_value = 0.0
            elif ty=='s': # string value
                pass # don't strip string value
            else:
                print('err')
                raise Exception('Format \"%s\" is unknown' % par_format)
            RowObject.append((par_name,par_value,par_format))
            pos += 1   
    return RowObject
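
To see why an 'extra' entry in the header changes how a line is parsed, here is a simplified, self-contained model of the two passes in the function above. It is a sketch under stated assumptions, not HAPI's implementation: `FORMAT_REGEX` is an assumed stand-in for FORMAT_PYTHON_REGEX (the real pattern may differ), `parse_row` is a hypothetical name, the header and sample line are made up for illustration, and values are left as raw strings rather than converted.

```python
import re

# Assumed stand-in for HAPI's FORMAT_PYTHON_REGEX; the real pattern may differ.
FORMAT_REGEX = re.compile(r'^%(\d+)(\.\d+)?([dfsEe])$')


def parse_row(line, header):
    """Split a line into fixed-width fields, then comma-separated extras."""
    row = []
    pos = 0
    # Pass 1: fixed-width fields, widths taken from the format strings
    for name in header['order']:
        fmt = header['format'][name]
        width = int(FORMAT_REGEX.match(fmt).group(1))
        row.append((name, line[pos:pos + width], fmt))
        pos += width
    # Pass 2: runs only when the header declares 'extra' parameters
    if 'extra' in header:
        chunks = line.split(header.get('extra_separator', ','))
        # Chunk 0 holds the fixed-width part when 'order' is non-empty
        start = 1 if header['order'] else 0
        for offset, name in enumerate(header['extra']):
            fmt = header['extra_format'][name]
            row.append((name, chunks[start + offset], fmt))
    return row


# Illustrative header and line (made up, not real HITRAN data)
header = {
    'order': ['molec_id', 'local_iso_id'],
    'format': {'molec_id': '%2d', 'local_iso_id': '%1d'},
    'extra': ['gamma_CO2'],
    'extra_format': {'gamma_CO2': '%6.4f'},
    'extra_separator': ',',
}
rows = parse_row(' 2A,0.0750', header)
```

In this model the isotope field arrives as the raw string 'A' in the fixed-width pass, which is the value a plain int() conversion cannot handle.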