nzhagen / jcamp

A set of Python utilities for reading JCAMP-DX files.
MIT License

Issue with values containing "E" (scientific notation) #15

Closed mbtns closed 4 years ago

mbtns commented 4 years ago

Hi,

For Raman spectral files containing values such as 1.071025E7, the package returns two values: 1.030809 and 57.0. Is there a workaround or fix for this?

Thanks in advance.

Wkr Maarten
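
One plausible explanation for the stray 57.0, offered as an assumption rather than a confirmed diagnosis: in the JCAMP-DX ASDF compression scheme the letter E is a "squeezed" (SQZ) digit meaning +5, and each SQZ character starts a new number. A reader that treats data lines as ASDF would therefore see E7 as a separate value, 57. The sketch below is not the jcamp package's code; `split_sqz` and `SQZ_DIGITS` are hypothetical names used only to illustrate the effect.

```python
# SQZ ("squeezed") digits from the JCAMP-DX ASDF compression scheme:
# '@' = 0, 'A'..'I' = +1..+9, 'a'..'i' = -1..-9. Each one starts a new value.
SQZ_DIGITS = {"@": "0"}
SQZ_DIGITS.update({chr(ord("A") + i): str(i + 1) for i in range(9)})
SQZ_DIGITS.update({chr(ord("a") + i): "-" + str(i + 1) for i in range(9)})


def split_sqz(token):
    """Hypothetical helper: split a token the way an SQZ-aware reader might."""
    values = []
    current = ""
    for char in token:
        if char in SQZ_DIGITS:
            # An SQZ character closes the previous value and opens a new one
            if current:
                values.append(float(current))
            current = SQZ_DIGITS[char]
        else:
            current += char
    if current:
        values.append(float(current))
    return values


print(split_sqz("1.071025E7"))  # prints [1.071025, 57.0]
```

If this is what happens, any pre-processing step that removes the exponent letter from data lines before parsing should avoid the split.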

mbtns commented 4 years ago

I wrote some code to pre-process the data and remove the scientific notation before passing it to the jcamp module:

import pathlib


def undo_science_float(line):
    """Rewrite numbers in scientific notation (e.g. 1.071025E7) as plain floats."""
    split_line = line.split(" ")
    # Enumerate so that repeated values are each replaced at their own position
    for index, item in enumerate(split_line):
        if 'e' in item.lower():
            # float() parses scientific notation directly
            split_line[index] = str(float(item))
    return " ".join(split_line)


def preprocess_ondax_jcamp(data_folder, file_to_open):
    """
    Replace scientific notation in the data lines of a JCAMP file and
    return the path of the cleaned file that was written.
    """
    filename = pathlib.Path(data_folder) / file_to_open

    with open(filename) as f:
        lines = [line.rstrip() for line in f]
    for index, line in enumerate(lines):
        # Only touch data lines, which start with a digit or a minus sign
        if 'e' in line.lower() and (line[:1].isdigit() or line[:1] == '-'):
            lines[index] = undo_science_float(line)
    # Store the intermediate file under the same name in the interim folder
    interim_folder = pathlib.Path('./data/interim')
    interim_filename = interim_folder / file_to_open
    with open(interim_filename, 'w') as f:
        f.write('\n'.join(lines))
    return interim_filename
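
One caveat with converting through `float` and `str`: Python still prints very large or very small floats in exponent form (for example `1e+16`), so the letter can reappear. A possible variant, sketched here under the assumption that values on a data line are plain decimal numbers, uses a regular expression together with `decimal.Decimal` so the output is always fixed-point. `SCI_NUMBER` and `expand_scientific` are hypothetical names, not part of the jcamp package.

```python
import re
from decimal import Decimal

# Matches a number with an exponent part, such as 1.071025E7 or -2.5e-3
SCI_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)[eE][-+]?\d+")


def expand_scientific(line):
    """Hypothetical helper: rewrite exponent-form numbers in fixed-point form."""
    def to_fixed(match):
        # Decimal keeps the written digits exactly, and the "f" format
        # never falls back to exponent notation
        return format(Decimal(match.group(0)), "f")
    return SCI_NUMBER.sub(to_fixed, line)


print(expand_scientific("100.5 1.071025E7 -2.5e-3"))
# prints 100.5 10710250 -0.0025
```

Because the substitution works on the matched text only, separators and values without an exponent are left exactly as they were.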