equinor / dlisio

Python library for working with the well log formats Digital Log Interchange Standard (DLIS V1) and Log Information Standard (LIS79)
https://dlisio.readthedocs.io/en/latest/

Extracting metadata (parameter) information from LIS files. #404

Closed AchyuthB closed 2 years ago

AchyuthB commented 2 years ago

Hi, for DLIS files I was able to extract the parameter information in a structured format quite easily with logical_file.parameters. For LIS, it looks like the parameter information is extracted via dlisio.lis.InformationRecord(). But what I get back is unstructured data that I cannot interpret or process further. Is there an easy way to extract the metadata from LIS files? It would be much easier if I could get the output as a structured table or as key/value pairs.

Sample code

def extract_information_record_data(logical_file):
    for information_record in logical_file.wellsite_data():
        print("wellsite_data --- " + information_record.table_name() + " ----")
        print(information_record.table(simple=True))

    for information_record in logical_file.tool_string_info():
        print("tool_string_info --- " + information_record.table_name() + " ----")
        print(information_record.table(simple=True))

    for information_record in logical_file.job_identification():
        print("job_identification --- " + information_record.table_name() + " ----")
        print(information_record.table(simple=True))

Output: I cannot figure out how to format this data into, say, Name, Description and Unit.

wellsite_data --- TFMT ----
[('CONS', '4   ', '4   ', '4   ', '80  ', None, None, None, None, None, None)
 ('BIT ', '4   ', '4   ', '8   ', '4   ', '8   ', '8   ', '4   ', None, None, None)
 ('CSG ', '4   ', '8   ', '4   ', '8   ', '4   ', '8   ', '8   ', '8   ', '4   ', None)
 ('CEMD', '4   ', '12  ', '8   ', '8   ', '8   ', '8   ', '12  ', '8   ', '8   ', '16  ')
 ('PERF', '4   ', '20  ', '12  ', '8   ', '8   ', '8   ', '8   ', '8   ', None, None)
 ('TINF', '4   ', '4   ', '12  ', '20  ', '20  ', '20  ', None, None, None, None)]
wellsite_data --- CONS ----
[('HIDE', '  ', '  ', 'MICRO LATEROLOG ')
 ('HID1', None, None, 'DUAL LATEROLOG')
 ('HID2', None, None, 'DIGITAL ARRAY ACOUSTILOG')
 ('HID3', None, None, 'GAMMA RAY LOG ')
 ('HID4', None, None, 'CALIPER LOG ')
 ...
 [remaining data truncated by me]
]
wellsite_data --- BIT  ----
[(1, 1, 17, 'IN', 5157, 6776, 'FT')]
wellsite_data --- CSG  ----
[(1, 20, 'IN', 84, 'LB/F', '  ', 0, 5157, 'FT')]
wellsite_data --- CEMD ----
[]
wellsite_data --- PERF ----
[]
wellsite_data --- TINF ----
[(1, 1, 'SWIVEL', '3944XB', '066789', '- ')
 (1, 1, 'TTRm', '3981XA', 180398, '- ')
 (1, 1, 'COM. REMOTE ', '3510XA', 179238, '- ')
 (1, 1, 'DSL ', '1329XA', 116565, 'DECENTRALIZED ')
 ...
 [remaining data truncated by me]
]
wellsite_data --- DIM  ----
[('1   ', 'MGN ', '1   ', '8   ', '12  ', 'us      ', '-9999   ', '-9999   ')
 ('2   ', 'MST ', '1   ', '8   ', '12  ', 'us      ', '-9999   ', '-9999   ')
 ('3   ', 'MWV ', '1   ', '500 ', '12  ', 'us      ', '-9999   ', '-9999   ')
 ('4   ', 'MWV ', '2   ', '8   ', '1   ', '        ', '-9999   ', '-9999   ')]
wellsite_data --- CIFO ----
[('1   ', 'DT  ', '    ', 'DT24QI  ', -83.0, 'dt24', 'i ')
 ('2   ', 'S021', '    ', 'SFA21QI ', -86.0, 'sfa21 ', 'i ')
 ('3   ', 'S022', '    ', 'SFA22QI ', -86.5, 'sfa22 ', 'i ')
  ...
 [remaining data truncated by me]
]
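
One way to get key/value pairs out of a table like the ones above is to pair each row with the column names. This is a minimal sketch, not dlisio API: `rows_to_dicts` is a hypothetical helper, and the column names `MNEM`, `PUNI`, `TUNI` and `VALU` are assumed for illustration, since the printed output does not show them. The two rows are copied from the CONS table above.

```python
def rows_to_dicts(column_names, rows):
    """Pair each cell with its column name, stripping LIS space padding."""
    def clean(cell):
        # Only strings carry padding; numbers and None pass through unchanged.
        return cell.strip() if isinstance(cell, str) else cell

    return [
        {name: clean(cell) for name, cell in zip(column_names, row)}
        for row in rows
    ]


# Assumed column names (hypothetical); rows copied from the CONS output.
column_names = ('MNEM', 'PUNI', 'TUNI', 'VALU')
rows = [
    ('HIDE', '  ', '  ', 'MICRO LATEROLOG '),
    ('HID1', None, None, 'DUAL LATEROLOG'),
]

records = rows_to_dicts(column_names, rows)
for record in records:
    print(record)
```

With a real record, the names would presumably come from the table itself (for example `table.dtype.names`, assuming `table()` returns a NumPy structured array), but that assumption is untested in this sketch.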

ErlendHaa commented 2 years ago

I think you want the components as-is, not formatted as a table. They are far more primitive than DLIS parameters, though.

AchyuthB commented 2 years ago

Hi @ErlendHaa, extracting details from the components returned 6800 rows for each logical file, most of which look like repeated data to me. The mnemonics 'PUNI' and 'TUNI' are repeated around 750 times. Is there any documentation explaining this structure?

image

Code

import pandas as pd

def extract_information_record_data(logical_file_index, logical_file):
    """Return one DataFrame row per component block in the logical file."""
    records = [
        *logical_file.wellsite_data(),
        *logical_file.tool_string_info(),
        *logical_file.job_identification(),
    ]

    rows = []
    for information_record in records:
        # len(rows) continues the running index across records
        rows.extend(extract_data(logical_file_index, information_record, len(rows)))

    # Build the frame once; DataFrame.append was removed in pandas 2.0
    return pd.DataFrame(rows)

def extract_data(logical_file_index, information_record, start_index=0):
    """Flatten one information record into a list of per-component dicts."""
    rows = []
    for offset, component in enumerate(information_record.components()):
        rows.append({
            'LOGICAL_FILE_INDEX': logical_file_index,
            # Running counter over component blocks, not over records
            'INFORMATION_RECORD_INDEX': start_index + offset,
            'MNEMONIC': component.mnemonic,
            'VALUE': component.component,
            'UNITS': component.units,
            'COMPONENT_TYPE': component.type_nb,
            'COMPONENT_SIZE': component.size,
        })
    return rows

ErlendHaa commented 2 years ago

Hi,

When the information record is structured (tm), the purpose of the mnemonic field is to indicate which column of the structured table the component block belongs to, hence the repeated mnemonics. This whole structuring of components is a bit weird, and the specification doesn't offer any good explanation of its semantic meaning. If you want a closer look, it's defined in chapter 3.3.1.2 of the spec.
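
To make the repetition concrete, here is a small sketch that groups component values by mnemonic, so each mnemonic becomes one column. `Block` is a stand-in for a component block with only the `mnemonic` and `component` attributes, `group_by_column` is a hypothetical helper, and the block values are made up for illustration.

```python
from collections import defaultdict, namedtuple

# Stand-in for a component block; only the two attributes used below.
Block = namedtuple('Block', ['mnemonic', 'component'])


def group_by_column(blocks):
    """Collect component values under the column their mnemonic names."""
    columns = defaultdict(list)
    for block in blocks:
        columns[block.mnemonic.strip()].append(block.component)
    return dict(columns)


# Hypothetical blocks for a two-row table with columns MNEM, PUNI and TUNI.
blocks = [
    Block('MNEM', 'AAA'), Block('PUNI', 'IN'), Block('TUNI', 'IN'),
    Block('MNEM', 'BBB'), Block('PUNI', 'FT'), Block('TUNI', 'FT'),
]

columns = group_by_column(blocks)
print({name: len(values) for name, values in columns.items()})
```

Each mnemonic shows up once per table row that carries that column, which would explain why a table with hundreds of rows repeats mnemonics such as 'PUNI' and 'TUNI' hundreds of times.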

AchyuthB commented 2 years ago

Hi @ErlendHaa, I read through chapter 3.3.1.2, and I think I now understand how the data is structured.

image

Thanks Again, Achyuth

ErlendHaa commented 2 years ago

That's right. This is also a good illustration, as it shows how multiple Component Blocks are combined into one row in the table. And I guess one row is sort of analogous to a DLIS parameter.

We might want to update the documentation on this a bit. Maybe even add some support features that extract each "row" as an object (dict?):

param = {
    'MNEM' : 'FBSTB',
    'STAT' : 'ALLO',
    'HEIG' : 0,
    'VOLU' : 2.125,
    'WEIG' : 468.1,
    'PRES' : 0,
    'TEMP' : 350,
}
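
A rough sketch of what such a helper could look like, under stated assumptions: `blocks_to_rows` is hypothetical and not part of dlisio, `Block` is a stand-in with only the `mnemonic` and `component` attributes, and a new row is assumed to begin whenever the first column's mnemonic reappears. The first row reuses the values from the dict above; the second row is made up.

```python
from collections import namedtuple

# Stand-in for a component block; only the two attributes used below.
Block = namedtuple('Block', ['mnemonic', 'component'])


def blocks_to_rows(blocks):
    """Group a flat sequence of component blocks into one dict per row."""
    rows = []
    first_column = None
    for block in blocks:
        name = block.mnemonic.strip()
        if first_column is None:
            first_column = name      # the first block defines the row start
        if name == first_column:
            rows.append({})          # first column seen again: new row
        rows[-1][name] = block.component
    return rows


blocks = [
    Block('MNEM', 'FBSTB'), Block('STAT', 'ALLO'), Block('HEIG', 0),
    Block('VOLU', 2.125), Block('WEIG', 468.1), Block('PRES', 0),
    Block('TEMP', 350),
    # Hypothetical second row with some columns missing
    Block('MNEM', 'OTHER'), Block('STAT', 'ALLO'),
]

params = blocks_to_rows(blocks)
print(params[0])
```

If a record's first block is a table header rather than a cell, it would need to be dropped before calling the helper. The block type number (`type_nb`) may be a more robust row delimiter than the mnemonic, but this sketch does not rely on it.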

AchyuthB commented 2 years ago

To me it looks like LIS has more detailed parameter information than DLIS. For DLIS, every logical file has hundreds of parameters, and the parameter values are arrays (screenshot below). For LIS it feels more specific, for example per block and sub-block (though I don't really understand what a block is).

image

AchyuthB commented 2 years ago

Hi,

This can be closed. I was able to successfully extract the information record data.

-Achyuth