ratal / mdfreader

Read Measurement Data Format (MDF) versions 3.x and 4.x file formats in python

Data conversion inconsistent #204

Open max3-2 opened 1 year ago

max3-2 commented 1 year ago

Python version

'3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]'

Platform information

'Windows-10-10.0.19044-SP0'

Numpy version

'1.24.1'

mdfreader version

'4.1'

Description

Loading a CAN file and converting its channels fails when the linear conversion (type 1) is not specified at the top level of the channel's conversion but in a nested (sub-?) block. The file is a CAN log in MDF version 4.1. Other tools, which are closed source (e.g. MATLAB), read and convert the data correctly.

Example data is as follows:

File info

file_info['ID']

{'id_file': b'MDF     ',
 'id_vers': b'4.10    ',
 'id_prog': b'MDF4Lib\x00',
 'id_ver': 410,
 'id_unfi_flags': 0,
 'id_custom_unfi_flags': 0}

The data is loaded without conversion here only to show the intermediate state; the final result is the same whether conversion is applied on load or afterwards. Some fields are replaced with xxx to blank out information. For a working signal called ACC, the data looks like this:

file_data['ACC']

{'unit': 'm/s²',
 'description': '',
 'master': 't_274_274',
 'masterType': 0,
 'data': array([32507, ..., 32342], dtype=uint16),
 'conversion': {'type': 1, 'parameters': {'cc_val': (-65.0, 0.002)}},
 'axis': [],
 'id': ((274, 0, 64),
  ('xxx', 'xxx', 'xxx'),
  ('xxx', 'xxx', 'xxx'))}

After converting ACC, the result is correct:

{'unit': 'm/s²',
 'description': '',
 'master': 't_274_274',
 'masterType': 0,
 'data': array([ 0.014, ..., -0.316]),
 'axis': [],
 'id': ((274, 0, 64),
  ('xxx', 'xxx', 'xxx'),
  ('xxx', 'xxx', 'xxx'))}
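
For reference, the converted values are consistent with a linear conversion of the form physical = cc_val[0] + cc_val[1] * raw. The following is a minimal NumPy sketch using only the two raw samples shown above. The function name linear_conversion is illustrative and is not part of mdfreader.

```python
import numpy as np


def linear_conversion(raw, cc_val):
    """Apply a type-1 (linear) conversion: phys = offset + factor * raw."""
    offset, factor = cc_val
    return offset + factor * np.asarray(raw, dtype=np.float64)


# The two raw ACC samples visible in the issue, with the channel's cc_val.
raw_acc = np.array([32507, 32342], dtype=np.uint16)
phys_acc = linear_conversion(raw_acc, (-65.0, 0.002))
print(np.round(phys_acc, 3))
```

Rounded to three decimals, this matches the first and last values of the converted ACC array above.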

For a non-working signal called temp:

file_data['temp']

{'unit': '°C',
 'description': '',
 'master': 't_222_222',
 'masterType': 0,
 'data': array([820, ..., 960], dtype=uint16),
 'conversion': {'type': 7,
  'parameters': {'cc_val': (16381.0, 16382.0, 16383.0),
   'cc_ref': ['Reserviert_nicht_verfuegbar',
    'Reserviert_Fehler',
    'Signal_unbefuellt',
    {'pointer': 579024,
     'id': b'##CC',
     'length': 96,
     'link_count': 4,
     'cc_tx_name': 0,
     'cc_md_unit': 578992,
     'cc_md_comment': 0,
     'cc_cc_inverse': 0,
     'cc_type': 1,
     'cc_precision': 0,
     'cc_flags': 0,
     'cc_ref_count': 0,
     'cc_val_count': 2,
     'cc_phy_range_min': 0.0,
     'cc_phy_range_max': 0.0,
     'cc_val': (-100.0, 0.1),
     'unit': {'id': b'##TX',
      'length': 32,
      'link_count': 0,
      'Comment': '°C'}}]}},
 'axis': [],
 'id': ((222, 0, 368),
  ('xxx', 'xxx', 'xxx'),
  ('xxx', 'xxx', 'xxx'))}

After converting temp, the conversion entry is consumed but the data is unchanged. The top-level conversion type is 7, but the conversion that other software actually applies is type 1, and its parameters are specified in the nested block:

{'unit': '°C',
 'description': '',
 'master': 't_222_222',
 'masterType': 0,
 'data': array([820, ..., 960], dtype=uint16),
 'axis': [],
 'id': ((222, 0, 368),
  ('xxx', 'xxx', 'xxx'),
  ('xxx', 'xxx', 'xxx'))}

ratal commented 1 year ago

Hi, there could be a bug in the function _value_to_text_conversion() in mdf4reader.py. The default conversion, which corresponds to the last reference in cc_ref, is a linear conversion, but it does not seem to be applied.
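
To illustrate the expected behaviour, here is a rough sketch of a value-to-text lookup whose default (the last cc_ref entry) is a nested linear conversion. It is not the mdfreader implementation. The function name and the assumption that cc_ref holds len(cc_val) + 1 entries are taken from the temp example above, not from the library.

```python
import numpy as np


def value_to_text_with_default(raw, cc_val, cc_ref):
    """Map raw values listed in cc_val to text; convert the rest with the default.

    Assumes the last cc_ref entry is the default: either a nested linear
    conversion dict (cc_type 1) or a plain text.
    """
    raw = np.asarray(raw)
    default = cc_ref[-1]
    if isinstance(default, dict) and default.get('cc_type') == 1:
        offset, factor = default['cc_val']
        out = (offset + factor * raw.astype(np.float64)).astype(object)
    else:
        # Plain default: every unmatched sample gets the same value.
        out = np.full(raw.shape, default, dtype=object)
    # Overwrite the samples that match one of the listed key values.
    for key, text in zip(cc_val, cc_ref[:-1]):
        out[raw == key] = text
    return out


cc_val = (16381.0, 16382.0, 16383.0)
cc_ref = ['Reserviert_nicht_verfuegbar', 'Reserviert_Fehler',
          'Signal_unbefuellt', {'cc_type': 1, 'cc_val': (-100.0, 0.1)}]
result = value_to_text_with_default(
    np.array([820, 16382, 960], dtype=np.uint16), cc_val, cc_ref)
print(result)
```

With this logic the raw samples 820 and 960 become roughly -18 and -4 °C, while 16382 maps to its text. The result is an object array because it mixes numbers and strings.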

max3-2 commented 1 year ago

Yes, it consumes the fairly complex conversion setting without descending into the nested dict. I do not know whether this is common in MDF or very specific to my files, but I think it would be better if a conversion that cannot be applied were kept with the dataset instead of being consumed.

ratal commented 1 year ago

Could you try with mdf4.convertTables = True? (First initialise without a file name as parameter, change convertTables from its default False to True, and then read the file with .read(file_name).)

max3-2 commented 1 year ago

Sure. I'm using this:

import mdfreader
mdf4 = mdfreader.Mdf()
mdf4.convertTables
    Out[3]: False

mdf4.convertTables = True

data = mdf4.read(file)

Traceback (most recent call last):

  File "C:\Users\...\AppData\Local\Temp\ipykernel_20840\4165782594.py", line 1, in <cell line: 1>
    data = mdf4.read(f1)

  File "C:\Users\...\Documents\pyvenvs\base_dev\lib\site-packages\mdfreader\mdfreader.py", line 419, in read
    self.read4(self.fileName, None, multi_processed, channel_list,

  File "C:\Users\...\Documents\pyvenvs\base_dev\lib\site-packages\mdfreader\mdf4reader.py", line 1538, in read4
    self._convert_all_channel4()

  File "C:\Users\...\Documents\pyvenvs\base_dev\lib\site-packages\mdfreader\mdf4reader.py", line 1661, in _convert_all_channel4
    [self._convert_channel4(channelName) for channelName in self]

  File "C:\Users\...\Documents\pyvenvs\base_dev\lib\site-packages\mdfreader\mdf4reader.py", line 1661, in <listcomp>
    [self._convert_channel4(channelName) for channelName in self]

  File "C:\Users\...\Documents\pyvenvs\base_dev\lib\site-packages\mdfreader\mdf4reader.py", line 1649, in _convert_channel4
    self.set_channel_data(channel_name, self._get_channel_data4(channel_name))

  File "C:\Users\...\Documents\pyvenvs\base_dev\lib\site-packages\mdfreader\mdf4reader.py", line 1567, in _get_channel_data4
    return self._convert_channel_data4(self.get_channel(channel_name), channel_name,

  File "C:\Users\....\Documents\pyvenvs\base_dev\lib\site-packages\mdfreader\mdf4reader.py", line 1621, in _convert_channel_data4
    vector = _value_to_text_conversion(vector, conversion_parameter['cc_val'],

  File "C:\Users\...\Documents\pyvenvs\base_dev\lib\site-packages\mdfreader\mdf4reader.py", line 2437, in _value_to_text_conversion
    maxlen = max([len(ref) for ref in cc_ref])

  File "C:\Users\...\Documents\pyvenvs\base_dev\lib\site-packages\mdfreader\mdf4reader.py", line 2437, in <listcomp>
    maxlen = max([len(ref) for ref in cc_ref])

TypeError: object of type 'int' has no len()

This exception points to the nested dicts discussed above. By the way, I think #205 is the same issue. It might be due to changes in MDF, or simply a case where these nested structures are not covered. I do not have access to the MDF data structure description, nor the time (sorry) to fully understand the file structure. Currently I am working around this with the following code, which resolves the nested dict. It works fairly efficiently for my use cases:

import logging

import mdfreader

logger = logging.getLogger(__name__)


def read_and_fix(file, convert=True):
    """Read an MDF file and promote nested default conversions before converting."""
    # Load raw data so that the conversion metadata is still attached.
    file_data = mdfreader.Mdf(file, convert_after_read=False)
    file_info = mdfreader.MdfInfo(file)

    # Fix conversions
    for ch in file_info['allChannelList']:
        if file_data[ch]['conversion']['type'] == 7:
            # The last cc_ref entry is the default conversion.
            true_conv = file_data[ch]['conversion']['parameters']['cc_ref'][-1]
            if isinstance(true_conv, dict):
                logger.debug(f'Fixing channel {ch:s}')
                conv_type = true_conv['cc_type']
                conv_params = true_conv['cc_val']
                # Replace the type-7 conversion with the nested one
                file_data[ch]['conversion']['type'] = conv_type
                file_data[ch]['conversion']['parameters'] = {'cc_val': conv_params}
    if convert:
        file_data.convert_all_channels()

    return file_data, file_info
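
The rewrite step of this workaround can be tried without an MDF file, on a plain dict shaped like the conversion entry of temp shown earlier. This is only a sketch: flatten_default_conversion is a hypothetical helper name, and the dict keeps just the fields the workaround reads.

```python
def flatten_default_conversion(conversion):
    """Return the nested default conversion of a type-7 entry, if there is one.

    Entries that are not type 7, or whose last cc_ref item is not a dict,
    are returned unchanged.
    """
    if conversion.get('type') != 7:
        return conversion
    default = conversion['parameters']['cc_ref'][-1]
    if not isinstance(default, dict):
        return conversion
    return {'type': default['cc_type'],
            'parameters': {'cc_val': default['cc_val']}}


# Reduced copy of the 'conversion' entry of the temp channel.
temp_conversion = {
    'type': 7,
    'parameters': {
        'cc_val': (16381.0, 16382.0, 16383.0),
        'cc_ref': ['Reserviert_nicht_verfuegbar', 'Reserviert_Fehler',
                   'Signal_unbefuellt',
                   {'cc_type': 1, 'cc_val': (-100.0, 0.1)}],
    },
}
flattened = flatten_default_conversion(temp_conversion)
print(flattened)
```

One consequence of this approach is that the text entries for 16381, 16382 and 16383 are dropped, so those raw values would be scaled linearly like any other sample instead of being reported as text.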

seanLiu716 commented 1 year ago

Cool. The fix solved my issue