ratal / mdfreader

Read Measurement Data Format (MDF) versions 3.x and 4.x file formats in python

Error reading and writing an MDF4 file #130

Closed parkertomatoes closed 6 years ago

parkertomatoes commented 6 years ago

Python version

3.6.0

Platform information

Windows 10 Enterprise

Numpy version

1.14.0

mdfreader version

2.7.4

Description

Using an MDF 4.10 file which I can't share, made with AVL Concerto, write() fails:

>>> import mdfreader
>>> yop = mdfreader.mdf(r"MY_FILE.mf4")
>>> yop.write('test.mf4')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\H215374\AppData\Local\Programs\Python\Python36-32\lib\site-packages\mdfreader\mdfreader.py", line 432, in write
    self.write4(fileName=fileName)
  File "C:\Users\H215374\AppData\Local\Programs\Python\Python36-32\lib\site-packages\mdfreader\mdf4reader.py", line 1567, in write4
    temp['cn_val_range_min'] = npmin(data)
  File "C:\Users\H215374\AppData\Local\Programs\Python\Python36-32\lib\site-packages\numpy\core\fromnumeric.py", line 2420, in amin
    out=out, **kwargs)
  File "C:\Users\H215374\AppData\Local\Programs\Python\Python36-32\lib\site-packages\numpy\core\_methods.py", line 29, in _amin
    return umr_minimum(a, axis, None, out, keepdims)
TypeError: cannot perform reduce with flexible type

ratal commented 6 years ago

Hi, thanks for this report. As a temporary workaround, you could comment out this part and use 0 instead of the min and max calculation. Weird: it seems you are trying to compute min or max on string data, but normally the issubdtype(data.dtype, numpy_number) check should prevent this. Could you print the data type when this error is thrown?

parkertomatoes commented 6 years ago

On this line:

    temp['cn_val_range_min'] = npmin(data)

"data" is an ndarray with shape (155,) and dtype "<U1"

EDIT: I'm not 100% certain, but I think this is the channel in MDF Validator. The datatype is "STRING_SBC".

ratal commented 6 years ago

I tried the following with Python 3.5.3 and numpy 1.14:

    from numpy import array, issubdtype, number
    temp = array([u'kskdfj', u'jjhj'])
    issubdtype(temp.dtype, number)

The output is False, so I cannot reproduce. Are you sure this is the correct channel? Maybe you can put npmin and npmax in a try and, in the except, print the dtype of data and the channel name, so that we are sure.

parkertomatoes commented 6 years ago

I changed the lines to this:

                    try:
                      temp['cn_val_range_min'] = npmin(data)
                      temp['cn_val_range_max'] = npmax(data)
                    except Exception as e:
                      from numpy import issubdtype, number
                      print(f"DATATYPE: '{data.dtype}'")
                      print(f"ISSUBDTYPE: '{issubdtype(data.dtype, number)}'")
                      raise

And it printed this output before the stack trace:

DATATYPE: '<U1'
ISSUBDTYPE: 'False'

I'm going over your original comment... To clarify I'm not setting this data myself, this is the unmodified data from the MDF4 file I loaded with mdfreader.mdf(r"MY_FILE.mf4"). The error occurs during write().
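For reference, the dtype reported above can be inspected without the original file. This is a minimal sketch that assumes the channel data resembles an array of 155 one-character strings; the sample values are invented, and only the shape and dtype come from the report.

```python
import numpy as np

# Stand-in for the channel data described above: 155 one-character strings.
# The actual contents of the file are unknown; "a" is a placeholder.
data = np.array(["a"] * 155)

print(data.shape)  # (155,)
print(data.dtype)  # '<U1' on little-endian machines

# A unicode string dtype is not a subtype of numpy's numeric hierarchy,
# so a numeric guard based on issubdtype should skip this channel.
is_numeric = np.issubdtype(data.dtype, np.number)
is_text = np.issubdtype(data.dtype, np.str_)
print(is_numeric, is_text)
```

If the guard is evaluated on this dtype, it should report a non-numeric channel, which suggests that the failing code path in 2.7.4 did not apply such a check before calling npmin.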

ratal commented 6 years ago

Sorry, I just realised you are using version 2.7.4 and I am currently on 2.7.5. This section has been improved to:

                    if issubdtype(data.dtype, numpy_number):  # is numeric
                        temp['cn_val_range_min'] = npmin(data)
                        temp['cn_val_range_max'] = npmax(data)
                    else:
                        temp['cn_val_range_min'] = 0
                        temp['cn_val_range_max'] = 0

You can give it a try.
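The numeric guard shown above can be exercised on its own. The sketch below wraps the same idea in a hypothetical helper, value_range, which is not part of mdfreader; it only illustrates how numeric and string arrays take different branches.

```python
import numpy as np


def value_range(data):
    """Return (min, max) for numeric arrays, and (0, 0) otherwise.

    Hypothetical helper for illustration; not an mdfreader function.
    """
    if np.issubdtype(data.dtype, np.number):  # numeric channel
        return np.min(data), np.max(data)
    # String or other non-numeric channel: no meaningful value range.
    return 0, 0


numeric_range = value_range(np.array([3.5, -1.0, 2.0]))
string_range = value_range(np.array(["a", "b", "c"]))
print(numeric_range, string_range)
```

With this shape of check, a STRING_SBC or STRING_UTF8 channel never reaches the min or max reduction.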

parkertomatoes commented 6 years ago

OK, I uninstalled the pip version and ran setup.py from the main branch of the GitHub repo.

It appears to get past that error (progress!), but now I get a new one:

Traceback (most recent call last):
  File "write.py", line 3, in <module>
    yop.write('test.mf4')
  File "C:\Users\H215374\AppData\Local\Programs\Python\Python36-32\lib\site-packages\mdfreader-2.7.5-py3.6-win32.egg\mdfreader\mdfreader.py", line 431, in write
    self.write4(fileName=fileName)
  File "C:\Users\H215374\AppData\Local\Programs\Python\Python36-32\lib\site-packages\mdfreader-2.7.5-py3.6-win32.egg\mdfreader\mdf4reader.py", line 1665, in write4
    blocks[nchannel]['CN'] = 0  # last CN link is null
KeyError: 1207

When I debug it, blocks has a mixture of string keys and numerical keys. The numerical keys go up to 1206.

ratal commented 6 years ago

This is a bug. Your last channel might be an invalid bit channel, or it might contain None or be empty. Maybe the following could fix it: blocks[next(reversed(blocks))]['CN'] = 0

parkertomatoes commented 6 years ago

Almost! Now I get this:

Traceback (most recent call last):
  File "write.py", line 3, in <module>
    yop.write('test.mf4')
  File "C:\Users\H215374\AppData\Local\Programs\Python\Python36-32\lib\site-packages\mdfreader-2.7.5-py3.6-win32.egg\mdfreader\mdfreader.py", line 431, in write
    self.write4(fileName=fileName)
  File "C:\Users\H215374\AppData\Local\Programs\Python\Python36-32\lib\site-packages\mdfreader-2.7.5-py3.6-win32.egg\mdfreader\mdf4reader.py", line 1679, in write4
    block.write(fid)
  File "C:\Users\H215374\AppData\Local\Programs\Python\Python36-32\lib\site-packages\mdfreader-2.7.5-py3.6-win32.egg\mdfreader\mdfinfo4.py", line 722, in write
    self['CN'], 0, self['TX'], 0, 0, 0, self['Unit'], self['Comment'],
KeyError: 'CN'

Footnote: I think 1207 is missing because of how the blocks are created. Above the code is a big loop which enumerates over the channels. Inside the loop is this:

                    blocks[nchannel] = CNBlock()

After the loop, nchannel is incremented one more time, so blocks[nchannel] should always result in a KeyError.
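The off-by-one described in this footnote can be reproduced with a plain dictionary. This is a simplified sketch, assuming only that the counter ends one past the last stored key; the channel names and the dict standing in for CNBlock() are invented for illustration.

```python
# Hypothetical stand-in for the real loop in write4; only the names
# `blocks` and `nchannel` are taken from the discussion above.
channels = ["speed", "torque", "status"]
blocks = {}

nchannel = 0
for name in channels:
    blocks[nchannel] = {"name": name}  # stands in for CNBlock()
    nchannel += 1                      # counter moves past the stored key

# nchannel is now 3, but the stored keys are 0, 1 and 2.
try:
    blocks[nchannel]["CN"] = 0
    raised = False
except KeyError:
    raised = True

# Indexing with nchannel - 1 reaches the last block actually created.
blocks[nchannel - 1]["CN"] = 0
print(raised, sorted(blocks))
```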

ratal commented 6 years ago

You could try putting blocks[nchannel]['CN'] = 0 # initialise 'CN' key after if CN_flag: in mdf4reader.write4, just in case, but I have difficulty understanding what is happening. Can you again wrap block.write(fid) in a try: and print the faulty block in the except?
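One way to capture the faulty block, as requested above, is to wrap the write call and dump the block before re-raising. The sketch below uses a hypothetical FakeBlock class and a list in place of the file handle; neither reflects mdfreader's real classes.

```python
from pprint import pprint


class FakeBlock(dict):
    """Hypothetical stand-in for a CN block; not mdfreader's class."""

    def write(self, fid):
        # Mimics a writer that requires the 'CN' link to be present.
        fid.append((self["CN"], self["TX"]))


def write_with_diagnostics(block, fid):
    """Write a block; on KeyError, print its contents and re-raise."""
    try:
        block.write(fid)
    except KeyError:
        pprint(dict(block), indent=4)
        raise
```

Re-raising after printing keeps the original traceback, so the dump and the failing line can be read together.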

parkertomatoes commented 6 years ago

Here's what it printed

{   'Comment': 340152,
    'TX': 340088,
    'Unit': 340120,
    'block_start': 339928,
    'cn_bit_count': 32,
    'cn_bit_offset': 0,
    'cn_byte_offset': 6422,
    'cn_data_type': 4,
    'cn_flags': 16,
    'cn_sync_type': 0,
    'cn_type': 0,
    'cn_val_range_max': <private>,
    'cn_val_range_min': <private>}

parkertomatoes commented 6 years ago

The block's key was 1206, which was the highest numerical key. I believe the "last" block using the iterator was a string.

I tried replacing the "set CN for the last block to 0" to this:

            blocks[nchannel-1]['CN'] = 0

And that appears to get past the error. The resulting MDF file is a little bigger than the original (1.3 MB vs 0.8 MB), but MDF Validator didn't find any serious issues. The warnings it did report were for channels that had a STRING_UTF8 datatype and a VAL_LIMIT_OK flag.

ratal commented 6 years ago

Hi, I added a variable last_channel that should avoid your issue more generally. Can you check it with the last commit in master?
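The actual commit is not shown in this thread, so the following is only a guess at the idea behind last_channel: remember the key of the last channel block written instead of deriving it from the counter or from the dictionary order. The block contents and the "header" key are invented for illustration.

```python
from collections import OrderedDict

# Hypothetical block table mixing integer channel keys and string keys,
# as observed while debugging above.
blocks = OrderedDict()
last_channel = None

for nchannel, name in enumerate(["speed", "torque"]):
    blocks[nchannel] = {"name": name}
    last_channel = nchannel           # remember the last channel key written
blocks["header"] = {"name": "meta"}   # a non-channel entry added afterwards

# The most recently inserted key is the string one, not a channel, which
# is why next(reversed(blocks)) can select the wrong block.
newest_key = next(reversed(blocks))

if last_channel is not None:
    blocks[last_channel]["CN"] = 0    # last CN link is null
print(newest_key, last_channel)
```

Tracking the key explicitly avoids both the KeyError from the over-incremented counter and the dependence on insertion order.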

ratal commented 6 years ago

Two months inactive; I guess it is fixed, so I am closing. If the issue is still active, please reopen.