ratal / mdfreader

Read Measurement Data Format (MDF) versions 3.x and 4.x file formats in python

ValueError if channel name contains '.' character #109

Closed ginseil closed 6 years ago

ginseil commented 6 years ago

Hi

I get the following error when I call mdf.convertToPandas() on an MDF 3 file that has channels with a '.' character in their names:

File "..\format\mdf\converter.py", line 162, in _load_mdf_data
    mdf.convertToPandas()
File "..\lib\site-packages\mdfreader-2.7.1-py3.6-win-amd64.egg\mdfreader\mdfreader.py", line 1289, in convertToPandas
    time = datetimeInfo + array(self.getChannelData(group) * 1E6,
File "..\lib\site-packages\mdfreader-2.7.1-py3.6-win-amd64.egg\mdfreader\mdfreader.py", line 460, in getChannelData
    return self._getChannelData3(channelName)
File "..\lib\site-packages\mdfreader-2.7.1-py3.6-win-amd64.egg\mdfreader\mdf3reader.py", line 949, in _getChannelData3
    self.read3(fileName=None, info=self.info, channelList=[channelName], convertAfterRead=False)
File "..\lib\site-packages\mdfreader-2.7.1-py3.6-win-amd64.egg\mdfreader\mdf3reader.py", line 895, in read3
    temp = buf[recordID]['data'][recordName]
File "..\lib\site-packages\numpy\core\records.py", line 499, in __getitem__
    obj = super(recarray, self).__getitem__(indx)
ValueError: no field of name FR1_Motor_20_Ch_A_MO_Fahrpedalrohwert_01

Note: FR1_Motor_20_Ch_A.MO_Fahrpedalrohwert_01 is the real name of the affected channel. FR1_Motor_20_Ch_A_MO_Fahrpedalrohwert_01 is used as 'recordName' in mdf3reader.py at line 895.

=> '.' has been replaced with '_'

ratal commented 6 years ago

To clarify, are you reading with the argument noDataLoading=True? If so, what happens if you read with noDataLoading=False? For info, recarray cannot use the character '.' in an identifier, so recordName is the original name cleaned up by _convertName in mdf.py. I guess this channel is a time channel. To narrow it down more easily: is it unsorted data? buf[recordID]['data'] should have record names generated from channel.recAttributeName, so for the moment I do not understand what is wrong. Can you print the recarray buf[recordID]['data'] to see which name is expected, or whether it is missing?
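
The failure mode discussed here can be reproduced in isolation. The sketch below is only an illustration under stated assumptions: `sanitize_name` is a hypothetical stand-in for the name cleanup (it is not mdfreader's actual `_convertName`), the channel name is made up, and the record array is assumed to have been built with the original dotted name while the lookup uses the cleaned one.

```python
import numpy as np


def sanitize_name(name):
    """Hypothetical cleanup: replace '.' with '_' (not mdfreader's real code)."""
    return name.replace('.', '_')


original = 'Group_A.Signal_01'  # made-up channel name containing a dot

# Record array whose field carries the original (dotted) name.
data = np.zeros(3, dtype=[('t', '<f8'), (original, '<i4')]).view(np.recarray)

record_name = sanitize_name(original)  # 'Group_A_Signal_01'

try:
    data[record_name]
    lookup_failed = False
except (ValueError, KeyError) as exc:
    # NumPy reports that no field carries the sanitized name.
    lookup_failed = True
    print(exc)

# Looking up the unmodified name succeeds.
values = data[original]
```

If this is what happens in read3, the mismatch between the stored field name and the looked-up name would be enough to explain the ValueError, independent of the file contents.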

danielhrisca commented 6 years ago

numpy 1.13.1 supports names that contain '.' :

(numpy.record, [('t', '<f8'), ('VS650_1.TH01', '<i4'), ('invalidation_bytes', 'S')])

ratal commented 6 years ago

You can create it, but if you want to access the data afterwards, you end up with issues like in #58. NumPy records do not check name compatibility at creation, and the issue appears only when you access your data by record name, for example through __getattribute__(). For example, you are supposed to get a channel with recarray.channel_name; if there is a '.' in the name, recarray.channel.name will not work. This is annoying, because you have to maintain double naming just to be compatible with that; the code becomes more complex and prone to bugs, and there is a performance cost when iteratively cleaning forbidden characters from many names. Alternatively, some kind of recarray replacement could be implemented, such as a dictionary mapping name to index in the array. Or use a list or dictionary of vectors/arrays, but that is probably less efficient. Either way, it is a big change. Another option is to report this to the NumPy developers to improve the situation (I should do that, or at least ask for advice); however, it would force a dependency on a newer NumPy version, though it is maybe easier and the most interesting option for the long term. Do you have a better idea?
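
As a rough illustration of the "dictionary of name: index" idea, here is a minimal sketch. `ChannelTable` and its methods are hypothetical names invented for this example and are not part of mdfreader or NumPy; it only shows that a plain mapping accepts any channel name without sanitization.

```python
import numpy as np


class ChannelTable:
    """Hypothetical container mapping any channel name to a column index."""

    def __init__(self):
        self._index = {}    # channel name -> position in self._columns
        self._columns = []  # one 1-D array per channel

    def add(self, name, samples):
        # Names are used as dictionary keys, so '.', '#', etc. need no cleanup.
        self._index[name] = len(self._columns)
        self._columns.append(np.asarray(samples))

    def __getitem__(self, name):
        return self._columns[self._index[name]]


table = ChannelTable()
table.add('t', [0.1, 0.2, 0.3, 0.4])
table.add('Signal.With.Dots', [1, 4, 9, 16])

dots = table['Signal.With.Dots']
```

The trade-off is that the channels are no longer stored as one contiguous record, so record-wise reading from a file would need an extra step to split the data into columns.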

danielhrisca commented 6 years ago

I can't say when this was fixed, but for me both '.' and '#' work, with both asammdf and mdfreader:

from __future__ import print_function
from asammdf import MDF, Signal
from mdfreader import mdf
import numpy as np

f = MDF()

s1 = Signal(
    samples=np.array([1,4,9,16]),
    timestamps=np.array([0.1, 0.2, 0.3, 0.4]),
    name='Signal.With.Dots',
    unit='dots',
)

s2 = Signal(
    samples=np.array([1,4,9,16]),
    timestamps=np.array([0.1, 0.2, 0.3, 0.4]),
    name='#Signal.With.#',
    unit='#',
)

f.append([s1, s2])

f.save('out.mf4', overwrite=True)

f = MDF('out.mf4')

f.get('Signal.With.Dots').plot()
f.get('#Signal.With.#').plot()

print(f.groups[0]['record'].dtype)

mdf('out.mf4').plot('Signal.With.Dots')
mdf('out.mf4').plot('#Signal.With.#')

and this prints:

(numpy.record, [('t', '<f8'), ('Signal.With.Dots', '<i4'), ('#Signal.With.#', '<i4'), ('invalidation_bytes', 'S')])

danielhrisca commented 6 years ago

It even works with non-ASCII characters:

from __future__ import print_function
from asammdf import MDF, Signal
from mdfreader import mdf
import numpy as np

f = MDF()

s1 = Signal(
    samples=np.array([1,4,9,16]),
    timestamps=np.array([0.1, 0.2, 0.3, 0.4]),
    name='Signal.With.Dots',
    unit='dots',
)

s2 = Signal(
    samples=np.array([1,4,9,16]),
    timestamps=np.array([0.1, 0.2, 0.3, 0.4]),
    name='# ひらがな . чу́дно - العَرَبِيَّة',
    unit='#',
)

f.append([s1, s2])
f.save('out.mf4', overwrite=True)

f = MDF('out.mf4')
f.get('Signal.With.Dots').plot()
f.get('# ひらがな . чу́дно - العَرَبِيَّة').plot()

print(f.groups[0]['record'].dtype)

mdf('out.mf4').plot('Signal.With.Dots')
mdf('out.mf4').plot('# ひらがな . чу́дно - العَرَبِيَّة')

and this prints:

(numpy.record, [('t', '<f8'), ('Signal.With.Dots', '<i4'), ('# ひらがな . чу́дно - العَرَبِيَّة', '<i4'), ('invalidation_bytes', 'S')])

ratal commented 6 years ago

Printing is OK, but can you try to access the array by name?

f.groups[0]['record'].__getattribute__('# ひらがな')
f.groups[0]['record']['# ひらがな']

danielhrisca commented 6 years ago

If you run the script and see the plots, then recarray access is working.

Like I said, this works for mdfreader as well with numpy 1.13.1

ratal commented 6 years ago

Thanks Daniel. I get an ASCII character error at f.append([s1, s2]), but it does not matter, I get your point. Indeed, I also tried some basic code and it works. Of course not for recarray.channelname, but __getattribute__ and recarray[channelname] are OK. I do not know either in which numpy version this was fixed, but I will deactivate the recarray name sanitization. That might solve your issue, ginseil; you can check the last commit.
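
For reference, the three access styles mentioned above can be compared with a small standalone sketch. It assumes a reasonably recent NumPy (the thread mentions 1.13.1; older versions may behave differently), and the field name is just an example.

```python
import numpy as np

name = 'Signal.With.Dots'

# Build a record array whose second field has a dotted name.
rec = np.zeros(4, dtype=[('t', '<f8'), (name, '<i4')]).view(np.recarray)
rec[name] = [1, 4, 9, 16]

by_item = rec[name]              # item access with the name as a string
by_getattr = getattr(rec, name)  # goes through recarray.__getattribute__

# rec.Signal.With.Dots is parsed by Python as chained attribute access
# (rec.Signal, then .With, then .Dots), so it cannot reach the field.
try:
    rec.Signal
    dotted_works = True
except AttributeError:
    dotted_works = False
```

So string-based lookups work with the unsanitized name, and only the literal dotted attribute syntax is unavailable.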

ginseil commented 6 years ago

Thank you guys, this is fixed for me now :)