Closed ajaysgowda closed 3 years ago
Hi, dataInPandas = data.convertToPandas() is not correct. The convertToPandas method converts the data held in the object into pandas dataframes; it does not return a pandas object.
Thanks for the quick response. I'm new to this so please bear with me. I have a .dat file from ETAS INCA and I would like to convert it to a usable format in python for data processing. How do you suggest I go about it? Thanks again
Hi, pandas is a good option for processing data in python. It depends on what kind of processing you want to do, but the most used and common file format is hdf5. The mdf class as is already gives you direct access to your data; you can follow the readme file for examples.
thanks. i am trying to convert the mdfreader.mdf type to a dataframe type so that i can access the data. how do i do that?
You can already access the data using mdf.getChannelData(channelName). That returns a numpy vector.
My file is large and I'm having issues figuring out the raster at which it is recorded. I guess I need to do more digging. Thanks.
To figure out the exact raster, you need to do a bit of juggling. Use mdf.getChannelData(mdf.getChannelMaster(channelName)) to get the master time vector for a given channel, then subtract the first value from the second and round the result to the nearest hundredth; that should give you the raster (the time between two samples). But I don't think that's what you want to do.
If you must sort the channels by measure rate, you can simply use the name of the master channel as rasters. Alternatively, you could resample the file to a known measure rate, and everything will be aligned. I think this latter option is probably what you need.
If your measure files are large, you can load only certain channels (assuming you don't need all of them). I think the option is called channelList, and you can use it when loading the file. You can also use noDataLoading=True so only the channel names are loaded.
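The raster estimate described above can be sketched as follows. This is only an illustration, not part of mdfreader: the helper name estimate_raster is made up, the time vector is synthetic, and the sketch takes the median of all sample-to-sample steps instead of only the first two samples (my own substitution, to tolerate jitter or a gap at the start of the recording).

```python
from statistics import median


def estimate_raster(master_time, ndigits=2):
    """Estimate the sample period from a master (time) vector.

    Hypothetical helper: works on any sequence of numbers, including the
    numpy vector that getChannelData is said to return in the thread.
    """
    if len(master_time) < 2:
        raise ValueError("need at least two samples to estimate a raster")
    # Steps between consecutive time stamps
    steps = [float(b) - float(a) for a, b in zip(master_time[:-1], master_time[1:])]
    # The median step is less sensitive to jitter than the first step alone
    return round(median(steps), ndigits)


# Synthetic stand-in for mdf.getChannelData(mdf.getChannelMaster(channelName))
time_vector = [0.0, 0.01, 0.02, 0.03, 0.04]
raster = estimate_raster(time_vector)
print(raster)
```

With a real file, the synthetic list would be replaced by the master vector of the channel you are interested in.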
this works perfectly thanks.
How do I name the master channels after the raster? Is there a parameter I can feed when I call mdfreader.mdf()?
I need one last issue resolved to be able to move completely from matlab to Python. The data file I have has multiple channels with the same name coming from different sources. Usually the channel name has \'sourceName' appended to the end of it. It did in matlab at least. Is there a way to make sure that part of the name is kept when reading MDF data with this library?
I appreciate your help. Thanks
I think you can rename the master channels. I'm not sure, because i never needed to do it. It might break things, though, unless you make sure to update all slave channels with the new name.
When detecting multiple channels with the same name, the module appends _1, _2 to the name, I think. To find out the raster, you can get the master channel name. But what do you mean by sourceName? Are you using multiple devices to record data?
i think i can work around the whole raster stuff. Yes. I'm recording from multiple devices at the same time.
I'm not sure how Inca deals with that. If Inca appends the source name to the channel name, it should still be there. If not, mdf will assign different terminations to duplicate channels. I think the best thing to do is just to try it and see what happens.
import mdfreader as mdf
datFile = mdf.mdf('file.dat')
for channel in datFile:
    print(channel)
This is what i get. looks like it recognises the different sources. But just adds a number to the end of the channel name.
time
'channelName'
time_1
'channelName'_1
time_2
'channelName'_2
I think this might be something for Aymeric to look into. See if multi-device support is possible.
Thanks. From what I can see in matlab, the source device is in the long name but not in the description. Not sure if that helps, but I thought it would be informative to mention.
Hi, to rename a channel, there should be a method called rename_channel() available. Indeed, when there is a duplicate channel name in the same file, the data group number is appended to it to make it unique. Without this, and because the file content (all channels) is flattened (the mdf file content itself is structured more like a tree), different data would be written into the same 'container' (dict key = channel name), overwriting each other and losing data, which nobody wants. There was long thinking about the best mdf object definition for python, and this flattened approach seemed the simplest, the easiest for accessing and analysing data, and the closest to bare python, but it brings complications like this one. The number appending happens in mdfinfo3 and mdfinfo4, in the info class: for mdfinfo4, in readCNBlock, around line 1578 or 1581; for mdfinfo3, around lines 225 and 228. So you could change the behaviour relatively easily and append the source instead. However, I would not advise it because:
Correction: rename_channel as currently implemented will indeed break things if you change master names (the master key in the channel dict will no longer correspond, breaking its link). I will improve it. By the way, why is it an issue to have a number instead of the source name? I guess it is because you need the name to understand which source a duplicated channelName corresponds to?
It can get confusing having channel_1 and master_2
Yes, I can understand, but I think I do not have better idea/solution for the moment...
You would have to drop the dict like behavior
Hmm, the implications would be very big for the code at this point. Plus I am not sure it would be easier to use or make the file content easier to understand. I know you opted for objects instead, which also brings other advantages, but I am still personally in favour of simple basic python objects. Maybe you can give your opinion on that, comparatively?
Some channel names from various sources are the same.
When i have the same channel names which are numbered eg. channel_1, channel_2..... this results in me not knowing which channel name corresponds to which source. From MDFimport in matlab i can see that the longSignalName has the source info appended to the channel name but the signalDescription doesnt have that info. not sure if that helps.
Maybe there doesn't need to be a big change. I imagine source data might be somewhere available in the MDF file. When duplicate channels are found, can we try and look for a source name and use that as an append instead. If no source can be found, we fall back to the current behaviour.
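A minimal sketch of the fallback proposed above, written as plain Python and independent of mdfreader's internals. The function name unique_channel_names and the (name, source) pairs are hypothetical, and a simple counter stands in for the data group number that the module is said to append.

```python
from collections import Counter


def unique_channel_names(channels):
    """Return one unique name per (name, source) pair.

    Duplicated names get the source appended when a source is known;
    otherwise (or if that is still not unique) a numeric suffix is used.
    """
    counts = Counter(name for name, _ in channels)
    used = set()
    result = []
    for name, source in channels:
        candidate = name
        if counts[name] > 1 and source:
            # Preferred: disambiguate with the source name
            candidate = f"{name}_{source}"
        # Fallback: numeric suffix, similar to the current behaviour
        unique = candidate
        suffix = 0
        while unique in used:
            suffix += 1
            unique = f"{candidate}_{suffix}"
        used.add(unique)
        result.append(unique)
    return result


# Made-up example: two devices record 'speed'; the masters have no source info
names = unique_channel_names(
    [("time", None), ("speed", "XCP"), ("time", None), ("speed", "CAN")]
)
print(names)
```

The point of the design is that existing files without source information keep exactly the numbered names they have today.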
I just wanted to say that it's impossible to manage name conflicts and still have dict behavior. I'm sure a code change in this direction would mean a lot of rework.
Having objects (I think you mean dedicated class for each block type) makes the internal structure similar to the one of the file on disk. It's sometimes easier to debug errors.
Ok. Good idea Cristi, seems good compromise, I will try that. I guess priority for mdf 3.x ?
I just introduced it, might be a bit buggy but you can try it on both mdf 3.x and 4.x
unfortunately it didn't seem to work. i still get the same results as before. :(. Thanks for trying to accommodate my request
Did you install the latest commit from master on github? Can you open your file with MDFValidator and check whether there are extension blocks for the channels you are interested in? If there are none, the current implementation will fall back to the previous behaviour. By the way, you could comment out the 4 lines starting at line 215; they strip the device name from the end of the signal name, which is probably what you are looking for in the end. The next 2 lines, which split on '.', could also be commented out.
I did just that. It works great now. Commenting out those lines works perfectly. Thanks!! Just out of curiosity, what is the reason for the split in line 215?
I thought these additional device names were annoying info and generally contained characters not allowed in other file formats, so I removed them. But they do make names unique. Maybe I should make this optional or document it better.
Hello,
i'm facing exactly the same problem. How did you solve it? Which modules did you use and how did you import them? I'm very new to this. This is my very first attempt at programming. So please bear with me. Thanks Fourka
Can you describe your issue in more detail? Maybe we can help you.
Hi Ratal,
I have a .dat file from INCA and I would like to convert it to a usable format in python to create the plots I need. I imagine something like a GUI. When running the program, the user should be asked which files should be read. THANKS Fourka
What would be a usable format in python? You could create a plot by parsing your .dat file in an interactive python session (ipython, jupyter, ..):
yop = mdfreader.Mdf('youfile.dat')
yop  # will display the content of the yop object, its channels, data, description, units, etc.
yop.plot('channelname')
You could also convert your object into pandas dataframes with
yop.convert_to_pandas()
mdfconverter is also part of this module; it gives you a GUI to convert your .dat into other file formats like hdf5, Matlab, netcdf. mdfreader can also be included in the Veusz advanced GUI, but that is not so easy to set up.
If this is still too complicated with those command lines, you could use the asammdf module, which has a good and easy GUI.
Hi Ratal, THANKS for the quick Response! yop.plot('channelname') works perfect. I want to Display multiple channels from different .dat files in one plot. Is it possible to add or to change channels in an existing plot? Thanks Fourka
.plot() can take a nested list of channels as argument, so you can build your plots by grouping them, creating multiplots. But this only applies to one file. If you want to compare data between files, you need to write your own script according to your needs, using the .getChannelData(), .getChannelUnit() and .getChannelMaster() methods. For inspiration on how to use matplotlib, look at the .plot code in mdfreader.py.
Thanks for your advice. What does getChannelMaster() deliver? I can't find it in the documentation on mdfreader.py. Thanks Fourka
Sorry, I forgot the '_'. .get_channel_master('channelName') will give you the name of the master channel of the channel given as argument. A master channel is most generally time but could also be angle, distance, etc. (since mdf4.x). With this channel name you can create your X axis.
I use get_channel_data to yield the channel as numpy array.
datFile = mdf.Mdf('FileName')
datFile.mdf.get_channel_data('ChannelName')
These two lines work very well. But when it comes to print(ChannelName) I get: NameError: name 'ChannelName' is not defined. Before making the plots, I first want to see the numpy array. Could you please help me once again? Thanks Fourka
You should do:
print(datFile['ChannelName'])
import matplotlib.pyplot as plt
plt.plot(datFile.get_channel_data(datFile.get_channel_master('channelName')), datFile.get_channel_data('channelName'))
THANKS A LOT it works perfect.
Hello all, I have a problem: the channel names come out as "nmot_w\XCP", "rl_w\XCP", "latitude\GPS". Is there a way to remove these suffixes? With asammdf, we can use "use_display_name" to remove "\XCP", or:
df_columns = df.columns.tolist()
df_columns_new = [c.replace("\\XCP", "") for c in df_columns]
Is there an API way to do so?
You could use the filter_channel_names parameter, but it was rather meant for removing based on '.', so you would have to customise it for your need (example code is in read_cn_block / mdfinfo4). Normally exports should remove this character if it is not allowed for the format. You could also use .rename_channel(channel_name, new_name) recursively.
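As a rough sketch of the renaming route, the mapping from suffixed names to short names can be built in plain Python first and then fed to rename_channel. The helper build_rename_map is made up for this example; only rename_channel(channel_name, new_name) comes from the thread, and it is left as a comment because, as noted earlier, renaming master channels may break their link.

```python
def build_rename_map(channel_names, separator="\\"):
    """Map names like nmot_w\\XCP to nmot_w, skipping renames that would collide.

    Hypothetical helper: it only computes the mapping and touches no file.
    """
    rename_map = {}
    # Names without a device suffix are already taken and must not be overwritten
    taken = {name for name in channel_names if separator not in name}
    for name in channel_names:
        if separator not in name:
            continue
        short = name.split(separator, 1)[0]
        if short in taken:
            # Keep the long name so two sources never end up under one key
            continue
        taken.add(short)
        rename_map[name] = short
    return rename_map


names = ["nmot_w\\XCP", "rl_w\\XCP", "latitude\\GPS", "time"]
rename_map = build_rename_map(names)
print(rename_map)

# With a loaded file (assumption: non-master channels only):
# for old_name, new_name in rename_map.items():
#     mdf_obj.rename_channel(old_name, new_name)
```

Skipping colliding names is a deliberate choice here: when the same signal is recorded by two devices, the device suffix is the only thing that tells them apart.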
Python version
Python 3.6
os
Windows
Numpy version
1.13.1
mdfreader version
2.7.4
Description
Hello, I'm trying to convert my INCA .dat file to a pandas data-frame. The data = mdfreader.mdf(fileLocation) line works fine. But when I use the dataInPandas = data.convertToPandas line, I get a method datatype that I cannot open. If I use the dataInPandas = data.convertToPandas() line, I get a None type. I want to be able to see the data as a data-frame variable type. Am I doing something wrong? Please help. Thanks, Ajay