ratal / mdfreader

Read Measurement Data Format (MDF) versions 3.x and 4.x file formats in python

Improved concatenation in mdf.return_pandas_dataframe() #211

Open LaurentBlanc73 opened 1 month ago

LaurentBlanc73 commented 1 month ago

Python version

3.9.19

Numpy version

1.26.1

mdfreader version

4.1

Description

A single call to mdf.return_pandas_dataframe(master) prints the pandas "DataFrame is highly fragmented" performance warning many times.

Resolution

I changed the function in the mdfreader.py to the following:

            # Collect the valid channels first, then build the dataframe once
            channel_dict = {}
            for key in self.masterChannelList[master_channel_name]:
                data = self.get_channel_data(key)
                if data.dtype.byteorder not in ['=', '|']:
                    # convert to native byte order before handing the array to pandas
                    data = data.byteswap().newbyteorder()
                if data.ndim == 1 and data.shape[0] == temporary_dataframe.shape[0] \
                        and not data.dtype.char == 'V':
                    # store in the dict; assigning to the loop variable would be lost
                    channel_dict[key] = data  # original: temporary_dataframe[channel] = data
            temporary_dataframe = pd.DataFrame(data=channel_dict, index=temporary_dataframe.index)  # added line
            return temporary_dataframe

This way the dataframe is built once at the end instead of being expanded by one column on every loop iteration.
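To illustrate the difference outside of mdfreader, here is a minimal sketch that compares the two ways of building the frame. The channel names, sizes, and both helper functions are made up for the example and are not part of mdfreader. Whether the column-by-column variant actually warns depends on the pandas version and the number of columns, so the script only counts the warnings it sees.

```python
import warnings

import numpy as np
import pandas as pd


def build_column_by_column(channels, index):
    """Add one column per assignment (the pattern that can fragment the frame)."""
    frame = pd.DataFrame(index=index)
    for name, data in channels.items():
        frame[name] = data
    return frame


def build_at_once(channels, index):
    """Collect the arrays in a dict and construct the frame in a single call."""
    return pd.DataFrame(data=channels, index=index)


def count_performance_warnings(builder, channels, index):
    """Run a builder and return (frame, number of pandas PerformanceWarnings)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        frame = builder(channels, index)
    hits = [w for w in caught if issubclass(w.category, pd.errors.PerformanceWarning)]
    return frame, len(hits)


# Hypothetical stand-in for the channels of one master: 200 float columns.
n_rows, n_channels = 1000, 200
index = pd.RangeIndex(n_rows)
channels = {f"ch{i}": np.full(n_rows, float(i)) for i in range(n_channels)}

slow, n_slow = count_performance_warnings(build_column_by_column, channels, index)
fast, n_fast = count_performance_warnings(build_at_once, channels, index)

print("warnings, column by column:", n_slow)
print("warnings, built at once:", n_fast)
print("same content:", slow.equals(fast))
```

Both variants should produce the same data; the single constructor call just avoids inserting columns into an existing frame one at a time.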

ratal commented 1 month ago

Thanks @LaurentBlanc73 for your interest. I am not sure I can actually visualise the change. Could you reformat it or, if you are already confident in your proposal, submit a Pull Request for this ticket?

LaurentBlanc73 commented 1 month ago

@ratal sorry for the poor formatting, I just improved that (actually I changed the code again a little bit by pre-allocating storage).

I updated the changes in my fork and will create a pull request as soon as the previous one in #212 has passed :)