danielhrisca / asammdf

Fast Python reader and editor for ASAM MDF / MF4 (Measurement Data Format) files
GNU Lesser General Public License v3.0

MDF.concatenate returns mdf with enforced UTC start_time #155

Closed MatinF closed 5 years ago

MatinF commented 5 years ago

Python 3.7.3 | asammdf 5.4.1 | MDF 4.11

After review, I've edited the question somewhat

Thanks for a fantastic tool!

I'm trying to use asammdf to concatenate multiple MDF files into a single file for analysis and DBC conversion.

Each file has a start time in the header, hd_start_time_ns and time is incremented from 0 in the timestamp channel (see attached single file examples).
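To illustrate the layout described above: `hd_start_time_ns` is an absolute timestamp in nanoseconds since the Unix epoch, and the channel timestamps are relative offsets from it. A minimal sketch with a hypothetical header value (not taken from the sample files):

```python
from datetime import datetime, timezone

# Hypothetical header value: nanoseconds since the Unix epoch
hd_start_time_ns = 1_556_100_000_000_000_000

# Absolute start time; the Timestamp channel values (starting at 0)
# are offsets in seconds from this moment
start_time = datetime.fromtimestamp(hd_start_time_ns / 1e9, tz=timezone.utc)
print(start_time)  # 2019-04-24 10:00:00+00:00
```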

To concatenate the files, I use the following code:

from asammdf import MDF

files = ['AC6013CD_00003277_00000001.mf4', 'AC6013CD_00003277_00000005.mf4']
dbc = ['CSS-Electronics-SAE-J1939-DEMO.dbc']

mdf = MDF.concatenate(files, time_from_zero=False)
mdf.save('concatenated.mf4')

mdf = mdf.extract_can_logging(dbc)
mdf.save('scaled.mf4')

Expected behavior: When running the above code, I would expect it to save new MDF files where:
1) the start_time of the concatenated MDF equals the start_time of AC6013CD_00003277_00000001 in local time
2) the start_time of the scaled MDF equals the start_time of AC6013CD_00003277_00000001 in local time

Actual behavior:
1) the start_time of the concatenated MDF equals the start_time of AC6013CD_00003277_00000001 in UTC
2) the start_time of the scaled MDF equals the PC time at the time of conversion

Proposed change: It would be ideal if the concatenation function would either:
1) use the local start_time by default if it has been set in the original MDF files, or
2) allow an optional argument to specify that the returned MDF should use local time

Further, it seems more intuitive for the scaled MDF to also inherit the local start_time of the original MDF - or at least to let the user set this via an optional argument. That way, end users can easily take their converted data and see in absolute time when e.g. a specific signal in that MDF was logged, using to_dataframe with time_as_date=True.
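The absolute-time view described above amounts to adding the relative channel timestamps to the header start time. A minimal sketch with hypothetical values (not taken from the sample files):

```python
from datetime import datetime, timedelta

start_time = datetime(2019, 4, 24, 10, 0, 0)  # hypothetical local start_time
rel = [0.0, 0.00495, 0.00995]                 # relative timestamps in seconds

# Absolute wall-clock time of each sample
abs_times = [start_time + timedelta(seconds=t) for t in rel]
print(abs_times[1])  # 2019-04-24 10:00:00.004950
```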

Happy to hear your thoughts on this Daniel! sample_files.zip

Best, Martin

danielhrisca commented 5 years ago

Hello Martin,

please share the initial files as well

MatinF commented 5 years ago

Hi Daniel, I updated the question.

Part of it was a misread by me, so I've clarified the pieces that I'm thinking are still relevant. Also, I've attached the relevant files.

danielhrisca commented 5 years ago

Hello Martin,

I get this error while trying to plot the ID from CG1 in the second file:

(screenshot)

The problem is that the first timestamp samples are huge values, and afterwards they drop back to near 0 (see the attached screenshots).

MatinF commented 5 years ago

Hi Daniel,

Thanks for the tip on CANape - I'll check with the team whether something needs to change regarding the timestamp. I'm guessing that error might be due to the fact that our log files do not yet have an "end_time" in the metadata, which I think Vector's tools expect.

Regarding the GUI pictures, I'm unfortunately not able to replicate your images as I don't seem to have that view in v5.4.1 of the GUI. When I try to plot the timestamp via the graph in the GUI, it looks as expected.

Each of the log files should be structured so that there's an absolute start_time (in the header), in nanoseconds. The Timestamp master channel is then incremented starting from 0, so the values can be added to the absolute timestamp (based also on your previous corrective inputs on how the timestamp was structured).

When I run e.g. mdf.get('Timestamp').samples[0] and mdf.get('Timestamp').samples[20], I get 0.0 and 0.02495 - which is what I'd expect. Perhaps the GUI shows the start_time when selecting 0.000 s if the log file's timestamp at that point is 0.0?

Overall, I get my expected result when concatenating the MDFs (contrary to what I thought initially).

The only problem with the concatenate function is that the start_time is offset by 2 hours in my case, since it enforces UTC rather than the original local time. It would be great if that were not the default - or alternatively if an argument could be passed to keep local time.

The only real potential "bug" that I can see is that the scaled MDF gets the PC time as the start_time, which seems counter-intuitive. It might be related to our log files, but unfortunately I don't have alternative similar log files to compare with.

Hope the above makes sense - happy to elaborate on any points.
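For reference, the 2-hour offset Martin describes can be undone manually until the default changes; a minimal sketch assuming a UTC+2 local zone and a hypothetical start_time (not read from the sample files):

```python
from datetime import datetime, timezone, timedelta

# Hypothetical start_time that concatenate stored as naive UTC
utc_start = datetime(2019, 4, 24, 8, 0, 0)

# Re-attach the UTC zone, then convert to the logger's zone (UTC+2 here)
local_start = utc_start.replace(tzinfo=timezone.utc).astimezone(
    timezone(timedelta(hours=2)))
print(local_start)  # 2019-04-24 10:00:00+02:00
```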

danielhrisca commented 5 years ago

If I run this:

from asammdf import MDF

m = MDF(r'AC6013CD_00003277_00000005.mf4')
print(m.get_master(0)[:100])

I get:

[1.84467441e+10 1.84467441e+10 1.84467441e+10 1.84467441e+10
 1.84467441e+10 1.84467441e+10 1.84467441e+10 1.84467441e+10
 1.84467441e+10 1.84467441e+10 1.84467441e+10 1.84467441e+10
 1.84467441e+10 9.50000000e-04 6.00000000e-03 1.10000000e-02
 1.60000000e-02 2.10000000e-02 2.59500000e-02 3.09500000e-02
 3.59500000e-02 4.09500000e-02 4.59500000e-02 5.09500000e-02
 5.60000000e-02 6.09500000e-02 6.60000000e-02 7.09500000e-02
 7.60000000e-02 8.09500000e-02 8.59500000e-02 9.09500000e-02
 9.59500000e-02 1.00950000e-01 1.06000000e-01 1.10950000e-01
 1.16000000e-01 1.20950000e-01 1.25950000e-01 1.30950000e-01
 1.35950000e-01 1.40950000e-01 1.46000000e-01 1.51000000e-01
 1.56050000e-01 1.61000000e-01 1.66000000e-01 1.71000000e-01
 1.76000000e-01 1.81050000e-01 1.86050000e-01 1.91050000e-01
 1.96050000e-01 2.01000000e-01 2.06000000e-01 2.11050000e-01
 2.16000000e-01 2.21000000e-01 2.26050000e-01 2.31000000e-01
 2.36050000e-01 2.41050000e-01 2.46000000e-01 2.51050000e-01
 2.56000000e-01 2.61000000e-01 2.66000000e-01 2.71050000e-01
 2.76000000e-01 2.81050000e-01 2.86000000e-01 2.91000000e-01
 2.96000000e-01 3.01050000e-01 3.06000000e-01 3.11050000e-01
 3.15950000e-01 3.21000000e-01 3.26000000e-01 3.31100000e-01
 3.36050000e-01 3.41000000e-01 3.46050000e-01 3.50950000e-01
 3.56000000e-01 3.61000000e-01 3.66000000e-01 3.71000000e-01
 3.76050000e-01 3.81050000e-01 3.86000000e-01 3.91000000e-01
 3.96000000e-01 4.01050000e-01 4.06050000e-01 4.11050000e-01
 4.16000000e-01 4.21200000e-01 4.25300000e-01 4.31050000e-01]

The master channel is corrupted in the second file.

> I've not been able to replicate your images as I don't seem to have that view in v5.4.1 of the GUI.

Check out the latest development branch code

MatinF commented 5 years ago

Hi again Daniel,

I just tried running the code below with the latest release, 5.4.1:

from asammdf import MDF
mdf = MDF(r'input/AC6013CD_00003277_00000005.mf4')
print(mdf.get_master(0)[:100])

Which returns:

[0.      0.00495 0.00995 0.015   0.02    0.02495 0.03    0.03505 0.04
 0.045   0.05    0.055   0.05995 0.065   0.06995 0.07495 0.08    0.085
 0.09    0.095   0.09995 0.10495 0.11    0.115   0.11995 0.125   0.13
 0.13495 0.14    0.145   0.1501  0.155   0.16005 0.16495 0.17    0.175
 0.18005 0.18505 0.19    0.19505 0.2     0.205   0.21    0.215   0.22
 0.22505 0.23    0.235   0.24    0.245   0.25    0.255   0.25995 0.265
 0.27    0.27505 0.28    0.28505 0.29    0.295   0.3     0.305   0.31
 0.315   0.32005 0.32505 0.33005 0.335   0.34    0.345   0.35005 0.35505
 0.36005 0.36505 0.37005 0.37505 0.38005 0.385   0.39005 0.395   0.40005
 0.40505 0.41005 0.41505 0.4202  0.4252  0.4301  0.4351  0.4401  0.44505
 0.45005 0.45515 0.46015 0.46515 0.4702  0.47505 0.48    0.48515 0.49015
 0.4952 ]

I seem to get the same result when checking out & updating to the latest development branch. Maybe I'm missing something basic, though.

danielhrisca commented 5 years ago

I'm using the files from the archive. Double check if you use the same files

MatinF commented 5 years ago

Test.zip

Hi again Daniel,

I just double checked - here are the files I use in my code test, producing the above output with the latest development code. Please let me know if you're still getting a different result with these. The file also loads as expected in e.g. CANalyzer, except that it does not have an end timestamp - though I don't assume that should cause an issue in this regard.

Again, the main issue I've observed so far does not seem to be an overall corrupt timestamp, but just that my local time is turned into UTC - and that my DBC-scaled MDF has a PC time as start_time instead of my original local time.

Happy to do any further tests on this!

MatinF commented 5 years ago

Hi again Daniel,

I see the corruption now in some of the files - thanks for pointing that out.

I've attached some of the sample files which do not seem to have the corruption issue when reviewing the raw log files. I believe these should be usable for illustrating the local vs UTC time, see the pictures below:

Samples_v2.zip

Original file (first one): (screenshot)

Concatenated file: (screenshot) Note, the concatenated file also fails to load in Vector's CANalyzer (it returns a 'Warning: The configured time interval is outside the logging files' scope' - not sure if that hints at anything).

Scaled file: (screenshot)

Thanks again for all your fantastic work Daniel, Martin

MatinF commented 5 years ago

Hi Daniel,

just FYI I tested the above with an updated version of the log files (without the time corruption), however I still get the same result. If you need me to do some specific tests or replication code, just let me know.

Best, Martin

danielhrisca commented 5 years ago

Hello Martin,

I've made some improvements to the start time handling. Python's datetime only supports microsecond resolution, so you will see some minor differences if your original files have nanosecond resolution.
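The microsecond limitation is easy to demonstrate; a minimal sketch with a hypothetical nanosecond-resolution header timestamp:

```python
from datetime import datetime, timezone

# Hypothetical ns-resolution header timestamp
ns = 1_556_100_000_123_456_789

# datetime stores at most microseconds, so the trailing nanoseconds
# (789) are lost; float rounding may also shift the last digit
dt = datetime.fromtimestamp(ns / 1e9, tz=timezone.utc)
print(dt.microsecond)  # ~123457, not the full 123456789 ns fraction
```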

MatinF commented 5 years ago

Hi Daniel, both the local time and the scaled time have been corrected now - it seems to work just as intended :-) awesome work!