msel-source / pymef

Python wrapper for MEF library
Other
5 stars 3 forks source link

Bug since v1.4.0 - earliest_start_time = 0 #38

Closed cimbi closed 1 year ago

cimbi commented 1 year ago

Problem Description

In session metadata earilest_start_time = 0 since version 1.4.0

What steps reproduce the problem?

  1. Read any mef sesison and do ms.session_md['session_specific_metadata']['earliest_start_time']

Pymef version and platform

1.4.0 Ubunut

cimbi commented 1 year ago

@MaxvandenBoom any idea where this might have happened?

MaxvandenBoom commented 1 year ago

Hey Jan,

Not yet, but I’ll look into it tomorrow.

On Wednesday, August 23, 2023, Jan Cimbalnik @.***> wrote:

@MaxvandenBoom https://github.com/MaxvandenBoom any idea where this might have happened?

— Reply to this email directly, view it on GitHub https://github.com/msel-source/pymef/issues/38#issuecomment-1689461805, or unsubscribe https://github.com/notifications/unsubscribe-auth/AKNHHUD7BQRDD6W72TTDLM3XWWY6NANCNFSM6AAAAAA33BJIUI . You are receiving this because you were mentioned.Message ID: @.***>

MaxvandenBoom commented 1 year ago

@cimbi Hey Jan,

I'm trying to reproduce this, but when I execute:

from pymef.mef_file.pymef3_file import read_mef_session_metadata
session_md = read_mef_session_metadata(".\sub-01_ieeg.mefd", '')
print(session_md['session_specific_metadata']['earliest_start_time'])

it gives me a value of [76663685058], which seems ok.

In the same way as:

from pymef.mef_session import MefSession
session = MefSession(".\sub-01_ieeg.mefd", '')
print(session.session_md['session_specific_metadata']['earliest_start_time'])

gives me the same valid output of [76663685058].

Would you be able to share a dataset (perhaps by e-mail) in which this happens?

Best, Max

MaxvandenBoom commented 1 year ago

@cimbi Hi Jan, I looked into this.

The reason this is happening is because pymef 1.3.6 (released 27 april 2023) was not actually build on the most recent version of meflib at that time, but a much older one! This likely happened because the meflib submodule reference in the pymef project was not updated, which means a number of meflib bugfixes are not included in pymef 1.3.6.


For the specific dataset you sent me, the value of 0 is actually what is expected per: https://github.com/msel-source/meflib/pull/30 This issue was addressed in meflib on 10 Feb 2022, and therefore should have already been fixed in pymef 1.3.6 (which currently still gives the unexpected value of 1493202549000000).

The .tmet has a recording_time_offset of 1493202549000000 and upon reading starts with

start_time = 0               (0 or positive = not offset)
end_time = -1937183800       (negative = offset)

the remove_recording_time_offset function is then applied to both values.

In the older version of meflib this would result in:

start_time= 1493202549000000
end_time = 1493204486183800

In the newer version of meflib this results in:

start_time = 0
end_time = 1493204486183800

I remember that, besides that github issue, @dancrepeau and I had a discussion separately on Zoom about the case of time being 0.

As I remember, we discussed/reasoned/concluded at the time that: 1) A time value of 0 is an edge case, being neither negative or positive. 2) It seemed unintuitive/unexpected that 0 would all of a sudden be offset, while a value of 1 would not. 3) A time value of 0 does not really makes sense, since the recording would have been done in 1970. Therefore, this value will never occur in the data unless the data is artificially generated or anonymized. If artificially generated and anonymized with a year of 1970, a recording_time_offset does not make any sense either and should be set to no_entry (or 0). 4) The MEF3 specification states "Times that have been offset are made negative to indicate this status". So we interpreted that as additional support for the case of considering both 0 and positive as having no offset applied.

What are your thoughts on this?

cimbi commented 1 year ago

Hi @MaxvandenBoom, sorry I dropped out for while. The explanation seems reasonable. But I do not understand why start time is 0 (without applied offset) and end time is correct. I hope I won't have to fix all our recordings :-). In our recordings we have offset set to the beginning of the recording so the start sample time (without offset) is 0

cimbi commented 1 year ago

Sorry looking at your explanation again I see where the problem is. How do you have this set up in your recordings? Do you apply some standard offset to all of them? I guess I will have to fix our recordings...

MaxvandenBoom commented 1 year ago

Hey Jan! No problem, glad your back in action :) Yeah, so I'm a bit on the fence about what we should do...

What we did with our data is to not set a recording_time_offset (no_entry or 0) and - when anonymizing - just set start_time to 0 and end_time to whatever the length of the data is. But because our iEEG datasets are not longitudinal (it's often just minutes to an hour of data) we almost never use date/time to lookup segments of data (in the way that they do in Greg's lab).

However, before, I might have missed that the exact purpose of recording_time_offset is anonymization while keeping the time intact. There is a case to be made to keep the time intact and still anonymizing by shifting it a couple of days with recording_time_offset. Right? Or did I misunderstood this? Then again, wouldn't just shifting the start_time and end_time values by a couple of days achieve the same thing?

In your example the start_time is 0 and the recording_time_offset is 1493202549000000, Which means the recording start_time should become 26 april 2017 12:29:09. Which makes me guess you're not shifting it a couple of days for anonymization, but instead are using it to set a specific start_time. Sorry, if I don't know enough about recording_time_offset usages. Maybe you can help me understand what the use-case is for using recording_time_offset like you did in your example?

cimbi commented 1 year ago

This was a mistake on my side from the beginning. Closing...