Closed cristi-neagu closed 4 years ago
Also, as a side note, when irregular time channels exist in a file, resampling fails and creates an empty master channel.
Hi,
not all(diff(timevect) > 0) is worse than unevenly spaced time samples, and the error is not easy to understand: some samples go back in time. Another treatment is needed I guess (some kind of cut like you proposed, for instance), or there is an issue with mdfreader. It is worth looking at those faulty files in depth: do other tools confirm this error?
np.where(np.diff(masterChannel) < 0) would give the index where the discontinuity occurs.
There is an issue with using the cut function as it is right now. If we want to cut a master channel containing [0, 1, 2, 3, 4, 5, 6, 0, 0] at time 6, searchsorted returns an index of 8, which leaves the file unchanged. I think this is because it looks for the condition a(n) < x < a(n+1), which fails when the array isn't sorted. Some other method needs to be implemented for this particular case, maybe by optionally specifying an index instead of a time.
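A minimal sketch of the index-based cut suggested above, using only NumPy. The helper names first_discontinuity and cut_at_first_discontinuity are hypothetical and not part of mdfreader, and the sketch assumes the master channel and the data are plain arrays of equal length.

```python
import numpy as np


def first_discontinuity(master):
    """Return the index of the first sample that goes back in time, or None."""
    # np.diff(master)[i] is master[i + 1] - master[i], so a negative value at
    # position i means sample i + 1 is the first one out of order.
    backwards = np.where(np.diff(master) < 0)[0]
    return int(backwards[0]) + 1 if backwards.size else None


def cut_at_first_discontinuity(master, data):
    """Keep only the samples recorded before time first goes backwards."""
    stop = first_discontinuity(master)
    if stop is None:
        return master, data
    return master[:stop], data[:stop]


master = np.array([0, 1, 2, 3, 4, 5, 6, 0, 0], dtype=float)
data = np.arange(master.size) * 10.0

# searchsorted assumes a sorted array, so its answer here is not reliable.
time_based_index = int(np.searchsorted(master, 6))

clean_master, clean_data = cut_at_first_discontinuity(master, data)
```

Cutting by index avoids any search on the unsorted master channel, at the cost of discarding everything after the first backwards jump.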
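Maybe using masked arrays would be easiest:
data = data.view(np.ma.MaskedArray)
# np.diff returns one element fewer than the channel, so pad with a leading False
data.mask = np.concatenate(([False], np.diff(masterChannel) < 0))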
This approach is used in apply_invalid_bit, which applies to mdf4. Have you tried it?
Would that allow the file to be resampled?
Interp should not work on it, but it would be possible to use the compressed() method to clean up the data before interp, and to identify the mask using masked_where(np.diff(masterChannel) < 0).
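To illustrate the idea, here is a sketch that assumes plain NumPy arrays rather than mdfreader's internal structures. It masks the samples that fail to advance in time, drops them with compressed(), and only then interpolates. The variable names and the masking rule are assumptions for illustration: the rule used here masks every sample that does not exceed the latest time seen so far (so repeated timestamps are dropped too), because a mask built from np.diff(master) < 0 alone would flag only the first sample after each backwards jump.

```python
import numpy as np

master = np.array([0, 1, 2, 3, 4, 5, 6, 0, 0], dtype=float)
data = np.array([10, 11, 12, 13, 14, 15, 16, 99, 99], dtype=float)

# A sample is suspect when it does not advance past the latest time seen so far.
# The first sample has no predecessor, so it is always kept.
running_max = np.maximum.accumulate(master)
bad = np.concatenate(([False], master[1:] <= running_max[:-1]))

masked_master = np.ma.masked_where(bad, master)
masked_data = np.ma.masked_where(bad, data)

# compressed() returns only the unmasked values as a plain ndarray,
# which suits np.interp since it expects increasing x coordinates.
clean_master = masked_master.compressed()
clean_data = masked_data.compressed()

# Resample onto a regular 0.5 s grid covering the valid time range.
new_master = np.arange(0.0, 6.5, 0.5)
resampled = np.interp(new_master, clean_master, clean_data)
```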
Hi, I made a prototype in the dev branch, adding a new function _clean_uneven_master_data. You can try it. However, if it works, it will mainly target your specific case with zero time samples. A more generic mask should be considered.
I will try it, thank you. But it would be interesting to know if anyone has encountered any other failure modes.
I haven't got around to testing this yet, but it occurred to me that we've been going about this the wrong way. Instead of cutting data, the correct solution (for this case, at least) is to rebuild the time vector. The base assumption has to be that the time is constantly increasing. I'm not sure how true that is for everyone else, but for my data that is true in 99% of cases. Find the time step, fill the zeros.
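A rough sketch of "find the time step, fill the zeros", assuming a fixed sample period and a plain NumPy array for the master channel. The function name rebuild_master is hypothetical, and taking the median positive increment as the time step is one possible choice, not something mdfreader does.

```python
import numpy as np


def rebuild_master(master):
    """Replace samples that fail to advance in time by extrapolating a fixed step."""
    master = np.asarray(master, dtype=float)
    if master.size < 2:
        return master.copy()

    # Estimate the time step from the increments that do move forward.
    increments = np.diff(master)
    positive = increments[increments > 0]
    if positive.size == 0:
        raise ValueError("cannot estimate a time step: time never increases")
    step = np.median(positive)

    # A sample is bad when it does not exceed the latest time seen before it.
    running_max = np.maximum.accumulate(master)
    bad = np.concatenate(([False], master[1:] <= running_max[:-1]))

    # For every sample, find the index of the most recent good sample
    # (index 0 is always good, so this is well defined).
    idx = np.arange(master.size)
    last_good = np.maximum.accumulate(np.where(bad, 0, idx))

    # Good samples keep their value; bad ones continue from the last good one.
    return master[last_good] + step * (idx - last_good)


rebuilt = rebuild_master([0, 1, 2, 3, 4, 5, 6, 0, 0])
```

This only makes sense when the recorder really samples at a constant rate; with event-based or irregular masters the extrapolated times would be invented.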
That could work for you. However, it is not good to generalise: this behaviour could be specific to your recorder. Did you check whether your channel has invalid bits (it should)? Then you could use apply_invalid_bit() to transform the array into a masked array, and the rest should work transparently.
In the dev branch I introduced a new method, resample_group(), to be more compliant with mdf4 and its various possible master types; it did not make sense to blindly resample all data without considering its type. So I split the resampling of one group, which takes an argument new_master_data, out of the general resample(). You could use it to resample with your own fixed time signal.
Hello,
I sometimes get this error with certain files. This has nothing to do with MDFReader and it is caused by the recording software messing up the file somehow. Even so, it would be very useful if we had two utilities to deal with this:
A function to detect when this happens, that would return a list of tuples (or a list of dictionaries) with the name of the master channel and the index where the discontinuity occurs.
A utility for slicing a recording, taking the start and end of the slice as inputs, and either modifying the current file or, even better, returning a new object containing only the desired data.
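As a starting point for the first utility, here is a sketch of the detection function. It assumes the master channels are available as a plain dict mapping channel name to a NumPy array, which is a simplification and not mdfreader's actual object structure; the name find_discontinuities and the example channel names are hypothetical.

```python
import numpy as np


def find_discontinuities(masters):
    """Return (master_name, index) for every sample that goes back in time."""
    found = []
    for name, values in masters.items():
        # +1 because np.diff(values)[i] compares sample i + 1 with sample i,
        # so the offending sample is the one after the negative difference.
        for idx in np.flatnonzero(np.diff(values) < 0) + 1:
            found.append((name, int(idx)))
    return found


masters = {
    "time_10ms": np.array([0.0, 0.01, 0.02, 0.0, 0.0]),
    "time_1s": np.array([0.0, 1.0, 2.0]),
}
issues = find_discontinuities(masters)
```

An empty list means every master channel is non-decreasing; repeated timestamps are not reported by this check.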
I can probably have a go at the first function and make a pull request, but the second one requires a more in-depth knowledge of the object structure.
Do you think this is feasible?