msel-source / pymef

Python wrapper for MEF library
Other
5 stars 3 forks source link

Segmentation fault when initializing MefSession #13

Closed alexmcmaster closed 5 years ago

alexmcmaster commented 5 years ago

Problem Description

Context: I am upgrading from pymef v0.1.6 to v1.1.1 in a project that has been using v0.1.6 successfully for some time. Thank you for pymef/meflib!

The MefSession constructor crashes when passed the path to a MEF session with no password (at either level). I do not have any password-protected MEF sessions to test whether or not the lack of password is a causative factor, but the crash happens during password-related activity (explained below), so it seems like a reasonable guess.

I am calling the constructor as follows:

MefSession(path, None, True, False, False)

By instrumenting the code I have worked out the stack trace: MefSession()

  _check_password()  (mef_session.py:120)
    check_mef_password()  (mef_session.py:179)
      extract_terminal_password_bytes()  (pymef3_file.c:4201)
        UTF8_nextchar()  (meflib.c:2453)
  ...

It looks like None is being interpreted as NULL (I think this is the correct ctypes behavior), while down the chain meflib is expecting a non-null value and winds up trying to index against a null pointer. When I was using pymef v0.1.6 I used None successfully to indicate "no password". Is there a new way to indicate "no password" in v1.1.1? Apologies if I missed this in the documentation.

Passing an empty string instead of None as the password in MefSession() fixes this crash, however, the routine still crashes a little later on along with the warning:

process_password_data(), line 5034: unspecified password is not of valid form

..so this doesn't appear to be correct.

What steps reproduce the problem?

Explained above. Happy to go into more detail if desired.

Paste Traceback/Error Below (if applicable)

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

Pymef version and platform

pymef v1.1.1 (installed via Anaconda3) Ubuntu 18.04

cimbi commented 5 years ago

Thanks for the detailed description! I'll take a look at this. Password handling in pymef is not ideal right now and the solution is long overdue. In the meantime - could you use empty string "" instead? That might work.

alexmcmaster commented 5 years ago

Unfortunately no, I have tried the following, all of which result in the same behavior.

0 "" "\0" bytes(16)

cimbi commented 5 years ago

The empty string works on our unencrypted data without any issues (with the warning though). The error you are seeing seems to be a segmentation fault which might be a bug somewhere else or it could connected to your data. Anyway, I will try to resolve this issue ASAP and we will see if it helps with the segfaults or not.

cimbi commented 5 years ago

Could you please try with the new pymef 1.1.3? I introduced better password handling and fixed a bug that caused segfaults on windows (#11 ). It might be related.

alexmcmaster commented 5 years ago

Unfortunately no luck with 1.1.3, same behavior. I did notice something interesting though - I am able to read the session's metadata successfully by passing a None password to pymef3_file.read_mef_session_metadata() or pymef3_file.read_mef_channel_metadata(), which makes me wonder if password handling is a red herring.

It looks like these calls diverge from the call to the MefSession constructor at pymef3_file.c:4058, where PyArray_DescrConverter returns 1 (success, I think) when called under pymef3_file.read_mef_session_metadata() or pymef3_file.read_mef_channel_metadata(), but 0 (failure, I think) when called under the MefSession constructor.

cimbi commented 5 years ago

MefSession calls the pymef3_file functions. It should be no different. I am not able to replicate the issue unfortunately.

Could you perhaps send me an exaple session where the problem occurs? If you cannot I'll keep on trying to replicate the behavior.

Also what version of NumPy do you use?

alexmcmaster commented 5 years ago

Using numpy 1.17.0.

I shared a Google Drive link to the smallest session that I can reproduce this on (~75MB, too big for GH), let me know if that doesn't work. It's canine, FYI.

cimbi commented 5 years ago

sorry, I don't see the link anywhere. Could you just email the link? jan.cimbalnik@fnusa.cz

cimbi commented 5 years ago

OK, I think I know where the problem is. You are using the low level API (pymef3_file). There is nothing wrong with that but you have to be careful with it. The other thing is that I forgot to change the docstring of the read_ts_data function (fixed by a04aa2ffcaad9d46a0ee56fa8b8adc96bf315c7b).

Anyway, I tried this:

# ----- High level API -----
path='/path/to/session.mfed'
ms = MefSession(path, None, True, False, False)
ms = MefSession(path, '', True, False, False)
print(ms.read_ts_channel_basic_info())
data = ms.read_ts_channels_sample('e1-e2', [None, None])
print(data)
ms.close()

# ----- Low level API -----
from pymef.mef_file import pymef3_file
md = pymef3_file.read_mef_session_metadata(path, None)

# This works
data = pymef3_file.read_mef_ts_data(md['time_series_channels']['e1-e2']['channel_specific_metadata'], 0, 5000)

# This throws password error
data = pymef3_file.read_mef_ts_data(path, None, 0, 5000)

Please preferably use the high level API of mef session instead of the low level API. If this tackles your problem feel free to close the issue.

alexmcmaster commented 5 years ago

This didn't fix the issue, it's actually the high-level API that is segfaulting, while the low-level API is working for me. However, if you're unable to reproduce the issue on the same data then there is probably something more subtle at work in my code that requires further inspection.

I'll reopen if I can find a simple code snippet to reproduce this behavior.

alexmcmaster commented 5 years ago

Fixed by reverting to numpy 1.16.4.

cimbi commented 5 years ago

OK, thanks for letting me know. I tried with numpy 1.17.0 and 1.17.1 and the issue did not surface with either of them. Let me know if you ever find what was causing the problem