maclandrol / FisherExact

Fisher exact test for mxn contingency table in python
MIT License
37 stars 12 forks source link

segfault is workspace is not big enough, should fail and show warning #8

Open orena1 opened 5 years ago

orena1 commented 5 years ago

This simple example fails with the defualt workspace

import FisherExact
fisher_exact = FisherExact.fisher_exact
fisher_exact([[1088, 126, 342, 516, 594, 578, 528, 378, 272, 160, 68, 40, 22, 4, 2], 
[12, 1, 5, 4, 5, 1, 2, 1, 0, 0, 0, 0, 0, 0, 0]])

But when I increase it to 13000 it works

import FisherExact
fisher_exact = FisherExact.fisher_exact
fisher_exact([[1088, 126, 342, 516, 594, 578, 528, 378, 272, 160, 68, 40, 22, 4, 2], 
[12, 1, 5, 4, 5, 1, 2, 1, 0, 0, 0, 0, 0, 0, 0]],
workspace=13000)

In R if I use a workspace which is not big enough it just fails and say that the wrokspace is not big enough, but it does not give a segfault.

Thanks!

maclandrol commented 5 years ago

Hi @orena1,

It's because the R implementation has a C backend (code converted from fortran), while mine use f2py for wrapping around the fortran source code.

Unfortunately, the stop statement in fortran kill the python process. This was already reported as an issue in #7 . I have tried the approach proposed here, https://github.com/pearu/f2py/wiki/FAQ2ed but it did not have the desired result.

I am now busy with other projects, but will try to look into it again during the next week (I will try this: https://github.com/pearu/f2py/issues/39 ). My long term goal for this is to actually ditch the fortran source code entirely for a C back-end.

Right now, your alternatives (if you are in a hurry) are:

1- use a larger workspace by default 2- Call the binary of the fisher exact as a system call, then check the return value and parse the program output if successful.

I personally don't like any of these options and will prefer that the code raises an Exception instead. So ping me back during the week to check if I made any progress on this.

orena1 commented 5 years ago

It would be great if when you change to C back-end, you'll add support to multithreading. Thanks!