manodeep / Corrfunc

⚡️⚡️⚡️Blazing fast correlation functions on the CPU.
https://corrfunc.readthedocs.io
MIT License
165 stars 50 forks source link

TypeError when Python C extension returns None #157

Closed lgarrison closed 6 years ago

lgarrison commented 6 years ago

Whenever the Python C extension throws an error (e.g. due to input validation), None gets returned to the Python wrapper. However, we then try to unpack that as a 2-tuple of (extn_results, api_time), which results in a TypeError and the error message 'NoneType' object is not iterable, which isn't helpful to the user.

This is less confusing on the command line when the error message from stderr is also immediately visible, but it's pretty confusing in a Jupyter notebook since the error message isn't captured and displayed there.

Here's an example that demonstrates the issue:

import Corrfunc.theory.DD
import numpy as np

Corrfunc.theory.DD(1, 1, np.array([0., 1.]), np.array([1.]),np.array([1.]),np.array([1.]), boxsize=1.)

In a Notebook only the following is displayed:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-b20e8cca19cf> in <module>()
      2 import numpy as np
      3 
----> 4 Corrfunc.theory.DD(1, 1, np.array([0., 1.]), np.array([1.]),np.array([1.]),np.array([1.]), boxsize=1.)

~/Corrfunc/Corrfunc/theory/DD.py in DD(autocorr, nthreads, binfile, X1, Y1, Z1, weights1, periodic, X2, Y2, Z2, weights2, verbose, boxsize, output_ravg, xbin_refine_factor, ybin_refine_factor, zbin_refine_factor, max_cells_per_dim, c_api_timer, isa, weight_type)
    236                                      max_cells_per_dim=max_cells_per_dim,
    237                                      c_api_timer=c_api_timer,
--> 238                                      isa=integer_isa, **kwargs)
    239     if extn_results is None:
    240         msg = "RuntimeError occurred"

TypeError: 'NoneType' object is not iterable

The simple fix is to just check if the returned result is None before unpacking. However, I propose we actually try to capture the stderr error message and display it to the user. One way to do this would be with the wurlitzer package. This section of DD.py would then become:

from wurlitzer import pipes

with pipes() as (out, err):
    extn_results = DD_extn(autocorr, nthreads, rbinfile,
                                     X1, Y1, Z1,
                                     periodic=periodic,
                                     verbose=verbose,
                                     boxsize=boxsize,
                                     output_ravg=output_ravg,
                                     xbin_refine_factor=xbin_refine_factor,
                                     ybin_refine_factor=ybin_refine_factor,
                                     zbin_refine_factor=zbin_refine_factor,
                                     max_cells_per_dim=max_cells_per_dim,
                                     c_api_timer=c_api_timer,
                                     isa=integer_isa, **kwargs)
if extn_results is None:
    msg = err.read()
    raise RuntimeError(msg)
else:
    extn_results, api_time = extn_results

Then you get the real error message, in a Notebook or command line:

RuntimeError: ../../utils/gridlink_impl_double.c> ERROR:  nlattice = 2 is so small that with periodic wrapping the same cells will be counted twice ....exiting
[...]

@manodeep, what do you think? wurlitzer is on pip for Python 2 and 3.

manodeep commented 6 years ago

While I am hesitant to add dependencies, looks like wurlitzer itself does not have any further dependencies.

Definitely a welcome improvement in terms of usability. Do you mind propagating this to all the wrappers?

lgarrison commented 6 years ago

Digging a little more, I think there's a better solution (but with some complications). We can use wurlitzer.sys_pipes() to get the live stderr stream in the notebook, which is good for progress bars and also works for error messages. So I think the RuntimeError would just go back to the generic "A runtime error occurred" message, which is probably okay.

But this doesn't work on the command line, because sys_pipes() can't redirect stderr to itself. But we can check if stderr is a TTY and redirect based on that. I'll open a PR to try it out.