thouis / numpy-trac-migration

numpy Trac to github issues migration
2 stars 3 forks source link

Segfault with python 3.2 structured array non-existent field (Trac #1770) #5568

Open numpy-gitbot opened 12 years ago

numpy-gitbot commented 12 years ago

Original ticket http://projects.scipy.org/numpy/ticket/1770 on 2011-03-13 by atmention:matthew-brett, assigned to unknown.

$ python3.2
Python 3.2 (r32:88452, Feb 20 2011, 11:12:31)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> a = np.zeros((1,), dtype=[('f1', 'f')])
>>> a['f1'] = 1
>>> a['f2'] = 1
Segmentation fault

All tests pass with np.test()

Expected behavior with same code on python2.6:

>>> a['f2'] = 1
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
ValueError: field named f2 not found.

Discussion on the mailing list with email of same title as this ticket.

Christoph Gohlke suggested the following patch:

diff --git a/numpy/core/src/multiarray/mapping.c b/numpy/core/src/multiarray/mapping.c
index 8db85bf..3a72811 100644
--- a/numpy/core/src/multiarray/mapping.c
+++ b/numpy/core/src/multiarray/mapping.c
atmention:atmention: -812,10 +812,16 atmention:atmention: array_ass_sub(PyArrayObject *self, PyObject *index, PyObject *op)
                }
            }
        }
-
+#if defined(NPY_PY3K)
+        PyErr_Format(PyExc_ValueError,

+                     "field named %S not found.",
+                     index);
+#else

        PyErr_Format(PyExc_ValueError,
                     "field named %s not found.",
                     PyString_AsString(index));
+#endif
+
        return -1;
    }
numpy-gitbot commented 12 years ago

atmention:charris wrote on 2011-03-13

There are two more occurrences of this construction in descriptor.c.

numpy-gitbot commented 12 years ago

atmention:pv wrote on 2011-03-15

The Python 2 should also be fixed: "index" can be also a Unicode object, so that using PyString_AsString is not correct.

numpy-gitbot commented 12 years ago

atmention:charris wrote on 2011-03-15

Fixed in c1bec1d, thanks.

numpy-gitbot commented 12 years ago

atmention:cgohlke wrote on 2011-03-16

How about doing this pedestrian? Works for me on win-amd64 with Python 2.6, 2.7, 3.1 and 3.2.

diff --git a/numpy/core/src/multiarray/mapping.c b/numpy/core/src/multiarray/mapping.c
index 8db85bf..20c3501 100644
--- a/numpy/core/src/multiarray/mapping.c
+++ b/numpy/core/src/multiarray/mapping.c
atmention:atmention: -812,10 +812,24 atmention:atmention: array_ass_sub(PyArrayObject *self, PyObject *index, PyObject *op)
                 }
             }
         }
-
-        PyErr_Format(PyExc_ValueError,
-                     "field named %s not found.",
-                     PyString_AsString(index));
+#if defined(NPY_PY3K)
+        PyErr_Format(PyExc_ValueError, "field named %S not found.", index);
+#else
+        if (PyUnicode_Check(index)) {
+            PyObject* temp = PyUnicode_AsUnicodeEscapeString(index);
+            if (temp == NULL) {
+                Py_DECREF(temp);
+                return -1;
+            }
+            PyErr_Format(PyExc_ValueError, "field named %s not found.",
+                         PyString_AsString(temp));
+            Py_DECREF(temp);
+        }
+        else {
+            PyErr_Format(PyExc_ValueError, "field named %s not found.",
+                         PyString_AsString(index));
+        }
+#endif
         return -1;
     }
numpy-gitbot commented 12 years ago

atmention:charris wrote on 2011-03-18

I think PyString_AsString already does the unicode decoding:

char* PyString_AsString(PyObject *string)¶ Return a NUL-terminated representation of the contents of string. The pointer refers to the internal buffer of string, not a copy. The data must not be modified in any way, unless the string was just created using PyString_FromStringAndSize(NULL, size). It must not be deallocated. If string is a Unicode object, this function computes the default encoding of string and operates on that. If string is not a string object at all, PyString_AsString() returns NULL and raises TypeError.

Does the return does need to be checked for NULL?

numpy-gitbot commented 12 years ago

atmention:rgommers wrote on 2011-03-29

Matthew wrote some tests for this issue that were committed, but it looks like there some things to be resolved here. Do I understand correctly that the segfault has been resolved and this is not a release blocker for 1.6.0 anymore?

numpy-gitbot commented 12 years ago

atmention:pv wrote on 2011-03-29

The Python 3 issue has been fixed, and as Chuck notes above, it doesn't crash on Python 2.

However, if I understand correctly, PyString_AsString leaks a reference if it encounters an unicode object, so this could be also fixed. But that's a pretty low-priority issue.

numpy-gitbot commented 12 years ago

atmention:pv wrote on 2011-03-29

Oops, sorry, it does crash on Python 2 --- but that's only when passing non-ASCII non-existing field names to getitem. Needs to be fixed, but probably not a blocker if we're in a hurry -- the bug has always been there.