pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

0.9.1 test failures on big endian machines #2318

Closed juliantaylor closed 11 years ago

juliantaylor commented 11 years ago

see https://launchpadlibrarian.net/123637321/buildlog_ubuntu-raring-powerpc.pandas_0.9.1-1ubuntu1_FAILEDTOBUILD.txt.gz

https://buildd.debian.org/status/package.php?p=pandas&suite=experimental

======================================================================
ERROR: test_fperr_robustness (pandas.stats.tests.test_moments.TestMoments)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/build/buildd/pandas-0.9.1/debian/python-pandas/usr/lib/python2.7/dist-packages/pandas/stats/tests/test_moments.py", line 175, in test_fperr_robustness
    result = mom.rolling_sum(arr, 2)
  File "/build/buildd/pandas-0.9.1/debian/python-pandas/usr/lib/python2.7/dist-packages/pandas/stats/moments.py", line 458, in f
    freq=freq, time_rule=time_rule, **kwargs)
  File "/build/buildd/pandas-0.9.1/debian/python-pandas/usr/lib/python2.7/dist-packages/pandas/stats/moments.py", line 261, in _rolling_moment
    result = np.apply_along_axis(calc, axis, values)
  File "/usr/lib/python2.7/dist-packages/numpy/lib/shape_base.py", line 80, in apply_along_axis
    res = func1d(arr[tuple(i.tolist())],*args)
  File "/build/buildd/pandas-0.9.1/debian/python-pandas/usr/lib/python2.7/dist-packages/pandas/stats/moments.py", line 258, in <lambda>
    calc = lambda x: func(x, window, minp=minp, **kwargs)
  File "/build/buildd/pandas-0.9.1/debian/python-pandas/usr/lib/python2.7/dist-packages/pandas/stats/moments.py", line 456, in call_cython
    return func(arg, window, minp, **kwds)
  File "moments.pyx", line 156, in pandas.lib.roll_sum (pandas/src/tseries.c:77996)
ValueError: Little-endian buffer not supported on big-endian compiler

======================================================================
FAIL: test_from_M8_structured (pandas.tseries.tests.test_timeseries.TestLegacySupport)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/build/buildd/pandas-0.9.1/debian/python-pandas/usr/lib/python2.7/dist-packages/pandas/tseries/tests/test_timeseries.py", line 1917, in test_from_M8_structured
    self.assertEqual(df['Date'][0], dates[0][0])
AssertionError: <Timestamp: 1976-02-12 15:25:34.986240> != datetime.datetime(2012, 9, 9, 0, 0)
juliantaylor commented 11 years ago

The first should probably be skipped on foreign-endian machines, or the input data byte-swapped before calling frombuffer.
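A minimal sketch of the second option, assuming the input is a NumPy array whose dtype may carry a foreign byte order. The helper name `to_native` is hypothetical and not part of pandas:

```python
import numpy as np

def to_native(arr):
    """Return arr in the machine's native byte order (hypothetical helper)."""
    if arr.dtype.isnative:
        return arr
    # astype() converts the values, so the numbers are preserved while
    # the in-memory byte layout changes to the native one.
    return arr.astype(arr.dtype.newbyteorder("="))

# Explicitly little-endian data, read back the way a test might via frombuffer.
raw = np.array([1.0, 2.0, 3.0], dtype="<f8").tobytes()
le = np.frombuffer(raw, dtype="<f8")
native = to_native(le)
```

On a little-endian machine this returns the input unchanged; on a big-endian machine it makes a converted copy, which should then be acceptable to code that only supports native buffers.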

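The second can be reproduced on little endian with:

import pandas
import numpy as np
from datetime import datetime

d = datetime(2012, 9, 9, 0, 0)
# explicitly big-endian datetime64, i.e. foreign byte order on x86
arr = np.array([d], dtype='>M8[us]')
arr[0] == d        # compare the NumPy element with the original datetime
df = pandas.DataFrame(arr)
df[0][0] == d      # compare again after the DataFrame round trip
print(df[0][0])
print(arr[0])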
juliantaylor commented 11 years ago

The problem is the arr.view(np.int64) call in cast_to_nanoseconds, which does not take endianness into account:

In [15]: a = np.array([1347148800000000], dtype=">i8")

In [16]: a.view(np.int64)
Out[16]: array([192986734986240])
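To make the difference concrete, here is a small sketch contrasting `view` with `astype` on the same value. It uses the explicit `"<i8"` dtype instead of `np.int64` so that it behaves the same on any machine; that substitution is an assumption made for illustration, not what pandas does:

```python
import numpy as np

a = np.array([1347148800000000], dtype=">i8")

# view() reinterprets the same bytes under a new dtype; no byte swap happens.
reinterpreted = a.view("<i8")

# astype() converts values, so the number survives the byte-order change.
converted = a.astype("<i8")

print(reinterpreted[0])  # 192986734986240
print(converted[0])      # 1347148800000000
```

192986734986240 microseconds after the epoch is 1976-02-12 15:25:34.986240, which is exactly the Timestamp shown in the test_from_M8_structured failure above.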
ghost commented 11 years ago

The first test has a comment saying it should be removed when 2.5 is no longer supported (which has already happened).

The second test creates data specifically in little endian, regardless of the machine's native endianness. I think it might be the test rather than the code that does the wrong thing.

@juliantaylor, does the following raise an error on your test system?

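from datetime import datetime
import numpy as np
from pandas import DataFrame

# native byte order: no explicit '<' or '>' in the dtype
dates = [(datetime(2012, 9, 9, 0, 0),
          datetime(2012, 9, 8, 15, 10))]
arr = np.array(dates,
               dtype=[('Date', 'M8[us]'), ('Forecasting', 'M8[us]')])
df = DataFrame(arr)

assert df['Date'][0] == dates[0][0]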
juliantaylor commented 11 years ago

I agree it's more likely a problem with the test than with the code. But it might be useful to catch the wrong endianness at a high level, before you get wrong results.

Unfortunately I don't have direct access to a big endian machine, but your code is essentially what happens on x86, where the tests succeed.
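One way such a high-level check could look, as a rough sketch only. The name `require_native` is hypothetical, and where pandas would call it is left open:

```python
import numpy as np

def require_native(arr):
    """Raise ValueError if arr (or any of its fields) is not in native byte order."""
    dt = arr.dtype
    if dt.names is None:
        fields = [("<array>", dt)]
    else:
        # structured array: check each field's dtype separately
        fields = [(name, dt.fields[name][0]) for name in dt.names]
    for name, sub in fields:
        if not sub.isnative:
            raise ValueError(
                "non-native byte order in %s: %s" % (name, sub.str))
    return arr
```

Raising early trades convenience for safety: callers have to convert the data themselves, but a silently reinterpreted value never reaches the results.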

ghost commented 11 years ago

2359 needs testing.