pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

pandas 0.23.4 fails unit tests #23638

Closed (bodgerer closed 5 years ago)

bodgerer commented 5 years ago

Problem description

Cannot seem to complete unit tests on a source install of pandas 0.23.4. I'm sure I'm doing something dumb, but cannot see what. Any ideas, please?

Code Sample, a copy-pastable example if possible

$ python
Python 3.6.0 (default, Feb 22 2017, 16:36:12) 
[GCC 6.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> pandas.test()
running: pytest --skip-slow --skip-network /somewhere/py_new/build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas
============================= test session starts ==============================
platform linux -- Python 3.6.0, pytest-3.9.3, py-1.7.0, pluggy-0.8.0
rootdir: /somewhere/py_new, inifile:
plugins: cov-2.6.0
collected 26659 items / 3 errors / 2 skipped

==================================== ERRORS ====================================
 ERROR collecting build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/indexes/datetimes/test_misc.py 
build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/indexes/datetimes/test_misc.py:90: in <module>
    class TestDatetime64(object):
build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/indexes/datetimes/test_misc.py:246: in TestDatetime64
    None] if tm.get_locales() is None else [None] + tm.get_locales())
build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/util/testing.py:456: in get_locales
    x, encoding=pd.options.display.encoding))
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte
 ERROR collecting build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/scalar/timestamp/test_timestamp.py 
build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/scalar/timestamp/test_timestamp.py:28: in <module>
    class TestTimestampProperties(object):
build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/scalar/timestamp/test_timestamp.py:104: in TestTimestampProperties
    None] if tm.get_locales() is None else [None] + tm.get_locales())
build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/util/testing.py:456: in get_locales
    x, encoding=pd.options.display.encoding))
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte
 ERROR collecting build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/series/test_datetime_values.py 
build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/series/test_datetime_values.py:25: in <module>
    class TestSeriesDatetimeValues(TestData):
build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/series/test_datetime_values.py:280: in TestSeriesDatetimeValues
    None] if tm.get_locales() is None else [None] + tm.get_locales())
build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/util/testing.py:456: in get_locales
    x, encoding=pd.options.display.encoding))
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte
=============================== warnings summary ===============================
build/lib/python3.6/site-packages/pytest-3.9.3-py3.6.egg/_pytest/assertion/rewrite.py:294
  /somewhere/py_new/build/lib/python3.6/site-packages/pytest-3.9.3-py3.6.egg/_pytest/assertion/rewrite.py:294: RemovedInPytest4Warning: Fixture "df_letters" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
    six.exec_(co, mod.__dict__)
  /somewhere/py_new/build/lib/python3.6/site-packages/pytest-3.9.3-py3.6.egg/_pytest/assertion/rewrite.py:294: RemovedInPytest4Warning: Fixture "df_letters" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
    six.exec_(co, mod.__dict__)
  /somewhere/py_new/build/lib/python3.6/site-packages/pytest-3.9.3-py3.6.egg/_pytest/assertion/rewrite.py:294: RemovedInPytest4Warning: Fixture "epoch_1960" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
    six.exec_(co, mod.__dict__)
  /somewhere/py_new/build/lib/python3.6/site-packages/pytest-3.9.3-py3.6.egg/_pytest/assertion/rewrite.py:294: RemovedInPytest4Warning: Fixture "epoch_1960" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
    six.exec_(co, mod.__dict__)
  /somewhere/py_new/build/lib/python3.6/site-packages/pytest-3.9.3-py3.6.egg/_pytest/assertion/rewrite.py:294: RemovedInPytest4Warning: Fixture "epoch_1960" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
    six.exec_(co, mod.__dict__)
  /somewhere/py_new/build/lib/python3.6/site-packages/pytest-3.9.3-py3.6.egg/_pytest/assertion/rewrite.py:294: RemovedInPytest4Warning: Fixture "epoch_1960" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
    six.exec_(co, mod.__dict__)

build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/series/test_analytics.py:1882
  /somewhere/py_new/build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/series/test_analytics.py:1882: RemovedInPytest4Warning: Fixture "s_main_dtypes" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
    class TestNLargestNSmallest(object):

-- Docs: https://docs.pytest.org/en/latest/warnings.html
!!!!!!!!!!!!!!!!!!! Interrupted: 3 errors during collection !!!!!!!!!!!!!!!!!!!!
=============== 2 skipped, 7 warnings, 3 error in 49.24 seconds ================

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-693.11.6.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.utf8
LOCALE: en_GB.UTF-8

pandas: 0.23.4
pytest: 3.9.3
pip: 18.1
setuptools: 40.5.0
Cython: 0.29
numpy: 1.15.3
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 7.1.1
sphinx: 1.8.1
patsy: None
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.8
feather: None
matplotlib: 3.0.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

WillAyd commented 5 years ago

Maybe a duplicate of #23088 - is this in a clean environment?

gfyoung commented 5 years ago

And to add to that, can you clarify how you did your install of pandas (i.e. tools / commands used)?

bodgerer commented 5 years ago

Hi both,

Many thanks for taking a look and apologies for not spotting #23088, although the error messages are somewhat different.

It is a clean environment and it's an automated installation, so I can provide all the commands/steps used. However, it's a bit long.

The short method below gives a similar result. I'm on a CentOS 7 box. Am I doing it wrong?

mkdir build src
prefix=`pwd`/build
cd src

# Add a copy of python3 and the compiler used to build it to the environment
module purge
module load python/3.6.0 gnu/6.3.0

export PATH=${prefix}/bin:${PATH}
export CPATH=${prefix}/include:${CPATH}
export LIBRARY_PATH=${prefix}/lib:${LIBRARY_PATH}
export LD_LIBRARY_PATH=${prefix}/lib:${LD_LIBRARY_PATH}
export PYTHONPATH=${prefix}/lib/python3.6/site-packages

pip install --prefix=$prefix pytest numpy scipy

tar xvf pandas-0.23.4.tar.gz
cd pandas-0.23.4
python setup.py build
python setup.py install --prefix="${prefix}"
cd ..

After doing this, attempting to run tests results in:

$ python -c 'import pandas; pandas.test()'

running: pytest --skip-slow --skip-network /somewhere/py_pip2/build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas
============================= test session starts ==============================
platform linux -- Python 3.6.0, pytest-3.10.1, py-1.7.0, pluggy-0.8.0
rootdir: /somewhere/py_pip2, inifile:
collected 26393 items / 3 errors / 5 skipped

==================================== ERRORS ====================================
 ERROR collecting build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/indexes/datetimes/test_misc.py 
../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/indexes/datetimes/test_misc.py:90: in <module>
    class TestDatetime64(object):
../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/indexes/datetimes/test_misc.py:246: in TestDatetime64
    None] if tm.get_locales() is None else [None] + tm.get_locales())
../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/util/testing.py:456: in get_locales
    x, encoding=pd.options.display.encoding))
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte
 ERROR collecting build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/scalar/timestamp/test_timestamp.py 
../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/scalar/timestamp/test_timestamp.py:28: in <module>
    class TestTimestampProperties(object):
../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/scalar/timestamp/test_timestamp.py:104: in TestTimestampProperties
    None] if tm.get_locales() is None else [None] + tm.get_locales())
../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/util/testing.py:456: in get_locales
    x, encoding=pd.options.display.encoding))
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte
 ERROR collecting build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/series/test_datetime_values.py 
../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/series/test_datetime_values.py:25: in <module>
    class TestSeriesDatetimeValues(TestData):
../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/series/test_datetime_values.py:280: in TestSeriesDatetimeValues
    None] if tm.get_locales() is None else [None] + tm.get_locales())
../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/util/testing.py:456: in get_locales
    x, encoding=pd.options.display.encoding))
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte
=============================== warnings summary ===============================
/somewhere/py_pip2/build/lib/python3.6/site-packages/_pytest/assertion/rewrite.py:294
  /somewhere/py_pip2/build/lib/python3.6/site-packages/_pytest/assertion/rewrite.py:294: RemovedInPytest4Warning: Fixture "df_letters" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
    six.exec_(co, mod.__dict__)
  /somewhere/py_pip2/build/lib/python3.6/site-packages/_pytest/assertion/rewrite.py:294: RemovedInPytest4Warning: Fixture "df_letters" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
    six.exec_(co, mod.__dict__)
  /somewhere/py_pip2/build/lib/python3.6/site-packages/_pytest/assertion/rewrite.py:294: RemovedInPytest4Warning: Fixture "epoch_1960" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
    six.exec_(co, mod.__dict__)
  /somewhere/py_pip2/build/lib/python3.6/site-packages/_pytest/assertion/rewrite.py:294: RemovedInPytest4Warning: Fixture "epoch_1960" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
    six.exec_(co, mod.__dict__)
  /somewhere/py_pip2/build/lib/python3.6/site-packages/_pytest/assertion/rewrite.py:294: RemovedInPytest4Warning: Fixture "epoch_1960" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
    six.exec_(co, mod.__dict__)
  /somewhere/py_pip2/build/lib/python3.6/site-packages/_pytest/assertion/rewrite.py:294: RemovedInPytest4Warning: Fixture "epoch_1960" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
    six.exec_(co, mod.__dict__)

/somewhere/py_pip2/build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/indexes/timedeltas/test_ops.py:337
  /somewhere/py_pip2/build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/indexes/timedeltas/test_ops.py:337: DeprecationWarning: invalid escape sequence \*
    msg = '<2 \* BusinessDays> is a non-fixed frequency'

/somewhere/py_pip2/build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/internals/test_internals.py:1292
  /somewhere/py_pip2/build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/internals/test_internals.py:1292: DeprecationWarning: invalid escape sequence \[
    msg = "Wrong number of dimensions. values.ndim != ndim \[1 != 2\]"

/somewhere/py_pip2/build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/series/test_analytics.py:1882
  /somewhere/py_pip2/build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/series/test_analytics.py:1882: RemovedInPytest4Warning: Fixture "s_main_dtypes" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
    class TestNLargestNSmallest(object):

-- Docs: https://docs.pytest.org/en/latest/warnings.html
!!!!!!!!!!!!!!!!!!! Interrupted: 3 errors during collection !!!!!!!!!!!!!!!!!!!!
=============== 5 skipped, 9 warnings, 3 error in 41.29 seconds ================

$ python -c 'import pandas; pandas.show_versions()'

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-693.11.6.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.utf8
LOCALE: en_GB.UTF-8

pandas: 0.23.4
pytest: 3.10.1
pip: 9.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.15.4
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

TomAugspurger commented 5 years ago

Can you try building master and seeing what fails? The RemovedInPytest4Warning warnings in your output (for example, Fixture "df_letters" called directly) have already been fixed.

Can you run pandas.util.testing.get_locales() and post the output?

bodgerer commented 5 years ago

For the above build, pandas.util.testing.get_locales() raises:

$ python -c 'import pandas; pandas.util.testing.get_locales()'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/somewhere/py_pip2/build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/util/testing.py", line 456, in get_locales
    x, encoding=pd.options.display.encoding))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte

Interestingly, I get a similar error from a fresh python3.7 miniconda environment after doing a conda install pandas.
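
Because the traceback ends inside get_locales(), it does not say which locale name is at fault. Below is a minimal sketch for narrowing that down, assuming the undecodable bytes come from the raw output of `locale -a`. The helper names `undecodable_lines` and `raw_locale_output` are hypothetical and not part of pandas, and the sample bytes are made up for illustration.

```python
import subprocess


def undecodable_lines(raw, encoding="utf-8"):
    """Return (line_number, raw_line, error) for each line that fails to decode."""
    bad = []
    for number, line in enumerate(raw.splitlines(), start=1):
        try:
            line.decode(encoding)
        except UnicodeDecodeError as exc:
            bad.append((number, line, exc))
    return bad


def raw_locale_output():
    """Raw bytes printed by `locale -a` (assumes a POSIX system with that command)."""
    return subprocess.check_output(["locale", "-a"])


# Hypothetical sample standing in for real `locale -a` output.
sample = b"bokmal\nbokm\xe5l\nC\nen_GB.utf8\n"

for number, line, exc in undecodable_lines(sample):
    # exc.start is the index of the first byte the codec rejected.
    print(number, line, "byte 0x%02x at position %d" % (line[exc.start], exc.start))
```

On the affected machine, `undecodable_lines(raw_locale_output())` would list the entries worth looking at.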

I've built and run master in a fresh directory using the same method as above, replacing the pandas build with:

pip install --prefix=$prefix cython hypothesis
git clone https://github.com/pandas-dev/pandas.git
cd pandas
python setup.py build
python setup.py install --prefix="${prefix}"
cd ..

I get:

$ python -c 'import pandas; pandas.util.testing.get_locales()'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/somewhere/py_pandas_master/build/lib/python3.6/site-packages/pandas-0.24.0.dev0+992.g20bdb3e-py3.6-linux-x86_64.egg/pandas/util/testing.py", line 467, in get_locales
    x, encoding=pd.options.display.encoding))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte

And:

$ python -c 'import pandas; pandas.test()'
running: pytest --skip-slow --skip-network /somewhere/py_pandas_master/build/lib/python3.6/site-packages/pandas-0.24.0.dev0+992.g20bdb3e-py3.6-linux-x86_64.egg/pandas
============================= test session starts ==============================
platform linux -- Python 3.6.0, pytest-3.10.1, py-1.7.0, pluggy-0.8.0
hypothesis profile 'ci' -> timeout=-1, deadline=500, suppress_health_check=[HealthCheck.too_slow], database=DirectoryBasedExampleDatabase('/somewhere/py_pandas_master/src/.hypothesis/examples')
rootdir: /somewhere/py_pandas_master, inifile:
plugins: hypothesis-3.82.1
collected 38202 items / 3 errors / 6 skipped

==================================== ERRORS ====================================
 ERROR collecting build/lib/python3.6/site-packages/pandas-0.24.0.dev0+992.g20bdb3e-py3.6-linux-x86_64.egg/pandas/tests/indexes/datetimes/test_misc.py 
../build/lib/python3.6/site-packages/pandas-0.24.0.dev0+992.g20bdb3e-py3.6-linux-x86_64.egg/pandas/tests/indexes/datetimes/test_misc.py:91: in <module>
    class TestDatetime64(object):
../build/lib/python3.6/site-packages/pandas-0.24.0.dev0+992.g20bdb3e-py3.6-linux-x86_64.egg/pandas/tests/indexes/datetimes/test_misc.py:247: in TestDatetime64
    None] if tm.get_locales() is None else [None] + tm.get_locales())
../build/lib/python3.6/site-packages/pandas-0.24.0.dev0+992.g20bdb3e-py3.6-linux-x86_64.egg/pandas/util/testing.py:467: in get_locales
    x, encoding=pd.options.display.encoding))
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte
 ERROR collecting build/lib/python3.6/site-packages/pandas-0.24.0.dev0+992.g20bdb3e-py3.6-linux-x86_64.egg/pandas/tests/scalar/timestamp/test_timestamp.py 
../build/lib/python3.6/site-packages/pandas-0.24.0.dev0+992.g20bdb3e-py3.6-linux-x86_64.egg/pandas/tests/scalar/timestamp/test_timestamp.py:29: in <module>
    class TestTimestampProperties(object):
../build/lib/python3.6/site-packages/pandas-0.24.0.dev0+992.g20bdb3e-py3.6-linux-x86_64.egg/pandas/tests/scalar/timestamp/test_timestamp.py:105: in TestTimestampProperties
    None] if tm.get_locales() is None else [None] + tm.get_locales())
../build/lib/python3.6/site-packages/pandas-0.24.0.dev0+992.g20bdb3e-py3.6-linux-x86_64.egg/pandas/util/testing.py:467: in get_locales
    x, encoding=pd.options.display.encoding))
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte
 ERROR collecting build/lib/python3.6/site-packages/pandas-0.24.0.dev0+992.g20bdb3e-py3.6-linux-x86_64.egg/pandas/tests/series/test_datetime_values.py 
../build/lib/python3.6/site-packages/pandas-0.24.0.dev0+992.g20bdb3e-py3.6-linux-x86_64.egg/pandas/tests/series/test_datetime_values.py:27: in <module>
    class TestSeriesDatetimeValues():
../build/lib/python3.6/site-packages/pandas-0.24.0.dev0+992.g20bdb3e-py3.6-linux-x86_64.egg/pandas/tests/series/test_datetime_values.py:322: in TestSeriesDatetimeValues
    None] if tm.get_locales() is None else [None] + tm.get_locales())
../build/lib/python3.6/site-packages/pandas-0.24.0.dev0+992.g20bdb3e-py3.6-linux-x86_64.egg/pandas/util/testing.py:467: in get_locales
    x, encoding=pd.options.display.encoding))
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte
=============================== warnings summary ===============================
/somewhere/py_pandas_master/build/lib/python3.6/site-packages/_pytest/assertion/rewrite.py:272
  /somewhere/py_pandas_master/build/lib/python3.6/site-packages/_pytest/assertion/rewrite.py:272: PytestWarning: Module already imported so cannot be rewritten: hypothesis
    self.config,

-- Docs: https://docs.pytest.org/en/latest/warnings.html
!!!!!!!!!!!!!!!!!!! Interrupted: 3 errors during collection !!!!!!!!!!!!!!!!!!!!
=============== 6 skipped, 1 warnings, 3 error in 30.89 seconds ================

TomAugspurger commented 5 years ago

That's the full traceback for the unicode error? I was hoping for more.

What's the output of locale -a?

TomAugspurger commented 5 years ago

Should I read anything into the fact that py_pip2/ is in your path? Is that an indication that a pip attached to python 2 was used to install pandas?

The recommended way to invoke pip is python -m pip install pandas hypothesis.
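
The point of python -m pip is that pip runs under the same interpreter that will later import the packages. Below is a small sketch of building such a command from Python. `pip_install_command` is a hypothetical helper, and the `--prefix` option is the one already used in the build steps earlier in this thread.

```python
import sys


def pip_install_command(packages, prefix=None):
    """Build a pip command bound to the interpreter running this script."""
    command = [sys.executable, "-m", "pip", "install"]
    if prefix is not None:
        command.append("--prefix=" + prefix)
    command.extend(packages)
    return command


# Show the command without running it.
print(pip_install_command(["pandas", "hypothesis"]))
```

Passing the returned list to `subprocess.check_call` would perform the install.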

bodgerer commented 5 years ago

I'm afraid that's the full output. locale -a output below.

Don't read anything into py_pip2, it's just my terrible naming of a scratch directory. I've repeated the test by doing a fresh install and using python -m pip install --prefix=$prefix pytest numpy scipy cython hypothesis with the same result.

Setting export LANG=en_GB instead of my machine/account's default of en_GB.utf8 makes the above errors go away and the test suite is able to start running. My build against master reports:
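
One plausible explanation for why the LANG change helps, not confirmed in this thread: if the encoding pandas uses to decode the locale list follows the process locale, then en_GB.utf8 selects UTF-8, which rejects a lone 0xe5 byte, while plain en_GB typically selects a single-byte ISO 8859 encoding in which every byte is valid. The sketch below uses made-up bytes and a hypothetical helper `try_decode` to show the difference.

```python
# Hypothetical bytes for a locale alias stored in Latin-1 rather than UTF-8.
raw = b"bokm\xe5l"


def try_decode(data, encoding):
    """Decode data, or describe where decoding stopped."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "fails at position %d on byte 0x%02x" % (exc.start, data[exc.start])


for encoding in ("utf-8", "iso8859-1"):
    print(encoding, "->", try_decode(raw, encoding))
```

The position and byte reported for UTF-8 match the ones in the UnicodeDecodeError above, which is why a Latin-1 encoded alias is a reasonable suspect.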

35727 passed, 5143 skipped, 318 xfailed, 6 xpassed, 32 warnings in 678.87 seconds

Whereas my build against 0.23.4 reports:

4 failed, 25058 passed, 4183 skipped, 78 xfailed, 26 xpassed, 163 warnings in 424.04 seconds

(failures are something to do with the months of the year in different languages, don't know if this is interesting?)
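
The failure output further down in this comment shows the two sides differing only in the first letter of two month names: 'I\u0307yun' versus '\u0130yun'. Plain Python string methods can produce exactly that pair, because lowercasing U+0130 (capital I with dot above) yields an i followed by a combining dot, and capitalizing afterwards does not restore the original character. Whether pandas 0.23.4 actually lowercases the locale's month names before capitalizing them is an assumption here, not something the thread confirms.

```python
# Sixth month name from the crh_UA failure output, written with an escape
# so the code point is unambiguous.
month = "\u0130yun"

# Capitalizing directly keeps the single code point U+0130.
direct = month.capitalize()

# Lowercasing first splits U+0130 into 'i' + U+0307; capitalizing then
# uppercases only the 'i', leaving the combining dot behind.
round_trip = month.lower().capitalize()

print(ascii(direct))
print(ascii(round_trip))
print(direct == round_trip)
```

The two results render almost identically on screen but compare unequal, which fits an assertion failing on 2 of 12 month names.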

$ locale -a

```
aa_DJ aa_DJ.iso88591 aa_DJ.utf8 aa_ER aa_ER@saaho aa_ER.utf8 aa_ER.utf8@saaho aa_ET aa_ET.utf8 af_ZA af_ZA.iso88591 af_ZA.utf8 am_ET am_ET.utf8 an_ES an_ES.iso885915 an_ES.utf8 ar_AE ar_AE.iso88596 ar_AE.utf8 ar_BH ar_BH.iso88596 ar_BH.utf8 ar_DZ ar_DZ.iso88596 ar_DZ.utf8 ar_EG ar_EG.iso88596 ar_EG.utf8 ar_IN ar_IN.utf8 ar_IQ ar_IQ.iso88596 ar_IQ.utf8 ar_JO ar_JO.iso88596 ar_JO.utf8 ar_KW ar_KW.iso88596 ar_KW.utf8 ar_LB ar_LB.iso88596 ar_LB.utf8 ar_LY ar_LY.iso88596 ar_LY.utf8 ar_MA ar_MA.iso88596 ar_MA.utf8 ar_OM ar_OM.iso88596 ar_OM.utf8 ar_QA ar_QA.iso88596 ar_QA.utf8 ar_SA ar_SA.iso88596 ar_SA.utf8 ar_SD ar_SD.iso88596 ar_SD.utf8 ar_SY ar_SY.iso88596 ar_SY.utf8 ar_TN ar_TN.iso88596 ar_TN.utf8 ar_YE ar_YE.iso88596 ar_YE.utf8 as_IN as_IN.utf8 ast_ES ast_ES.iso885915 ast_ES.utf8 ayc_PE ayc_PE.utf8 az_AZ az_AZ.utf8 be_BY be_BY.cp1251 be_BY@latin be_BY.utf8 be_BY.utf8@latin bem_ZM bem_ZM.utf8 ber_DZ ber_DZ.utf8 ber_MA ber_MA.utf8 bg_BG bg_BG.cp1251 bg_BG.utf8 bho_IN bho_IN.utf8 bn_BD bn_BD.utf8 bn_IN bn_IN.utf8 bo_CN bo_CN.utf8 bo_IN bo_IN.utf8 bokmal bokmål br_FR br_FR@euro br_FR.iso88591 br_FR.iso885915@euro br_FR.utf8 brx_IN brx_IN.utf8 bs_BA bs_BA.iso88592 bs_BA.utf8 byn_ER byn_ER.utf8 C ca_AD ca_AD.iso885915 ca_AD.utf8 ca_ES ca_ES@euro ca_ES.iso88591 ca_ES.iso885915@euro ca_ES.utf8 ca_FR ca_FR.iso885915 ca_FR.utf8 ca_IT ca_IT.iso885915 ca_IT.utf8 catalan crh_UA crh_UA.utf8 croatian csb_PL csb_PL.utf8 cs_CZ cs_CZ.iso88592 cs_CZ.utf8 cv_RU cv_RU.utf8 cy_GB cy_GB.iso885914 cy_GB.utf8 czech da_DK da_DK.iso88591 da_DK.iso885915 da_DK.utf8 danish dansk de_AT de_AT@euro de_AT.iso88591 de_AT.iso885915@euro de_AT.utf8 de_BE de_BE@euro de_BE.iso88591 de_BE.iso885915@euro de_BE.utf8 de_CH de_CH.iso88591 de_CH.utf8 de_DE de_DE@euro de_DE.iso88591 de_DE.iso885915@euro de_DE.utf8 de_LU de_LU@euro de_LU.iso88591 de_LU.iso885915@euro de_LU.utf8 deutsch doi_IN doi_IN.utf8 dutch dv_MV dv_MV.utf8 dz_BT dz_BT.utf8 eesti el_CY el_CY.iso88597 el_CY.utf8 el_GR el_GR.iso88597
el_GR.utf8 en_AG en_AG.utf8 en_AU en_AU.iso88591 en_AU.utf8 en_BW en_BW.iso88591 en_BW.utf8 en_CA en_CA.iso88591 en_CA.utf8 en_DK en_DK.iso88591 en_DK.utf8 en_GB en_GB.iso88591 en_GB.iso885915 en_GB.utf8 en_HK en_HK.iso88591 en_HK.utf8 en_IE en_IE@euro en_IE.iso88591 en_IE.iso885915@euro en_IE.utf8 en_IN en_IN.utf8 en_NG en_NG.utf8 en_NZ en_NZ.iso88591 en_NZ.utf8 en_PH en_PH.iso88591 en_PH.utf8 en_SG en_SG.iso88591 en_SG.utf8 en_US en_US.iso88591 en_US.iso885915 en_US.utf8 en_ZA en_ZA.iso88591 en_ZA.utf8 en_ZM en_ZM.utf8 en_ZW en_ZW.iso88591 en_ZW.utf8 es_AR es_AR.iso88591 es_AR.utf8 es_BO es_BO.iso88591 es_BO.utf8 es_CL es_CL.iso88591 es_CL.utf8 es_CO es_CO.iso88591 es_CO.utf8 es_CR es_CR.iso88591 es_CR.utf8 es_CU es_CU.utf8 es_DO es_DO.iso88591 es_DO.utf8 es_EC es_EC.iso88591 es_EC.utf8 es_ES es_ES@euro es_ES.iso88591 es_ES.iso885915@euro es_ES.utf8 es_GT es_GT.iso88591 es_GT.utf8 es_HN es_HN.iso88591 es_HN.utf8 es_MX es_MX.iso88591 es_MX.utf8 es_NI es_NI.iso88591 es_NI.utf8 es_PA es_PA.iso88591 es_PA.utf8 es_PE es_PE.iso88591 es_PE.utf8 es_PR es_PR.iso88591 es_PR.utf8 es_PY es_PY.iso88591 es_PY.utf8 es_SV es_SV.iso88591 es_SV.utf8 estonian es_US es_US.iso88591 es_US.utf8 es_UY es_UY.iso88591 es_UY.utf8 es_VE es_VE.iso88591 es_VE.utf8 et_EE et_EE.iso88591 et_EE.iso885915 et_EE.utf8 eu_ES eu_ES@euro eu_ES.iso88591 eu_ES.iso885915@euro eu_ES.utf8 fa_IR fa_IR.utf8 ff_SN ff_SN.utf8 fi_FI fi_FI@euro fi_FI.iso88591 fi_FI.iso885915@euro fi_FI.utf8 fil_PH fil_PH.utf8 finnish fo_FO fo_FO.iso88591 fo_FO.utf8 français fr_BE fr_BE@euro fr_BE.iso88591 fr_BE.iso885915@euro fr_BE.utf8 fr_CA fr_CA.iso88591 fr_CA.utf8 fr_CH fr_CH.iso88591 fr_CH.utf8 french fr_FR fr_FR@euro fr_FR.iso88591 fr_FR.iso885915@euro fr_FR.utf8 fr_LU fr_LU@euro fr_LU.iso88591 fr_LU.iso885915@euro fr_LU.utf8 fur_IT fur_IT.utf8 fy_DE fy_DE.utf8 fy_NL fy_NL.utf8 ga_IE ga_IE@euro ga_IE.iso88591 ga_IE.iso885915@euro ga_IE.utf8 galego galician gd_GB gd_GB.iso885915 gd_GB.utf8 german gez_ER gez_ER@abegede
gez_ER.utf8 gez_ER.utf8@abegede gez_ET gez_ET@abegede gez_ET.utf8 gez_ET.utf8@abegede gl_ES gl_ES@euro gl_ES.iso88591 gl_ES.iso885915@euro gl_ES.utf8 greek gu_IN gu_IN.utf8 gv_GB gv_GB.iso88591 gv_GB.utf8 ha_NG ha_NG.utf8 hebrew he_IL he_IL.iso88598 he_IL.utf8 hi_IN hi_IN.utf8 hne_IN hne_IN.utf8 hr_HR hr_HR.iso88592 hr_HR.utf8 hrvatski hsb_DE hsb_DE.iso88592 hsb_DE.utf8 ht_HT ht_HT.utf8 hu_HU hu_HU.iso88592 hu_HU.utf8 hungarian hy_AM hy_AM.armscii8 hy_AM.utf8 ia_FR ia_FR.utf8 icelandic id_ID id_ID.iso88591 id_ID.utf8 ig_NG ig_NG.utf8 ik_CA ik_CA.utf8 is_IS is_IS.iso88591 is_IS.utf8 italian it_CH it_CH.iso88591 it_CH.utf8 it_IT it_IT@euro it_IT.iso88591 it_IT.iso885915@euro it_IT.utf8 iu_CA iu_CA.utf8 iw_IL iw_IL.iso88598 iw_IL.utf8 ja_JP ja_JP.eucjp ja_JP.ujis ja_JP.utf8 japanese japanese.euc ka_GE ka_GE.georgianps ka_GE.utf8 kk_KZ kk_KZ.pt154 kk_KZ.utf8 kl_GL kl_GL.iso88591 kl_GL.utf8 km_KH km_KH.utf8 kn_IN kn_IN.utf8 kok_IN kok_IN.utf8 ko_KR ko_KR.euckr ko_KR.utf8 korean korean.euc ks_IN ks_IN@devanagari ks_IN.utf8 ks_IN.utf8@devanagari ku_TR ku_TR.iso88599 ku_TR.utf8 kw_GB kw_GB.iso88591 kw_GB.utf8 ky_KG ky_KG.utf8 lb_LU lb_LU.utf8 lg_UG lg_UG.iso885910 lg_UG.utf8 li_BE li_BE.utf8 lij_IT lij_IT.utf8 li_NL li_NL.utf8 lithuanian lo_LA lo_LA.utf8 lt_LT lt_LT.iso885913 lt_LT.utf8 lv_LV lv_LV.iso885913 lv_LV.utf8 mag_IN mag_IN.utf8 mai_IN mai_IN.utf8 mg_MG mg_MG.iso885915 mg_MG.utf8 mhr_RU mhr_RU.utf8 mi_NZ mi_NZ.iso885913 mi_NZ.utf8 mk_MK mk_MK.iso88595 mk_MK.utf8 ml_IN ml_IN.utf8 mni_IN mni_IN.utf8 mn_MN mn_MN.utf8 mr_IN mr_IN.utf8 ms_MY ms_MY.iso88591 ms_MY.utf8 mt_MT mt_MT.iso88593 mt_MT.utf8 my_MM my_MM.utf8 nan_TW@latin nan_TW.utf8@latin nb_NO nb_NO.iso88591 nb_NO.utf8 nds_DE nds_DE.utf8 nds_NL nds_NL.utf8 ne_NP ne_NP.utf8 nhn_MX nhn_MX.utf8 niu_NU niu_NU.utf8 niu_NZ niu_NZ.utf8 nl_AW nl_AW.utf8 nl_BE nl_BE@euro nl_BE.iso88591 nl_BE.iso885915@euro nl_BE.utf8 nl_NL nl_NL@euro nl_NL.iso88591 nl_NL.iso885915@euro nl_NL.utf8 nn_NO nn_NO.iso88591 nn_NO.utf8 no_NO
no_NO.ISO-8859-1 norwegian nr_ZA nr_ZA.utf8 nso_ZA nso_ZA.utf8 nynorsk oc_FR oc_FR.iso88591 oc_FR.utf8 om_ET om_ET.utf8 om_KE om_KE.iso88591 om_KE.utf8 or_IN or_IN.utf8 os_RU os_RU.utf8 pa_IN pa_IN.utf8 pap_AN pap_AN.utf8 pa_PK pa_PK.utf8 pl_PL pl_PL.iso88592 pl_PL.utf8 polish portuguese POSIX ps_AF ps_AF.utf8 pt_BR pt_BR.iso88591 pt_BR.utf8 pt_PT pt_PT@euro pt_PT.iso88591 pt_PT.iso885915@euro pt_PT.utf8 romanian ro_RO ro_RO.iso88592 ro_RO.utf8 ru_RU ru_RU.iso88595 ru_RU.koi8r ru_RU.utf8 russian ru_UA ru_UA.koi8u ru_UA.utf8 rw_RW rw_RW.utf8 sa_IN sa_IN.utf8 sat_IN sat_IN.utf8 sc_IT sc_IT.utf8 sd_IN sd_IN@devanagari sd_IN.utf8 sd_IN.utf8@devanagari se_NO se_NO.utf8 shs_CA shs_CA.utf8 sid_ET sid_ET.utf8 si_LK si_LK.utf8 sk_SK sk_SK.iso88592 sk_SK.utf8 slovak slovene slovenian sl_SI sl_SI.iso88592 sl_SI.utf8 so_DJ so_DJ.iso88591 so_DJ.utf8 so_ET so_ET.utf8 so_KE so_KE.iso88591 so_KE.utf8 so_SO so_SO.iso88591 so_SO.utf8 spanish sq_AL sq_AL.iso88591 sq_AL.utf8 sq_MK sq_MK.utf8 sr_ME sr_ME.utf8 sr_RS sr_RS@latin sr_RS.utf8 sr_RS.utf8@latin ss_ZA ss_ZA.utf8 st_ZA st_ZA.iso88591 st_ZA.utf8 sv_FI sv_FI@euro sv_FI.iso88591 sv_FI.iso885915@euro sv_FI.utf8 sv_SE sv_SE.iso88591 sv_SE.iso885915 sv_SE.utf8 swedish sw_KE sw_KE.utf8 sw_TZ sw_TZ.utf8 szl_PL szl_PL.utf8 ta_IN ta_IN.utf8 ta_LK ta_LK.utf8 te_IN te_IN.utf8 tg_TJ tg_TJ.koi8t tg_TJ.utf8 thai th_TH th_TH.tis620 th_TH.utf8 ti_ER ti_ER.utf8 ti_ET ti_ET.utf8 tig_ER tig_ER.utf8 tk_TM tk_TM.utf8 tl_PH tl_PH.iso88591 tl_PH.utf8 tn_ZA tn_ZA.utf8 tr_CY tr_CY.iso88599 tr_CY.utf8 tr_TR tr_TR.iso88599 tr_TR.utf8 ts_ZA ts_ZA.utf8 tt_RU tt_RU@iqtelif tt_RU.utf8 tt_RU.utf8@iqtelif turkish ug_CN ug_CN.utf8 uk_UA uk_UA.koi8u uk_UA.utf8 unm_US unm_US.utf8 ur_IN ur_IN.utf8 ur_PK ur_PK.utf8 uz_UZ uz_UZ@cyrillic uz_UZ.iso88591 uz_UZ.utf8@cyrillic ve_ZA ve_ZA.utf8 vi_VN vi_VN.utf8 wa_BE wa_BE@euro wa_BE.iso88591 wa_BE.iso885915@euro wa_BE.utf8 wae_CH wae_CH.utf8 wal_ET wal_ET.utf8 wo_SN wo_SN.utf8 xh_ZA xh_ZA.iso88591 xh_ZA.utf8 yi_US
yi_US.cp1255 yi_US.utf8 yo_NG yo_NG.utf8 yue_HK yue_HK.utf8 zh_CN zh_CN.gb18030 zh_CN.gb2312 zh_CN.gbk zh_CN.utf8 zh_HK zh_HK.big5hkscs zh_HK.utf8 zh_SG zh_SG.gb2312 zh_SG.gbk zh_SG.utf8 zh_TW zh_TW.big5 zh_TW.euctw zh_TW.utf8 zu_ZA zu_ZA.iso88591 zu_ZA.utf8
```

failures from 0.23.4 tests:

```
=================================== FAILURES ===================================
__________ TestDatetime64.test_datetime_name_accessors[crh_UA.UTF-80] __________

self =
time_locale = 'crh_UA.UTF-8'

    @pytest.mark.parametrize('time_locale', [
        None] if tm.get_locales() is None else [None] + tm.get_locales())
    def test_datetime_name_accessors(self, time_locale):
        # Test Monday -> Sunday and January -> December, in that sequence
        if time_locale is None:
            # If the time_locale is None, day-name and month_name should
            # return the english attributes
            expected_days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
                             'Friday', 'Saturday', 'Sunday']
            expected_months = ['January', 'February', 'March', 'April', 'May',
                               'June', 'July', 'August', 'September',
                               'October', 'November', 'December']
        else:
            with tm.set_locale(time_locale, locale.LC_TIME):
                expected_days = calendar.day_name[:]
                expected_months = calendar.month_name[1:]

        # GH 11128
        dti = DatetimeIndex(freq='D', start=datetime(1998, 1, 1),
                            periods=365)
        english_days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
                        'Friday', 'Saturday', 'Sunday']
        for day, name, eng_name in zip(range(4, 11),
                                       expected_days,
                                       english_days):
            name = name.capitalize()
            assert dti.weekday_name[day] == eng_name
            assert dti.day_name(locale=time_locale)[day] == name
            ts = Timestamp(datetime(2016, 4, day))
            with tm.assert_produces_warning(FutureWarning,
                                            check_stacklevel=False):
                assert ts.weekday_name == eng_name
            assert ts.day_name(locale=time_locale) == name
        dti = dti.append(DatetimeIndex([pd.NaT]))
        assert np.isnan(dti.day_name(locale=time_locale)[-1])
        ts = Timestamp(pd.NaT)
        assert np.isnan(ts.day_name(locale=time_locale))

        # GH 12805
        dti = DatetimeIndex(freq='M', start='2012', end='2013')
        result = dti.month_name(locale=time_locale)
        expected = Index([month.capitalize() for month in expected_months])
>       tm.assert_index_equal(result, expected)

../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/indexes/datetimes/test_misc.py:287:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/util/testing.py:855: in assert_index_equal
    raise_assert_detail(obj, msg, left, right)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

obj = 'Index', message = 'Index values are different (16.66667 %)'
left = Index(['Yanvar', 'Fevral', 'Mart', 'Aprel', 'May\u0131s', 'I\u0307yun', 'I\u0307yul',\n 'Avgust', 'Sent\xe2br', 'Okt\xe2br', 'Noyabr', 'Dekabr'],\n dtype='object')
right = Index(['Yanvar', 'Fevral', 'Mart', 'Aprel', 'May\u0131s', '\u0130yun', '\u0130yul', 'Avgust',\n 'Sent\xe2br', 'Okt\xe2br', 'Noyabr', 'Dekabr'],\n dtype='object')
diff = None

    def raise_assert_detail(obj, message, left, right, diff=None):
        if isinstance(left, np.ndarray):
            left = pprint_thing(left)
        elif is_categorical_dtype(left):
            left = repr(left)

        if PY2 and isinstance(left, string_types):
            # left needs to be printable in native text type in python2
            left = left.encode('utf-8')

        if isinstance(right, np.ndarray):
            right = pprint_thing(right)
        elif is_categorical_dtype(right):
            right = repr(right)

        if PY2 and isinstance(right, string_types):
            # right needs to be printable in native text type in python2
            right = right.encode('utf-8')

        msg = """{obj} are different

    {message}
    [left]:  {left}
    [right]: {right}""".format(obj=obj, message=message, left=left, right=right)

        if diff is not None:
            msg += "\n[diff]: {diff}".format(diff=diff)

>       raise AssertionError(msg)
E       AssertionError: Index are different
E
E       Index values are different (16.66667 %)
E       [left]:  Index(['Yanvar', 'Fevral', 'Mart', 'Aprel', 'May\u0131s', 'I\u0307yun', 'I\u0307yul',
E              'Avgust', 'Sentâbr', 'Oktâbr', 'Noyabr', 'Dekabr'],
E             dtype='object')
E       [right]: Index(['Yanvar', 'Fevral', 'Mart', 'Aprel', 'May\u0131s', '\u0130yun', '\u0130yul', 'Avgust',
E              'Sentâbr', 'Oktâbr', 'Noyabr', 'Dekabr'],
E             dtype='object')

../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/util/testing.py:1035: AssertionError
__________ TestDatetime64.test_datetime_name_accessors[crh_UA.UTF-81] __________

self =
time_locale = 'crh_UA.UTF-8'

    @pytest.mark.parametrize('time_locale', [
        None] if tm.get_locales() is None else [None] + tm.get_locales())
    def test_datetime_name_accessors(self, time_locale):
        # Test Monday -> Sunday and January -> December, in that sequence
        if time_locale is None:
            # If the time_locale is None, day-name and month_name should
            # return the english attributes
            expected_days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
                             'Friday', 'Saturday', 'Sunday']
            expected_months = ['January', 'February', 'March', 'April', 'May',
                               'June', 'July', 'August', 'September',
                               'October', 'November', 'December']
        else:
            with tm.set_locale(time_locale, locale.LC_TIME):
                expected_days = calendar.day_name[:]
                expected_months = calendar.month_name[1:]

        # GH 11128
        dti = DatetimeIndex(freq='D', start=datetime(1998, 1, 1),
                            periods=365)
        english_days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
                        'Friday', 'Saturday', 'Sunday']
        for day, name, eng_name in zip(range(4, 11),
                                       expected_days,
                                       english_days):
            name = name.capitalize()
            assert dti.weekday_name[day] == eng_name
            assert dti.day_name(locale=time_locale)[day] == name
            ts = Timestamp(datetime(2016, 4, day))
            with tm.assert_produces_warning(FutureWarning,
                                            check_stacklevel=False):
                assert ts.weekday_name == eng_name
            assert ts.day_name(locale=time_locale) == name
        dti = dti.append(DatetimeIndex([pd.NaT]))
        assert np.isnan(dti.day_name(locale=time_locale)[-1])
        ts = Timestamp(pd.NaT)
        assert np.isnan(ts.day_name(locale=time_locale))

        # GH 12805
        dti = DatetimeIndex(freq='M', start='2012', end='2013')
        result =
dti.month_name(locale=time_locale) expected = Index([month.capitalize() for month in expected_months]) > tm.assert_index_equal(result, expected) ../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/indexes/datetimes/test_misc.py:287: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ ../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/util/testing.py:855: in assert_index_equal raise_assert_detail(obj, msg, left, right) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ obj = 'Index', message = 'Index values are different (16.66667 %)' left = Index(['Yanvar', 'Fevral', 'Mart', 'Aprel', 'May\u0131s', 'I\u0307yun', 'I\u0307yul',\n 'Avgust', 'Sent\xe2br', 'Okt\xe2br', 'Noyabr', 'Dekabr'],\n dtype='object') right = Index(['Yanvar', 'Fevral', 'Mart', 'Aprel', 'May\u0131s', '\u0130yun', '\u0130yul', 'Avgust',\n 'Sent\xe2br', 'Okt\xe2br', 'Noyabr', 'Dekabr'],\n dtype='object') diff = None def raise_assert_detail(obj, message, left, right, diff=None): if isinstance(left, np.ndarray): left = pprint_thing(left) elif is_categorical_dtype(left): left = repr(left) if PY2 and isinstance(left, string_types): # left needs to be printable in native text type in python2 left = left.encode('utf-8') if isinstance(right, np.ndarray): right = pprint_thing(right) elif is_categorical_dtype(right): right = repr(right) if PY2 and isinstance(right, string_types): # right needs to be printable in native text type in python2 right = right.encode('utf-8') msg = """{obj} are different {message} [left]: {left} [right]: {right}""".format(obj=obj, message=message, left=left, right=right) if diff is not None: msg += "\n[diff]: {diff}".format(diff=diff) > raise AssertionError(msg) E AssertionError: Index are different E E Index values are different (16.66667 %) E [left]: Index(['Yanvar', 'Fevral', 'Mart', 'Aprel', 'May\u0131s', 'I\u0307yun', 'I\u0307yul', E 'Avgust', 'Sent�br', 
'Okt�br', 'Noyabr', 'Dekabr'], E dtype='object') E [right]: Index(['Yanvar', 'Fevral', 'Mart', 'Aprel', 'May\u0131s', '\u0130yun', '\u0130yul', 'Avgust', E 'Sent�br', 'Okt�br', 'Noyabr', 'Dekabr'], E dtype='object') ../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/util/testing.py:1035: AssertionError TestSeriesDatetimeValues.test_dt_accessor_datetime_name_accessors[crh_UA.UTF-80] self = time_locale = 'crh_UA.UTF-8' @pytest.mark.parametrize('time_locale', [ None] if tm.get_locales() is None else [None] + tm.get_locales()) def test_dt_accessor_datetime_name_accessors(self, time_locale): # Test Monday -> Sunday and January -> December, in that sequence if time_locale is None: # If the time_locale is None, day-name and month_name should # return the english attributes expected_days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'] expected_months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December'] else: with tm.set_locale(time_locale, locale.LC_TIME): expected_days = calendar.day_name[:] expected_months = calendar.month_name[1:] s = Series(DatetimeIndex(freq='D', start=datetime(1998, 1, 1), periods=365)) english_days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'] for day, name, eng_name in zip(range(4, 11), expected_days, english_days): name = name.capitalize() assert s.dt.weekday_name[day] == eng_name assert s.dt.day_name(locale=time_locale)[day] == name s = s.append(Series([pd.NaT])) assert np.isnan(s.dt.day_name(locale=time_locale).iloc[-1]) s = Series(DatetimeIndex(freq='M', start='2012', end='2013')) result = s.dt.month_name(locale=time_locale) expected = Series([month.capitalize() for month in expected_months]) > tm.assert_series_equal(result, expected) ../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/series/test_datetime_values.py:312: _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ ../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/util/testing.py:1244: in assert_series_equal obj='{obj}'.format(obj=obj)) pandas/_libs/testing.pyx:59: in pandas._libs.testing.assert_almost_equal ??? pandas/_libs/testing.pyx:173: in pandas._libs.testing.assert_almost_equal ??? _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ obj = 'Series', message = 'Series values are different (16.66667 %)' left = '[Yanvar, Fevral, Mart, Aprel, May\u0131s, I\u0307yun, I\u0307yul, Avgust, Sent\xe2br, Okt\xe2br, Noyabr, Dekabr]' right = '[Yanvar, Fevral, Mart, Aprel, May\u0131s, \u0130yun, \u0130yul, Avgust, Sent\xe2br, Okt\xe2br, Noyabr, Dekabr]' diff = None def raise_assert_detail(obj, message, left, right, diff=None): if isinstance(left, np.ndarray): left = pprint_thing(left) elif is_categorical_dtype(left): left = repr(left) if PY2 and isinstance(left, string_types): # left needs to be printable in native text type in python2 left = left.encode('utf-8') if isinstance(right, np.ndarray): right = pprint_thing(right) elif is_categorical_dtype(right): right = repr(right) if PY2 and isinstance(right, string_types): # right needs to be printable in native text type in python2 right = right.encode('utf-8') msg = """{obj} are different {message} [left]: {left} [right]: {right}""".format(obj=obj, message=message, left=left, right=right) if diff is not None: msg += "\n[diff]: {diff}".format(diff=diff) > raise AssertionError(msg) E AssertionError: Series are different E E Series values are different (16.66667 %) E [left]: [Yanvar, Fevral, Mart, Aprel, May\u0131s, I\u0307yun, I\u0307yul, Avgust, Sent\xe2br, Okt\xe2br, Noyabr, Dekabr] E [right]: [Yanvar, Fevral, Mart, Aprel, May\u0131s, \u0130yun, \u0130yul, Avgust, Sent\xe2br, Okt\xe2br, Noyabr, Dekabr] ../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/util/testing.py:1035: AssertionError 
TestSeriesDatetimeValues.test_dt_accessor_datetime_name_accessors[crh_UA.UTF-81] self = time_locale = 'crh_UA.UTF-8' @pytest.mark.parametrize('time_locale', [ None] if tm.get_locales() is None else [None] + tm.get_locales()) def test_dt_accessor_datetime_name_accessors(self, time_locale): # Test Monday -> Sunday and January -> December, in that sequence if time_locale is None: # If the time_locale is None, day-name and month_name should # return the english attributes expected_days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'] expected_months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December'] else: with tm.set_locale(time_locale, locale.LC_TIME): expected_days = calendar.day_name[:] expected_months = calendar.month_name[1:] s = Series(DatetimeIndex(freq='D', start=datetime(1998, 1, 1), periods=365)) english_days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'] for day, name, eng_name in zip(range(4, 11), expected_days, english_days): name = name.capitalize() assert s.dt.weekday_name[day] == eng_name assert s.dt.day_name(locale=time_locale)[day] == name s = s.append(Series([pd.NaT])) assert np.isnan(s.dt.day_name(locale=time_locale).iloc[-1]) s = Series(DatetimeIndex(freq='M', start='2012', end='2013')) result = s.dt.month_name(locale=time_locale) expected = Series([month.capitalize() for month in expected_months]) > tm.assert_series_equal(result, expected) ../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/tests/series/test_datetime_values.py:312: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ ../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/util/testing.py:1244: in assert_series_equal obj='{obj}'.format(obj=obj)) pandas/_libs/testing.pyx:59: in pandas._libs.testing.assert_almost_equal ??? 
pandas/_libs/testing.pyx:173: in pandas._libs.testing.assert_almost_equal ??? _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ obj = 'Series', message = 'Series values are different (16.66667 %)' left = '[Yanvar, Fevral, Mart, Aprel, May\u0131s, I\u0307yun, I\u0307yul, Avgust, Sent\xe2br, Okt\xe2br, Noyabr, Dekabr]' right = '[Yanvar, Fevral, Mart, Aprel, May\u0131s, \u0130yun, \u0130yul, Avgust, Sent\xe2br, Okt\xe2br, Noyabr, Dekabr]' diff = None def raise_assert_detail(obj, message, left, right, diff=None): if isinstance(left, np.ndarray): left = pprint_thing(left) elif is_categorical_dtype(left): left = repr(left) if PY2 and isinstance(left, string_types): # left needs to be printable in native text type in python2 left = left.encode('utf-8') if isinstance(right, np.ndarray): right = pprint_thing(right) elif is_categorical_dtype(right): right = repr(right) if PY2 and isinstance(right, string_types): # right needs to be printable in native text type in python2 right = right.encode('utf-8') msg = """{obj} are different {message} [left]: {left} [right]: {right}""".format(obj=obj, message=message, left=left, right=right) if diff is not None: msg += "\n[diff]: {diff}".format(diff=diff) > raise AssertionError(msg) E AssertionError: Series are different E E Series values are different (16.66667 %) E [left]: [Yanvar, Fevral, Mart, Aprel, May\u0131s, I\u0307yun, I\u0307yul, Avgust, Sent\xe2br, Okt\xe2br, Noyabr, Dekabr] E [right]: [Yanvar, Fevral, Mart, Aprel, May\u0131s, \u0130yun, \u0130yul, Avgust, Sent\xe2br, Okt\xe2br, Noyabr, Dekabr] ../build/lib/python3.6/site-packages/pandas-0.23.4-py3.6-linux-x86_64.egg/pandas/util/testing.py:1035: AssertionError ```
WillAyd commented 5 years ago

Do you get the same errors on master with your normal locale? There was a change focused on fixing locale testing issues, which will be released in 0.24, though I think it was only focused on Py27 (#22213).

bodgerer commented 5 years ago

I get the same utf-8 unicode decode errors against both 0.23.4 and master. They all go away if I change the LANG environment variable from the default en_GB.utf8 to en_GB.

Once LANG has been tweaked, the unit tests are able to run to completion. 0.23.4 has 4 failures (clearly to do with locales); master has none.
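
One possible explanation for the decode error, sketched here under an assumption: the `locale -a` listing contains a name stored as Latin-1 bytes, such as `bokmål` (a hypothetical entry, not taken from the list pasted in this thread). In Latin-1 the byte `0xe5` is `å`, and it would sit at position 4 of that name. Decoding such bytes as UTF-8 fails with the same kind of message as in the traceback, while a Latin-1 decode succeeds, which would be consistent with the errors going away under `LANG=en_GB`.

```python
# Hypothetical raw entry from `locale -a`: "bokmål" encoded as Latin-1.
raw = b"bokm\xe5l"


def try_decode(data, encoding):
    """Return (text, None) on success or (None, error message) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


# UTF-8 rejects the lone 0xe5 byte; Latin-1 maps every byte to a character.
utf8_text, utf8_error = try_decode(raw, "utf-8")
latin1_text, latin1_error = try_decode(raw, "latin-1")

print(utf8_error)
print(latin1_text)
```

If this is what is happening, the encoding used to decode the locale list matters more than the locale names themselves.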

TomAugspurger commented 5 years ago

I'm not sure what's going wrong here then. Let us know if you're able to debug it.


TomAugspurger commented 5 years ago

@bodgerer any luck?

TomAugspurger commented 5 years ago

@bodgerer were you able to find out anything further?

joshuamhtsang commented 5 years ago

@bodgerer This might not be what you're looking for, but you can just ignore testing certain vendor directories:

$ pytest --ignore=../build/lib

At least now you can perform testing on the code you write yourself.

tddpirate commented 5 years ago

I installed pandas from Anaconda3-2018.12-Linux-x86_64.sh

import pandas as pd
pd.test()

yielded the following output in a Jupyter notebook:

running: pytest --skip-slow --skip-network /home/omer/anaconda3/lib/python3.7/site-packages/pandas
========================================================== test session starts ===========================================================
platform linux -- Python 3.7.1, pytest-4.0.2, py-1.7.0, pluggy-0.8.0
rootdir: /home/omer, inifile:
plugins: remotedata-0.3.1, openfiles-0.3.1, doctestplus-0.2.0, arraydiff-0.3
collected 26050 items / 3 errors / 2 skipped

================================================================= ERRORS =================================================================
_____________________ ERROR collecting anaconda3/lib/python3.7/site-packages/pandas/tests/groupby/test_whitelist.py ______________________
../../../anaconda3/lib/python3.7/site-packages/pandas/tests/groupby/test_whitelist.py:127: in <module>
    "obj, whitelist", zip((df_letters(), df_letters().floats),
E   _pytest.warning_types.RemovedInPytest4Warning: Fixture "df_letters" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
__________________ ERROR collecting anaconda3/lib/python3.7/site-packages/pandas/tests/indexes/datetimes/test_tools.py ___________________
../../../anaconda3/lib/python3.7/site-packages/pandas/tests/indexes/datetimes/test_tools.py:1494: in <module>
    @pytest.fixture(params=[epoch_1960(),
E   _pytest.warning_types.RemovedInPytest4Warning: Fixture "epoch_1960" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
______________________ ERROR collecting anaconda3/lib/python3.7/site-packages/pandas/tests/series/test_analytics.py ______________________
../../../anaconda3/lib/python3.7/site-packages/pandas/tests/series/test_analytics.py:1882: in <module>
    class TestNLargestNSmallest(object):
../../../anaconda3/lib/python3.7/site-packages/pandas/tests/series/test_analytics.py:1904: in TestNLargestNSmallest
    [v for k, v in s_main_dtypes().iteritems()])
E   _pytest.warning_types.RemovedInPytest4Warning: Fixture "s_main_dtypes" called directly. Fixtures are not meant to be called directly, are created automatically when test functions request them as parameters. See https://docs.pytest.org/en/latest/fixture.html for more information.
============================================================ warnings summary ============================================================
/home/omer/anaconda3/lib/python3.7/site-packages/_pytest/config/__init__.py:754
  /home/omer/anaconda3/lib/python3.7/site-packages/_pytest/config/__init__.py:754: PytestWarning: Module already imported so cannot be rewritten: pytest_remotedata
    self._mark_plugins_for_rewrite(hook)
  /home/omer/anaconda3/lib/python3.7/site-packages/_pytest/config/__init__.py:754: PytestWarning: Module already imported so cannot be rewritten: pytest_openfiles
    self._mark_plugins_for_rewrite(hook)
  /home/omer/anaconda3/lib/python3.7/site-packages/_pytest/config/__init__.py:754: PytestWarning: Module already imported so cannot be rewritten: pytest_doctestplus
    self._mark_plugins_for_rewrite(hook)
  /home/omer/anaconda3/lib/python3.7/site-packages/_pytest/config/__init__.py:754: PytestWarning: Module already imported so cannot be rewritten: pytest_arraydiff
    self._mark_plugins_for_rewrite(hook)

-- Docs: https://docs.pytest.org/en/latest/warnings.html
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 3 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
============================================ 2 skipped, 4 warnings, 3 error in 13.51 seconds =============================================
An exception has occurred, use %tb to see the full traceback.

SystemExit: 2

/home/omer/anaconda3/lib/python3.7/site-packages/IPython/core/interactiveshell.py:3275: UserWarning: To exit: use 'exit', 'quit', or Ctrl-D.
  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)

The output of pd.show_versions() is as follows:

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.16.0-0.bpo.2-amd64
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.23.4
pytest: 4.0.2
pip: 18.1
setuptools: 40.6.3
Cython: 0.29.2
numpy: 1.15.4
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 7.2.0
sphinx: 1.8.2
patsy: 0.5.1
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.8
feather: None
matplotlib: 3.0.2
openpyxl: 2.5.12
xlrd: 1.2.0
xlwt: 1.3.0
xlsxwriter: 1.1.2
lxml: 4.2.5
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
Patolord commented 5 years ago

Same issue. Did you solve it?

TomAugspurger commented 5 years ago

Those collection errors are fixed on master. You can wait until 0.24 is released (release candidate this week) or downgrade your pytest.
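
As a rough sketch of the downgrade option: the `RemovedInPytest4Warning` collection errors in the report above were produced with pytest 4.0.2. Assuming the cutoff is the pytest 4.0 release (an assumption based on the warning's name, not something confirmed in this thread), a small check like the following tells you whether a given pytest version would need downgrading for pandas 0.23.4. The helper name `pytest_too_new_for_pandas_0_23` is hypothetical.

```python
def pytest_too_new_for_pandas_0_23(pytest_version):
    """Hypothetical helper: True if the pytest major version is 4 or later.

    Assumes pytest 4.0 is where calling fixtures directly began to
    break collection of the pandas 0.23.4 test suite.
    """
    major = int(pytest_version.split(".")[0])
    return major >= 4


# Version strings taken from the two reports in this thread.
print(pytest_too_new_for_pandas_0_23("4.0.2"))
print(pytest_too_new_for_pandas_0_23("3.9.3"))
```

If the check is true for your installed version, something like `pip install "pytest<4"` is one way to downgrade.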

koxt commented 5 years ago

Is this the correct behavior?

I tried py2 (tests failed) and py3 (tests did not start). Installed on both:

pandas==0.24.0rc1 
    fixed pytest #4545: Calling fixtures directly is now always an error instead of a warning.
    https://github.com/pytest-dev/pytest/issues/4545
pytest==4.1.1
hypothesis>=3.58 # requested by the pandas.test() call

Python 2.7.5 pandas.test() finished with 2 failures

============ 2 failed, 43723 passed, 5003 skipped, 752 xfailed, 16 xpassed, 9 warnings in 933.01 seconds ============

FAILURES:
1) TestConfig.test_deprecate_option 
self = <pandas.tests.test_config.TestConfig object at 0x7f3eed216190>

    def test_deprecate_option(self):
...
E           assert 2 == 1
E            +  where 2 = len([<warnings.WarningMessage object at 0x7f3eed2163d0>, <warnings.WarningMessage object at 0x7f3eed216450>])

venv/lib/python2.7/site-packages/pandas/tests/test_config.py:255: AssertionError

2) test_bad_quote_char[python-kwargs2-"quotechar" must be string, not int]
test_bad_quote_char[python-kwargs2-"quotechar" must be string, not int]

        with pytest.raises(TypeError, match=msg):
>           parser.read_csv(StringIO(data), **kwargs)
E           AssertionError: Pattern '"quotechar" must be string, not int' not found in '"quotechar" must be an 1-character string'

venv/lib/python2.7/site-packages/pandas/tests/io/parser/test_quoting.py:30: AssertionError
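
The second failure looks like a message mismatch rather than a difference in behavior. `pytest.raises(..., match=...)` checks its pattern with `re.search` against the text of the raised exception, so the check can be reproduced with the standard library alone. The two strings below are copied from the assertion above; the idea that Python 2.7.5 simply words this error differently is an assumption, and `loose_pattern` is a hypothetical alternative, not what the pandas test uses.

```python
import re

# Pattern the test expects and the message that was actually raised,
# both copied from the failure output.
expected_pattern = '"quotechar" must be string, not int'
actual_message = '"quotechar" must be an 1-character string'

# re.search finds no match, which is why pytest reports
# "Pattern ... not found in ...".
matches = re.search(expected_pattern, actual_message) is not None
print(matches)

# Hypothetical looser pattern that would accept either wording.
loose_pattern = r'"quotechar" must be'
loose_matches = re.search(loose_pattern, actual_message) is not None
print(loose_matches)
```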

Python 3.6.6 pandas.test() halted interpreter

====================================================== ERRORS =======================================================
___________ ERROR collecting venv/lib/python3.6/site-packages/pandas/tests/indexes/datetimes/test_misc.py ___________
venv/lib/python3.6/site-packages/pandas/tests/indexes/datetimes/test_misc.py:91: in <module>
    class TestDatetime64(object):
venv/lib/python3.6/site-packages/pandas/tests/indexes/datetimes/test_misc.py:246: in TestDatetime64
    None] if tm.get_locales() is None else [None] + tm.get_locales())
venv/lib/python3.6/site-packages/pandas/util/testing.py:516: in get_locales
    x, encoding=pd.options.display.encoding))
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte
_________ ERROR collecting venv/lib/python3.6/site-packages/pandas/tests/scalar/timestamp/test_timestamp.py _________
venv/lib/python3.6/site-packages/pandas/tests/scalar/timestamp/test_timestamp.py:28: in <module>
    class TestTimestampProperties(object):
venv/lib/python3.6/site-packages/pandas/tests/scalar/timestamp/test_timestamp.py:104: in TestTimestampProperties
    None] if tm.get_locales() is None else [None] + tm.get_locales())
venv/lib/python3.6/site-packages/pandas/util/testing.py:516: in get_locales
    x, encoding=pd.options.display.encoding))
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte
___________ ERROR collecting venv/lib/python3.6/site-packages/pandas/tests/series/test_datetime_values.py ___________
venv/lib/python3.6/site-packages/pandas/tests/series/test_datetime_values.py:27: in <module>
    class TestSeriesDatetimeValues():
venv/lib/python3.6/site-packages/pandas/tests/series/test_datetime_values.py:322: in TestSeriesDatetimeValues
    None] if tm.get_locales() is None else [None] + tm.get_locales())
venv/lib/python3.6/site-packages/pandas/util/testing.py:516: in get_locales
    x, encoding=pd.options.display.encoding))
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte
________________ ERROR collecting venv/lib/python3.6/site-packages/pandas/tests/util/test_locale.py _________________
venv/lib/python3.6/site-packages/pandas/tests/util/test_locale.py:13: in <module>
    _all_locales = tm.get_locales() or []
venv/lib/python3.6/site-packages/pandas/util/testing.py:516: in get_locales
    x, encoding=pd.options.display.encoding))
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 4: invalid continuation byte
================================================= warnings summary ==================================================
venv/lib/python3.6/site-packages/_pytest/config/__init__.py:730
  /home/koxt/dev/py/pd3/venv/lib/python3.6/site-packages/_pytest/config/__init__.py:730: PytestWarning: Module already imported so cannot be rewritten: hypothesis
    self._mark_plugins_for_rewrite(hook)

-- Docs: https://docs.pytest.org/en/latest/warnings.html
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 4 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
================================== 7 skipped, 1 warnings, 4 error in 36.71 seconds ==================================
TomAugspurger commented 5 years ago

We've seen that locale issue elsewhere. Not sure if it's fixed on master.

Can you debug the other failures? What's the other warning that's being raised?

WillAyd commented 5 years ago

The original issue here is resolved. The other issues are orthogonal or even no longer supported (e.g., Py2 issues), so closing as is. If anyone has anything else, please open a new issue.