Scheduled biweekly dependency update for week 02

Updates

Here's a list of all the updates bundled in this pull request. I've added some links to make it easier for you to find all the information you need.

astropy	2.0.2	»	2.0.3	PyPI \| Changelog \| Homepage
matplotlib	2.0.2	»	2.1.1	PyPI \| Changelog \| Homepage
jsmin	2.2.2	»	2.2.2	PyPI \| Changelog \| Repo
numba	0.35.0	»	0.36.2	PyPI \| Changelog \| Repo
numpy	1.13.3	»	1.14.0	PyPI \| Changelog \| Homepage
pandas	0.20.3	»	0.22.0	PyPI \| Changelog \| Homepage
scipy	0.19.1	»	1.0.0	PyPI \| Changelog \| Repo \| Homepage
spectrum-overload	0.2.1	»	0.2.1	PyPI \| Repo
sqlalchemy	1.2.1	»	1.2.1	PyPI \| Changelog \| Homepage
tqdm	4.17.1	»	4.19.5	PyPI \| Changelog \| Repo
pyyaml	3.12	»	3.12	PyPI \| Changelog \| Homepage
codeclimate-test-reporter	0.2.3	»	0.2.3	PyPI \| Repo
codacy-coverage	1.3.10	»	1.3.10	PyPI \| Changelog \| Repo
hypothesis	3.44.16	»	3.44.16	PyPI \| Changelog \| Repo
pytest	3.3.2	»	3.3.2	PyPI \| Changelog \| Repo \| Homepage
pytest-cov	2.5.1	»	2.5.1	PyPI \| Changelog \| Repo
python-coveralls	2.9.1	»	2.9.1	PyPI \| Repo

Changelogs

astropy 2.0.2 -> 2.0.3

2.0.3

==================

Bug Fixes

astropy.coordinates ^^^^^^^^^^^^^^^^^^^

Ecliptic frame classes now support attributes v_x, v_y, v_z when using with a Cartesian representation. [6569]

Added a nicer error message when accidentally calling frame.representation instead of frame.data in the context of methods that use ._apply(). [6561]

Creating a new SkyCoord from a list of multiple SkyCoord objects now yield the correct type of frame, and works at all for non-equatorial frames. [6612]

Improved accuracy of velocity calculation in EarthLocation.get_gcrs_posvel. [6699]

Improved accuracy of radial velocity corrections in ``SkyCoord.radial_velocity_correction```. [6861]

The precision of ecliptic frames is now much better, after removing the nutation from the rotation and fixing the computation of the position of the Sun. [6508]

astropy.extern ^^^^^^^^^^^^^^

Version 0.2.1 of pytest-astropy is included as an external package. [6918]

astropy.io.fits ^^^^^^^^^^^^^^^

Fix writing the result of fitsdiff to file with --output-file. [6621]

Fix a minor bug where FITS_rec instances can not be indexed with tuples and other sequences that end up with a scalar. [6955, 6966]

astropy.io.misc ^^^^^^^^^^^^^^^

Fix ImportError when hdf5 is imported first in a fresh Python interpreter in Python 3. [6604, 6610]

astropy.nddata ^^^^^^^^^^^^^^

Suppress errors during WCS creation in CCDData.read(). [6500]

Fixed a problem with CCDData.read when the extension wasn't given and the primary HDU contained no data but another HDU did. In that case the header were not correctly combined. [6489]

astropy.stats ^^^^^^^^^^^^^

Fixed an issue where the biweight statistics functions would sometimes cause runtime underflow/overflow errors for float32 input arrays. [6905]

astropy.table ^^^^^^^^^^^^^

Fixed a problem when printing a table when a column is deleted and garbage-collected, and the format function caching mechanism happens to re-use the same cache key. [6714]

Fixed a problem when comparing a unicode masked column (on left side) to a bytes masked column (on right side). [6899]

Fixed a problem in comparing masked columns in bytes and unicode when the unicode had masked entries. [6899]

astropy.tests ^^^^^^^^^^^^^

Fixed a bug that causes tests for rst files to not be run on certain platforms. [6555, 6608]

Fixed a bug that caused the doctestplus plugin to not work nicely with the hypothesis package. [6605, 6609]

Fixed a bug that meant that the data.astropy.org mirror could not be used when using --remote-data=astropy. [6724]

Support compatibility with new pytest-astropy plugins. [6918]

When testing, astropy (or the package being tested) is now installed to a temporary directory instead of copying the build. This allows entry points to work correctly. [6890]

astropy.time ^^^^^^^^^^^^

Initialization of Time instances now is consistent for all formats to ensure that -0.5 <= jd2 < 0.5. [6653]

astropy.units ^^^^^^^^^^^^^

Ensure that Quantity slices can be set with objects that have a unit attribute (such as Column). [6123]

astropy.utils ^^^^^^^^^^^^^

download_files_in_parallel now respects the given timeout value. [6658]

Fixed bugs in remote data handling and also in IERS unit test related to path URL, and URI normalization on Windows. [6651]

Fixed a bug that caused get_pkg_data_fileobj to not work correctly when used with non-local data from inside packages. [6724]

Make sure get_pkg_data_fileobj fails if the URL can not be read, and correctly falls back on the mirror if necessary. [6767]

Fix the finddiff option in find_current_module to properly deal with submodules. [6767]

Fixed pyreadline import in utils.console.isatty for older IPython versions on Windows. [6800]

astropy.visualization ^^^^^^^^^^^^^^^^^^^^^

Fixed the vertical orientation of the fits2bitmap output bitmap image to match that of the FITS image. [6844, 6969]

Added a workaround for a bug in matplotlib so that the fits2bitmap script generates the correct output file type. [6969]

Other Changes and Additions

No longer require LaTeX to build the documentation locally and use mathjax instead. [6701]

Fixed broken links in the documentation. [6745]

Ensured that all tests use the Astropy data mirror if needed. [6767]

matplotlib 2.0.2 -> 2.1.1

2.1.1

2.1.0

This is the second minor release in the Matplotlib 2.x series and the first release with major new features since 1.5.

This release contains approximately 2 years worth of work by 275 contributors across over 950 pull requests. Highlights from this release include:

support for string categorical values

export of animations to interactive javascript widgets

major overhaul of polar plots

reproducible output for ps/eps, pdf, and svg backends

performance improvements in drawing lines and images

GUIs show a busy cursor while rendering the plot

along with many other enhancements and bug fixes.

2.1.0rc1

jsmin -> 2.2.2

2.2.2

Add license headers to code files (fixes i17)

Remove mercurial files (fixes 20)

2.2.1

Fix 14: Infinite loop on return x / 1;

2.2.0

Merge 13: Preserve "loud comments" starting with /*!

These are commonly used for copyright notices, and are preserved by various other minifiers (e.g. YUI Compressor).

2.1.6

Fix 12: Newline following a regex literal should not be elided.

2.1.5

Fix 9: Premature end of statement caused by multi-line comment not adding newline.

Fix 10: Removing multiline comment separating tokens must leave a space.

Refactor comment handling for maintainability.

2.1.4

Fix 6: regex literal matching comment was not correctly matched.

Refactor regex literal handling for robustness.

2.1.3

Reset issue numbering: issues live in github from now on.

Fix 1: regex literal was not recognised when occurring directly after {.

2.1.2

Issue numbers here and below refer to the bitbucket repository.

Fix 17: bug when JS starts with comment then literal regex.

2.1.1

Fix 16: bug returning a literal regex containing escaped forward-slashes.

2.1.0

First changelog entries; see README.rst for prior contributors.

Expose quote_chars parameter to provide just enough unofficial Harmony support to be useful.

numba 0.35.0 -> 0.36.2

0.36.2

This is a bugfix release that provides minor changes to address:

PR 2645: Avoid CPython bug with exec in older 2.7.x.

PR 2652: Add support for CUDA 9.

0.36.1

This release continues to add new features to the work undertaken in partnership with Intel on ParallelAccelerator technology. Other changes of note include the compilation chain being updated to use LLVM 5.0 and the production of conda packages using conda-build 3 and the new compilers that ship with it.

NOTE: A version 0.36.0 was tagged for internal use but not released.

ParallelAccelerator:

NOTE: The ParallelAccelerator technology is under active development and should be considered experimental.

New features relating to ParallelAccelerator, from work undertaken with Intel, include the addition of the stencil decorator for ease of implementation of stencil-like computations, support for general reductions, and slice and range fusion for parallel slice/bit-array assignments. Documentation on both the use and implementation of the above has been added. Further, a new debug environment variable NUMBA_DEBUG_ARRAY_OPT_STATS is made available to give information about which operators/calls are converted to parallel for-loops.

ParallelAccelerator features:

PR 2457: Stencil Computations in ParallelAccelerator

PR 2548: Slice and range fusion, parallelizing bitarray and slice assignment

PR 2516: Support general reductions in ParallelAccelerator

ParallelAccelerator fixes:

PR 2540: Fix bug 2537

PR 2566: Fix issue 2564.

PR 2599: Fix nested multi-dimensional parfor type inference issue

PR 2604: Fixes for stencil tests and cmath sin().

PR 2605: Fixes issue 2603.

Additional features of note:

This release of Numba (and llvmlite) is updated to use LLVM version 5.0 as the compiler back end, the main change to Numba to support this was the addition of a custom symbol tracker to avoid the calls to LLVM's ExecutionEngine that was crashing when asking for non-existent symbol addresses. Further, the conda packages for this release of Numba are built using conda build version 3 and the new compilers/recipe grammar that are present in that release.

PR 2568: Update for LLVM 5

PR 2607: Fixes abort when getting address to "nrt_unresolved_abort"

PR 2615: Working towards conda build 3

Thanks to community feedback and bug reports, the following fixes were also made.

Misc fixes/enhancements:

PR 2534: Add tuple support to np.take.

PR 2551: Rebranding fix

PR 2552: relative doc links

PR 2570: Fix issue 2561, handle missing successor on loop exit

PR 2588: Fix 2555. Disable libpython.so linking on linux

PR 2601: Update llvmlite version dependency.

PR 2608: Fix potential cache file collision

PR 2612: Fix NRT test failure due to increased overhead when running in coverage

PR 2619: Fix dubious pthread_cond_signal not in lock

PR 2622: Fix np.nanmedian for all NaN case.

PR 2633: Fix markdown in CONTRIBUTING.md

PR 2635: Make the dependency on compilers for AOT optional.

CUDA support fixes:

PR 2523: Fix invalid cuda context in memory transfer calls in another thread

PR 2575: Use CPU to initialize xoroshiro states for GPU RNG. Fixes 2573

PR 2581: Fix cuda gufunc mishandling of scalar arg as array and out argument

numpy 1.13.3 -> 1.14.0

1.14.0

==========================

Numpy 1.14.0 is the result of seven months of work and contains a large number of bug fixes and new features, along with several changes with potential compatibility issues. The major change that users will notice are the stylistic changes in the way numpy arrays and scalars are printed, a change that will affect doctests. See below for details on how to preserve the old style printing when needed.

A major decision affecting future development concerns the schedule for dropping Python 2.7 support in the runup to 2020. The decision has been made to support 2.7 for all releases made in 2018, with the last release being designated a long term release with support for bug fixes extending through

In 2019 support for 2.7 will be dropped in all new releases. More details can be found in the relevant NEP_.

This release supports Python 2.7 and 3.4 - 3.6.

.. _NEP: https://github.com/numpy/numpy/blob/master/doc/neps/dropping-python2.7-proposal.rst

Highlights

The np.einsum function uses BLAS when possible

genfromtxt, loadtxt, fromregex and savetxt can now handle files with arbitrary Python supported encoding.

Major improvements to printing of NumPy arrays and scalars.

New functions

parametrize: decorator added to numpy.testing

chebinterpolate: Interpolate function at Chebyshev points.

format_float_positional and format_float_scientific : format floating-point scalars unambiguously with control of rounding and padding.

PyArray_ResolveWritebackIfCopy and PyArray_SetWritebackIfCopyBase, new C-API functions useful in achieving PyPy compatibity.

Deprecations

Using np.bool_ objects in place of integers is deprecated. Previously operator.index(np.bool_) was legal and allowed constructs such as [1, 2, 3][np.True_]. That was misleading, as it behaved differently from np.array([1, 2, 3])[np.True_].

Truth testing of an empty array is deprecated. To check if an array is not empty, use array.size > 0.

Calling np.bincount with minlength=None is deprecated. minlength=0 should be used instead.

Calling np.fromstring with the default value of the sep argument is deprecated. When that argument is not provided, a broken version of np.frombuffer is used that silently accepts unicode strings and -- after encoding them as either utf-8 (python 3) or the default encoding (python 2) -- treats them as binary data. If reading binary data is desired, np.frombuffer should be used directly.

The style option of array2string is deprecated in non-legacy printing mode.

PyArray_SetUpdateIfCopyBase has been deprecated. For NumPy versions >= 1.14 use PyArray_SetWritebackIfCopyBase instead, see C API changes below for more details.

The use of UPDATEIFCOPY arrays is deprecated, see C API changes below for details. We will not be dropping support for those arrays, but they are not compatible with PyPy.

Future Changes

np.issubdtype will stop downcasting dtype-like arguments. It might be expected that issubdtype(np.float32, 'float64') and issubdtype(np.float32, np.float64) mean the same thing - however, there was an undocumented special case that translated the former into issubdtype(np.float32, np.floating), giving the surprising result of True.

This translation now gives a warning that explains what translation is occurring. In the future, the translation will be disabled, and the first example will be made equivalent to the second.

np.linalg.lstsq default for rcond will be changed. The rcond parameter to np.linalg.lstsq will change its default to machine precision times the largest of the input array dimensions. A FutureWarning is issued when rcond is not passed explicitly.

a.flat.__array__() will return a writeable copy of a when a is non-contiguous. Previously it returned an UPDATEIFCOPY array when a was writeable. Currently it returns a non-writeable copy. See gh-7054 for a discussion of the issue.

Unstructured void array's .item method will return a bytes object. In the future, calling .item() on arrays or scalars of np.void datatype will return a bytes object instead of a buffer or int array, the same as returned by bytes(void_scalar). This may affect code which assumed the return value was mutable, which will no longer be the case. A FutureWarning is now issued when this would occur.

Compatibility notes

The mask of a masked array view is also a view rather than a copy

There was a FutureWarning about this change in NumPy 1.11.x. In short, it is now the case that, when changing a view of a masked array, changes to the mask are propagated to the original. That was not previously the case. This change affects slices in particular. Note that this does not yet work properly if the mask of the original array is nomask and the mask of the view is changed. See gh-5580 for an extended discussion. The original behavior of having a copy of the mask can be obtained by calling the unshare_mask method of the view.

np.ma.masked is no longer writeable

Attempts to mutate the masked constant now error, as the underlying arrays are marked readonly. In the past, it was possible to get away with::
emulating a function that sometimes returns np.ma.masked
val = random.choice([np.ma.masked, 10]) var_arr = np.asarray(val) val_arr += 1 now errors, previously changed np.ma.masked.data

np.ma functions producing fill_values have changed

Previously, np.ma.default_fill_value would return a 0d array, but np.ma.minimum_fill_value and np.ma.maximum_fill_value would return a tuple of the fields. Instead, all three methods return a structured np.void object, which is what you would already find in the .fill_value attribute.

Additionally, the dtype guessing now matches that of np.array - so when passing a python scalar x, maximum_fill_value(x) is always the same as maximum_fill_value(np.array(x)). Previously x = long(1) on Python 2 violated this assumption.

a.flat.__array__() returns non-writeable arrays when a is non-contiguous

The intent is that the UPDATEIFCOPY array previously returned when a was non-contiguous will be replaced by a writeable copy in the future. This temporary measure is aimed to notify folks who expect the underlying array be modified in this situation that that will no longer be the case. The most likely places for this to be noticed is when expressions of the form np.asarray(a.flat) are used, or when a.flat is passed as the out parameter to a ufunc.

np.tensordot now returns zero array when contracting over 0-length dimension

Previously np.tensordot raised a ValueError when contracting over 0-length dimension. Now it returns a zero array, which is consistent with the behaviour of np.dot and np.einsum.

numpy.testing reorganized

This is not expected to cause problems, but possibly something has been left out. If you experience an unexpected import problem using numpy.testing let us know.

np.asfarray no longer accepts non-dtypes through the dtype argument

This previously would accept dtype=some_array, with the implied semantics of dtype=some_array.dtype. This was undocumented, unique across the numpy functions, and if used would likely correspond to a typo.

1D np.linalg.norm preserves float input types, even for arbitrary orders

Previously, this would promote to float64 when arbitrary orders were passed, despite not doing so under the simple cases::

>>> f32 = np.float32([1, 2]) >>> np.linalg.norm(f32, 2.0).dtype dtype('float32') >>> np.linalg.norm(f32, 2.0001).dtype dtype('float64') numpy 1.13 dtype('float32') numpy 1.14

This change affects only float32 and float16 arrays.

count_nonzero(arr, axis=()) now counts over no axes, not all axes

Elsewhere, axis==() is always understood as "no axes", but count_nonzero had a special case to treat this as "all axes". This was inconsistent and surprising. The correct way to count over all axes has always been to pass axis == None.

__init__.py files added to test directories

This is for pytest compatibility in the case of duplicate test file names in the different directories. As a result, run_module_suite no longer works, i.e., python <path-to-test-file> results in an error.

.astype(bool) on unstructured void arrays now calls bool on each element

On Python 2, void_array.astype(bool) would always return an array of True, unless the dtype is V0. On Python 3, this operation would usually crash. Going forwards, astype matches the behavior of bool(np.void), considering a buffer of all zeros as false, and anything else as true. Checks for V0 can still be done with arr.dtype.itemsize == 0.

MaskedArray.squeeze never returns np.ma.masked

np.squeeze is documented as returning a view, but the masked variant would sometimes return masked, which is not a view. This has been fixed, so that the result is always a view on the original masked array. This breaks any code that used masked_arr.squeeze() is np.ma.masked, but fixes code that writes to the result of .squeeze().

Renamed first parameter of can_cast from from to from_

The previous parameter name from is a reserved keyword in Python, which made it difficult to pass the argument by name. This has been fixed by renaming the parameter to from_.

isnat raises TypeError when passed wrong type

The ufunc isnat used to raise a ValueError when it was not passed variables of type datetime or timedelta. This has been changed to raising a TypeError.

dtype.__getitem__ raises TypeError when passed wrong type

When indexed with a float, the dtype object used to raise ValueError.

User-defined types now need to implement __str__ and __repr__

Previously, user-defined types could fall back to a default implementation of __str__ and __repr__ implemented in numpy, but this has now been removed. Now user-defined types will fall back to the python default object.__str__ and object.__repr__.

Many changes to array printing, disableable with the new "legacy" printing mode

The str and repr of ndarrays and numpy scalars have been changed in a variety of ways. These changes are likely to break downstream user's doctests.

These new behaviors can be disabled to mostly reproduce numpy 1.13 behavior by enabling the new 1.13 "legacy" printing mode. This is enabled by calling np.set_printoptions(legacy="1.13"), or using the new legacy argument to np.array2string, as np.array2string(arr, legacy='1.13').

In summary, the major changes are:

For floating-point types:

The repr of float arrays often omits a space previously printed in the sign position. See the new sign option to np.set_printoptions.

Floating-point arrays and scalars use a new algorithm for decimal representations, giving the shortest unique representation. This will usually shorten float16 fractional output, and sometimes float32 and float128 output. float64 should be unaffected. See the new floatmode option to np.set_printoptions.

Float arrays printed in scientific notation no longer use fixed-precision, and now instead show the shortest unique representation.

The str of floating-point scalars is no longer truncated in python2.

For other data types:

Non-finite complex scalars print like nanj instead of nan*j.

NaT values in datetime arrays are now properly aligned.

Arrays and scalars of np.void datatype are now printed using hex notation.

For line-wrapping:

The "dtype" part of ndarray reprs will now be printed on the next line if there isn't space on the last line of array output.

The linewidth format option is now always respected. The repr or str of an array will never exceed this, unless a single element is too wide.

The last line of an array string will never have more elements than earlier lines.

An extra space is no longer inserted on the first line if the elements are too wide.

For summarization (the use of ... to shorten long arrays):

A trailing comma is no longer inserted for str. Previously, str(np.arange(1001)) gave '[ 0 1 2 ..., 998 999 1000]', which has an extra comma.

For arrays of 2-D and beyond, when ... is printed on its own line in order to summarize any but the last axis, newlines are now appended to that line to match its leading newlines and a trailing space character is removed.

MaskedArray arrays now separate printed elements with commas, always print the dtype, and correctly wrap the elements of long arrays to multiple lines. If there is more than 1 dimension, the array attributes are now printed in a new "left-justified" printing style.

recarray arrays no longer print a trailing space before their dtype, and wrap to the right number of columns.

0d arrays no longer have their own idiosyncratic implementations of str and repr. The style argument to np.array2string is deprecated.

Arrays of bool datatype will omit the datatype in the repr.

User-defined dtypes (subclasses of np.generic) now need to implement __str__ and __repr__.

You may want to do something like::
FIXME: Set numpy array str/repr to legacy behaviour on numpy &gt; 1.13
try: np.set_printoptions(legacy='1.13') except TypeError: pass

after ::

import numpy as np

Some of these changes are described in more detail below.

C API changes

PyPy compatible alternative to UPDATEIFCOPY arrays

UPDATEIFCOPY arrays are contiguous copies of existing arrays, possibly with different dimensions, whose contents are copied back to the original array when their refcount goes to zero and they are deallocated. Because PyPy does not use refcounts, they do not function correctly with PyPy. NumPy is in the process of eliminating their use internally and two new C-API functions,

PyArray_SetWritebackIfCopyBase

PyArray_ResolveWritebackIfCopy,

have been added together with a complimentary flag, NPY_ARRAY_WRITEBACKIFCOPY. Using the new functionality also requires that some flags be changed when new arrays are created, to wit: NPY_ARRAY_INOUT_ARRAY should be replaced by NPY_ARRAY_INOUT_ARRAY2 and NPY_ARRAY_INOUT_FARRAY should be replaced by NPY_ARRAY_INOUT_FARRAY2. Arrays created with these new flags will then have the WRITEBACKIFCOPY semantics.

If PyPy compatibility is not a concern, these new functions can be ignored, although there will be a DeprecationWarning. If you do wish to pursue PyPy compatibility, more information on these functions and their use may be found in the c-api documentation and the example in how-to-extend.

.. _c-api: https://github.com/numpy/numpy/blob/master/doc/source/reference/c-api.array.rst .. _how-to-extend: https://github.com/numpy/numpy/blob/master/doc/source/user/c-info.how-to-extend.rst

New Features

Encoding argument for text IO functions

genfromtxt, loadtxt, fromregex and savetxt can now handle files with arbitrary encoding supported by Python via the encoding argument. For backward compatibility the argument defaults to the special bytes value which continues to treat text as raw byte values and continues to pass latin1 encoded bytes to custom converters. Using any other value (including None for system default) will switch the functions to real text IO so one receives unicode strings instead of bytes in the resulting arrays.

External nose plugins are usable by numpy.testing.Tester

numpy.testing.Tester is now aware of nose plugins that are outside the nose built-in ones. This allows using, for example, nose-timer like so: np.test(extra_argv=['--with-timer', '--timer-top-n', '20']) to obtain the runtime of the 20 slowest tests. An extra keyword timer was also added to Tester.test, so np.test(timer=20) will also report the 20 slowest tests.

parametrize decorator added to numpy.testing

A basic parametrize decorator is now available in numpy.testing. It is intended to allow rewriting yield based tests that have been deprecated in pytest so as to facilitate the transition to pytest in the future. The nose testing framework has not been supported for several years and looks like abandonware.

The new parametrize decorator does not have the full functionality of the one in pytest. It doesn't work for classes, doesn't support nesting, and does not substitute variable names. Even so, it should be adequate to rewrite the NumPy tests.

chebinterpolate function added to numpy.polynomial.chebyshev

The new chebinterpolate function interpolates a given function at the Chebyshev points of the first kind. A new Chebyshev.interpolate class method adds support for interpolation over arbitrary intervals using the scaled and shifted Chebyshev points of the first kind.

Support for reading lzma compressed text files in Python 3

With Python versions containing the lzma module the text IO functions can now transparently read from files with xz or lzma extension.

sign option added to np.setprintoptions and np.array2string

This option controls printing of the sign of floating-point types, and may be one of the characters '-', '+' or ' '. With '+' numpy always prints the sign of positive values, with ' ' it always prints a space (whitespace character) in the sign position of positive values, and with '-' it will omit the sign character for positive values. The new default is '-'.

This new default changes the float output relative to numpy 1.13. The old behavior can be obtained in 1.13 "legacy" printing mode, see compatibility notes above.

hermitian option added tonp.linalg.matrix_rank

The new hermitian option allows choosing between standard SVD based matrix rank calculation and the more efficient eigenvalue based method for symmetric/hermitian matrices.

threshold and edgeitems options added to np.array2string

These options could previously be controlled using np.set_printoptions, but now can be changed on a per-call basis as arguments to np.array2string.

concatenate and stack gained an out argument

A preallocated buffer of the desired dtype can now be used for the output of these functions.

Support for PGI flang compiler on Windows

The PGI flang compiler is a Fortran front end for LLVM released by NVIDIA under the Apache 2 license. It can be invoked by ::

python setup.py config --compiler=clang --fcompiler=flang install

There is little experience with this new compiler, so any feedback from people using it will be appreciated.

Improvements

Numerator degrees of freedom in random.noncentral_f need only be positive.

Prior to NumPy 1.14.0, the numerator degrees of freedom needed to be > 1, but the distribution is valid for values > 0, which is the new requirement.

The GIL is released for all np.einsum variations

Some specific loop structures which have an accelerated loop version did not release the GIL prior to NumPy 1.14.0. This oversight has been fixed.

The np.einsum function will use BLAS when possible and optimize by default

The np.einsum function will now call np.tensordot when appropriate. Because np.tensordot uses BLAS when possible, that will speed up execution. By default, np.einsum will also attempt optimization as the overhead is small relative to the potential improvement in speed.

f2py now handles arrays of dimension 0

f2py now allows for the allocation of arrays of dimension 0. This allows for more consistent handling of corner cases downstream.

numpy.distutils supports using MSVC and mingw64-gfortran together

Numpy distutils now supports using Mingw64 gfortran and MSVC compilers together. This enables the production of Python extension modules on Windows containing Fortran code while retaining compatibility with the binaries distributed by Python.org. Not all use cases are supported, but most common ways to wrap Fortran for Python are functional.

Compilation in this mode is usually enabled automatically, and can be selected via the --fcompiler and --compiler options to setup.py. Moreover, linking Fortran codes to static OpenBLAS is supported; by default a gfortran compatible static archive openblas.a is looked for.

np.linalg.pinv now works on stacked matrices

Previously it was limited to a single 2d array.

numpy.save aligns data to 64 bytes instead of 16

Saving NumPy arrays in the npy format with numpy.save inserts padding before the array data to align it at 64 bytes. Previously this was only 16 bytes (and sometimes less due to a bug in the code for version 2). Now the alignment is 64 bytes, which matches the widest SIMD instruction set commonly available, and is also the most common cache line size. This makes npy files easier to use in programs which open them with mmap, especially on Linux where an mmap offset must be a multiple of the page size.

NPZ files now can be written without using temporary files

In Python 3.6+ numpy.savez and numpy.savez_compressed now write directly to a ZIP file, without creating intermediate temporary files.

Better support for empty structured and string types

Structured types can contain zero fields, and string dtypes can contain zero characters. Zero-length strings still cannot be created directly, and must be constructed through structured dtypes::

str0 = np.empty(10, np.dtype([('v', str, N)]))['v'] void0 = np.empty(10, np.void)

It was always possible to work with these, but the following operations are now supported for these arrays:

arr.sort()

arr.view(bytes)

arr.resize(...)

pickle.dumps(arr)

Support for decimal.Decimal in np.lib.financial

Unless otherwise stated all functions within the financial package now support using the decimal.Decimal built-in type.

Float printing now uses "dragon4" algorithm for shortest decimal representation

The str and repr of floating-point values (16, 32, 64 and 128 bit) are now printed to give the shortest decimal representation which uniquely identifies the value from others of the same type. Previously this was only true for float64 values. The remaining float types will now often be shorter than in numpy 1.13. Arrays printed in scientific notation now also use the shortest scientific representation, instead of fixed precision as before.

Additionally, the str of float scalars scalars will no longer be truncated in python2, unlike python2 floats. np.double scalars now have a str and repr identical to that of a python3 float.

New functions np.format_float_scientific and np.format_float_positional are provided to generate these decimal representations.

A new option floatmode has been added to np.set_printoptions and np.array2string, which gives control over uniqueness and rounding of printed elements in an array. The new default is floatmode='maxprec' with precision=8, which will print at most 8 fractional digits, or fewer if an element can be uniquely represented with fewer. A useful new mode is floatmode="unique", which will output enough digits to specify the array elements uniquely.

Numpy complex-floating-scalars with values like inf*j or nan*j now print as infj and nanj, like the pure-python complex type.

The FloatFormat and LongFloatFormat classes are deprecated and should both be replaced by FloatingFormat. Similarly ComplexFormat and LongComplexFormat should be replaced by ComplexFloatingFormat.

void datatype elements are now printed in hex notation

A hex representation compatible with the python bytes type is now printed for unstructured np.void elements, e.g., V4 datatype. Previously, in python2 the raw void data of the element was printed to stdout, or in python3 the integer byte values were shown.

printing style for void datatypes is now independently customizable

The printing style of np.void arrays is now independently customizable using the formatter argument to np.set_printoptions, using the 'void' key, instead of the catch-all numpystr key as before.

Reduced memory usage of np.loadtxt

np.loadtxt now reads files in chunks instead of all at once which decreases its memory usage significantly for large files.

Changes

Multiple-field indexing/assignment of structured arrays

The indexing and assignment of structured arrays with multiple fields has changed in a number of ways, as warned about in previous releases.

First, indexing a structured array with multiple fields, e.g., arr[['f1', 'f3']], returns a view into the original array instead of a copy. The returned view will have extra padding bytes corresponding to intervening fields in the original array, unlike the copy in 1.13, which will affect code such as arr[['f1', 'f3']].view(newdtype).

Second, assignment between structured arrays will now occur "by position" instead of "by field name". The Nth field of the destination will be set to the Nth field of the source regardless of field name, unlike in numpy versions 1.6 to 1.13 in which fields in the destination array were set to the identically-named field in the source array or to 0 if the source did not have a field.

Correspondingly, the order of fields in a structured dtypes now matters when computing dtype equality. For example, with the dtypes ::

x = dtype({'names': ['A', 'B'], 'formats': ['i4', 'f4'], 'offsets': [0, 4]}) y = dtype({'names': ['B', 'A'], 'formats': ['f4', 'i4'], 'offsets': [4, 0]})

the expression x == y will now return False, unlike before. This makes dictionary based dtype specifications like dtype({'a': ('i4', 0), 'b': ('f4', 4)}) dangerous in python < 3.6 since dict key order is not preserved in those versions.

Assignment from a structured array to a boolean array now raises a ValueError, unlike in 1.13, where it always set the destination elements to True.

Assignment from structured array with more than one field to a non-structured array now raises a ValueError. In 1.13 this copied just the first field of the source to the destination.

Using field "titles" in multiple-field indexing is now disallowed, as is repeating a field name in a multiple-field index.

The documentation for structured arrays in the user guide has been significantly updated to reflect these changes.

Integer and Void scalars are now unaffected by np.set_string_function

Previously, unlike most other numpy scalars, the str and repr of integer and void scalars could be controlled by np.set_string_function. This is no longer possible.

0d array printing changed, style arg of array2string deprecated

Previously the str and repr of 0d arrays had idiosyncratic implementations which returned str(a.item()) and 'array(' + repr(a.item()) + ')' respectively for 0d array a, unlike both numpy scalars and higher dimension ndarrays.

Now, the str of a 0d array acts like a numpy scalar using str(a[()]) and the repr acts like higher dimension arrays using formatter(a[()]), where formatter can be specified using np.set_printoptions. The style argument of np.array2string is deprecated.

This new behavior is disabled in 1.13 legacy printing mode, see compatibility notes above.

Seeding RandomState using an array requires a 1-d array

RandomState previously would accept empty arrays or arrays with 2 or more dimensions, which resulted in either a failure to seed (empty arrays) or for some of the passed values to be ignored when setting the seed.

MaskedArray objects show a more useful repr

The repr of a MaskedArray is now closer to the python code that would produce it, with arrays now being shown with commas and dtypes. Like the other formatting changes, this can be disabled with the 1.13 legacy printing mode in order to help transition doctests.

The repr of np.polynomial classes is more explicit

It now shows the domain and window parameters as keyword arguments to make them more clear::

>>> np.polynomial.Polynomial(range(4)) Polynomial([0., 1., 2., 3.], domain=[-1, 1], window=[-1, 1])

==========================

pandas 0.20.3 -> 0.22.0

0.22.0

This is a major release from 0.21.1 and includes a single, API-breaking change. We recommend that all users upgrade to this version after carefully reading the release note (singular!).

.. _whatsnew_0220.api_breaking:

Backwards incompatible API changes


Pandas 0.22.0 changes the handling of empty and all-*NA* sums and products. The
summary is that

* The sum of an empty or all-*NA* ``Series`` is now ``0``
* The product of an empty or all-*NA* ``Series`` is now ``1``
* We&#39;ve added a ``min_count`` parameter to ``.sum()`` and ``.prod()`` controlling
 the minimum number of valid values for the result to be valid. If fewer than
 ``min_count`` non-*NA* values are present, the result is *NA*. The default is
 ``0``. To return ``NaN``, the 0.21 behavior, use ``min_count=1``.

Some background: In pandas 0.21, we fixed a long-standing inconsistency
in the return value of all-*NA* series depending on whether or not bottleneck
was installed. See :ref:`whatsnew_0210.api_breaking.bottleneck`. At the same
time, we changed the sum and prod of an empty ``Series`` to also be ``NaN``.

Based on feedback, we&#39;ve partially reverted those changes.

Arithmetic Operations
^^^^^^^^^^^^^^^^^^^^^

The default sum for empty or all-*NA* ``Series`` is now ``0``.

*pandas 0.21.x*

.. code-block:: ipython

  In [1]: pd.Series([]).sum()
  Out[1]: nan

  In [2]: pd.Series([np.nan]).sum()
  Out[2]: nan

*pandas 0.22.0*

.. ipython:: python

  pd.Series([]).sum()
  pd.Series([np.nan]).sum()

The default behavior is the same as pandas 0.20.3 with bottleneck installed. It
also matches the behavior of NumPy&#39;s ``np.nansum`` on empty and all-*NA* arrays.

To have the sum of an empty series return ``NaN`` (the default behavior of
pandas 0.20.3 without bottleneck, or pandas 0.21.x), use the ``min_count``
keyword.

.. ipython:: python

  pd.Series([]).sum(min_count=1)

Thanks to the ``skipna`` parameter, the ``.sum`` on an all-*NA*
series is conceptually the same as the ``.sum`` of an empty one with
``skipna=True`` (the default).

.. ipython:: python

  pd.Series([np.nan]).sum(min_count=1)   skipna=True by default

The ``min_count`` parameter refers to the minimum number of *non-null* values
required for a non-NA sum or product.

:meth:`Series.prod` has been updated to behave the same as :meth:`Series.sum`,
returning ``1`` instead.

.. ipython:: python

  pd.Series([]).prod()
  pd.Series([np.nan]).prod()
  pd.Series([]).prod(min_count=1)

These changes affect :meth:`DataFrame.sum` and :meth:`DataFrame.prod` as well.
Finally, a few less obvious places in pandas are affected by this change.

Grouping by a Categorical
^^^^^^^^^^^^^^^^^^^^^^^^^

Grouping by a ``Categorical`` and summing now returns ``0`` instead of
``NaN`` for categories with no observations. The product now returns ``1``
instead of ``NaN``.

*pandas 0.21.x*

.. code-block:: ipython

  In [8]: grouper = pd.Categorical([&#39;a&#39;, &#39;a&#39;], categories=[&#39;a&#39;, &#39;b&#39;])

  In [9]: pd.Series([1, 2]).groupby(grouper).sum()
  Out[9]:
  a    3.0
  b    NaN
  dtype: float64

*pandas 0.22*

.. ipython:: python

  grouper = pd.Categorical([&#39;a&#39;, &#39;a&#39;], categories=[&#39;a&#39;, &#39;b&#39;])
  pd.Series([1, 2]).groupby(grouper).sum()

To restore the 0.21 behavior of returning ``NaN`` for unobserved groups,
use ``min_count&gt;=1``.

.. ipython:: python

  pd.Series([1, 2]).groupby(grouper).sum(min_count=1)

Resample
^^^^^^^^

The sum and product of all-*NA* bins has changed from ``NaN`` to ``0`` for
sum and ``1`` for product.

*pandas 0.21.x*

.. code-block:: ipython

  In [11]: s = pd.Series([1, 1, np.nan, np.nan],
     ...:                index=pd.date_range(&#39;2017&#39;, periods=4))
     ...:  s
  Out[11]:
  2017-01-01    1.0
  2017-01-02    1.0
  2017-01-03    NaN
  2017-01-04    NaN
  Freq: D, dtype: float64

  In [12]: s.resample(&#39;2d&#39;).sum()
  Out[12]:
  2017-01-01    2.0
  2017-01-03    NaN
  Freq: 2D, dtype: float64

*pandas 0.22.0*

.. ipython:: python

  s = pd.Series([1, 1, np.nan, np.nan],
                index=pd.date_range(&#39;2017&#39;, periods=4))
  s.resample(&#39;2d&#39;).sum()

To restore the 0.21 behavior of returning ``NaN``, use ``min_count&gt;=1``.

.. ipython:: python

  s.resample(&#39;2d&#39;).sum(min_count=1)

In particular, upsampling and taking the sum or product is affected, as
upsampling introduces missing values even if the original series was
entirely valid.

*pandas 0.21.x*

.. code-block:: ipython

  In [14]: idx = pd.DatetimeIndex([&#39;2017-01-01&#39;, &#39;2017-01-02&#39;])

  In [15]: pd.Series([1, 2], index=idx).resample(&#39;12H&#39;).sum()
  Out[15]:
  2017-01-01 00:00:00    1.0
  2017-01-01 12:00:00    NaN
  2017-01-02 00:00:00    2.0
  Freq: 12H, dtype: float64

*pandas 0.22.0*

.. ipython:: python

  idx = pd.DatetimeIndex([&#39;2017-01-01&#39;, &#39;2017-01-02&#39;])
  pd.Series([1, 2], index=idx).resample(&quot;12H&quot;).sum()

Once again, the ``min_count`` keyword is available to restore the 0.21 behavior.

.. ipython:: python

  pd.Series([1, 2], index=idx).resample(&quot;12H&quot;).sum(min_count=1)

Rolling and Expanding
^^^^^^^^^^^^^^^^^^^^^

Rolling and expanding already have a ``min_periods`` keyword that behaves
similar to ``min_count``. The only case that changes is when doing a rolling
or expanding sum with ``min_periods=0``. Previously this returned ``NaN``,
when fewer than ``min_periods`` non-*NA* values were in the window. Now it
returns ``0``.

*pandas 0.21.1*

.. code-block:: ipython

  In [17]: s = pd.Series([np.nan, np.nan])

  In [18]: s.rolling(2, min_periods=0).sum()
  Out[18]:
  0   NaN
  1   NaN
  dtype: float64

*pandas 0.22.0*

.. ipython:: python

  s = pd.Series([np.nan, np.nan])
  s.rolling(2, min_periods=0).sum()

The default behavior of ``min_periods=None``, implying that ``min_periods``
equals the window size, is unchanged.

Compatibility

If you maintain a library that should work across pandas versions, it may be easiest to exclude pandas 0.21 from your requirements. Otherwise, all your sum() calls would need to check if the Series is empty before summing.

With setuptools, in your setup.py use::

install_requires=['pandas!=0.21.*', ...]

With conda, use

.. code-block:: yaml

requirements: run:

pandas !=0.21.0,!=0.21.1

Note that the inconsistency in the return value for all-NA series is still there for pandas 0.20.3 and earlier. Avoiding pandas 0.21 will only help with the empty case.

.. _whatsnew_0211:

0.21.1

This is a minor bug-fix release in the 0.21.x series and includes some small regression fixes, bug fixes and performance improvements. We recommend that all users upgrade to this version.

Highlights include:

Temporarily restore matplotlib datetime plotting functionality. This should resolve issues for users who implicitly relied on pandas to plot datetimes with matplotlib. See :ref:here <whatsnew_0211.converters>.
Improvements to the Parquet IO functions introduced in 0.21.0. See :ref:here <whatsnew_0211.enhancements.parquet>.

.. contents:: What's new in v0.21.1 :local: :backlinks: none

.. _whatsnew_0211.converters:

Restore Matplotlib datetime Converter Registration


Pandas implements some matplotlib converters for nicely formatting the axis
labels on plots with ``datetime`` or ``Period`` values. Prior to pandas 0.21.0,
these were implicitly registered with matplotlib, as a side effect of ``import
pandas``.

In pandas 0.21.0, we required users to explicitly register the
converter. This caused problems for some users who relied on those converters
being present for regular ``matplotlib.pyplot`` plotting methods, so we&#39;re
temporarily reverting that change; pandas 0.21.1 again registers the converters on
import, just like before 0.21.0.

We&#39;ve added a new option to control the converters:
``pd.options.plotting.matplotlib.register_converters``. By default, they are
registered. Toggling this to ``False`` removes pandas&#39; formatters and restore
any converters we overwrote when registering them (:issue:`18301`).

We&#39;re working with the matplotlib developers to make this easier. We&#39;re trying
to balance user convenience (automatically registering the converters) with
import performance and best practices (importing pandas shouldn&#39;t have the side
effect of overwriting any custom converters you&#39;ve already set). In the future
we hope to have most of the datetime formatting functionality in matplotlib,
with just the pandas-specific converters in pandas. We&#39;ll then gracefully
deprecate the automatic registration of converters in favor of users explicitly
registering them when they want them.

.. _whatsnew_0211.enhancements:

New features

.. _whatsnew_0211.enhancements.parquet:

Improvements to the Parquet IO functionality ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:func:DataFrame.to_parquet will now write non-default indexes when the underlying engine supports it. The indexes will be preserved when reading back in with :func:read_parquet (:issue:18581).
:func:read_parquet now allows to specify the columns to read from a parquet file (:issue:18154)
:func:read_parquet now allows to specify kwargs which are passed to the respective engine (:issue:18216)

.. _whatsnew_0211.enhancements.other:

Other Enhancements ^^^^^^^^^^^^^^^^^^

:meth:Timestamp.timestamp is now available in Python 2.7. (:issue:17329)
:class:Grouper and :class:TimeGrouper now have a friendly repr output (:issue:18203).

.. _whatsnew_0211.deprecations:

Deprecations


- ``pandas.tseries.register`` has been renamed to
 :func:`pandas.plotting.register_matplotlib_converters`` (:issue:`18301`)

.. _whatsnew_0211.performance:

Performance Improvements

Improved performance of plotting large series/dataframes (:issue:18236).

.. _whatsnew_0211.bug_fixes:

Bug Fixes



Conversion
^^^^^^^^^^

- Bug in :class:`TimedeltaIndex` subtraction could incorrectly overflow when ``NaT`` is present (:issue:`17791`)
- Bug in :class:`DatetimeIndex` subtracting datetimelike from DatetimeIndex could fail to overflow (:issue:`18020`)
- Bug in :meth:`IntervalIndex.copy` when copying and ``IntervalIndex`` with non-default ``closed`` (:issue:`18339`)
- Bug in :func:`DataFrame.to_dict` where columns of datetime that are tz-aware were not converted to required arrays when used with ``orient=&#39;records&#39;``, raising``TypeError` (:issue:`18372`)
- Bug in :class:`DateTimeIndex` and :meth:`date_range` where mismatching tz-aware ``start`` and ``end`` timezones would not raise an err if ``end.tzinfo`` is None (:issue:`18431`)
- Bug in :meth:`Series.fillna` which raised when passed a long integer on Python 2 (:issue:`18159`).

Indexing
^^^^^^^^

- Bug in a boolean comparison of a ``datetime.datetime`` and a ``datetime64[ns]`` dtype Series (:issue:`17965`)
- Bug where a ``MultiIndex`` with more than a million records was not raising ``AttributeError`` when trying to access a missing attribute (:issue:`18165`)
- Bug in :class:`IntervalIndex` constructor when a list of intervals is passed with non-default ``closed`` (:issue:`18334`)
- Bug in ``Index.putmask`` when an invalid mask passed (:issue:`18368`)
- Bug in masked assignment of a ``timedelta64[ns]`` dtype ``Series``, incorrectly coerced to float (:issue:`18493`)

I/O
^^^

- Bug in class:`~pandas.io.stata.StataReader` not converting date/time columns with display formatting addressed (:issue:`17990`). Previously columns with display formatting were normally left as ordinal numbers and not converted to datetime objects.
- Bug in :func:`read_csv` when reading a compressed UTF-16 encoded file (:issue:`18071`)
- Bug in :func:`read_csv` for handling null values in index columns when specifying ``na_filter=False`` (:issue:`5239`)
- Bug in :func:`read_csv` when reading numeric category fields with high cardinality (:issue:`18186`)
- Bug in :meth:`DataFrame.to_csv` when the table had ``MultiIndex`` columns, and a list of strings was passed in for ``header`` (:issue:`5539`)
- Bug in parsing integer datetime-like columns with specified format in ``read_sql`` (:issue:`17855`).
- Bug in :meth:`DataFrame.to_msgpack` when serializing data of the ``numpy.bool_`` datatype (:issue:`18390`)
- Bug in :func:`read_json` not decoding when reading line deliminted JSON from S3 (:issue:`17200`)
- Bug in :func:`pandas.io.json.json_normalize` to avoid modification of ``meta`` (:issue:`18610`)
- Bug in :func:`to_latex` where repeated multi-index values were not printed even though a higher level index differed from the previous row (:issue:`14484`)
- Bug when reading NaN-only categorical columns in :class:`HDFStore` (:issue:`18413`)
- Bug in :meth:`DataFrame.to_latex` with ``longtable=True`` where a latex multicolumn always spanned over three columns (:issue:`17959`)

Plotting
^^^^^^^^

- Bug in ``DataFrame.plot()`` and ``Series.plot()`` with :class:`DatetimeIndex` where a figure generated by them is not pickleable in Python 3 (:issue:`18439`)

Groupby/Resample/Rolling
^^^^^^^^^^^^^^^^^^^^^^^^

- Bug in ``DataFrame.resample(...).apply(...)`` when there is a callable that returns different columns (:issue:`15169`)
- Bug in ``DataFrame.resample(...)`` when there is a time change (DST) and resampling frequecy is 12h or higher (:issue:`15549`)
- Bug in ``pd.DataFrameGroupBy.count()`` when counting over a datetimelike column (:issue:`13393`)
- Bug in ``rolling.var`` where calculation is inaccurate with a zero-valued array (:issue:`18430`)

Reshaping
^^^^^^^^^

- Error message in ``pd.merge_asof()`` for key datatype mismatch now includes datatype of left and right key (:issue:`18068`)
- Bug in ``pd.concat`` when empty and non-empty DataFrames or Series are concatenated (:issue:`18178` :issue:`18187`)
- Bug in ``DataFrame.filter(...)`` when :class:`unicode` is passed as a condition in Python 2 (:issue:`13101`)
- Bug when merging empty DataFrames when ``np.seterr(divide=&#39;raise&#39;)`` is set (:issue:`17776`)

Numeric
^^^^^^^

- Bug in ``pd.Series.rolling.skew()`` and ``rolling.kurt()`` with all equal values has floating issue (:issue:`18044`)

Categorical
^^^^^^^^^^^

- Bug in :meth:`DataFrame.astype` where casting to &#39;category&#39; on an empty ``DataFrame`` causes a segmentation fault (:issue:`18004`)
- Error messages in the testing module have been improved when items have different ``CategoricalDtype`` (:issue:`18069`)
- ``CategoricalIndex`` can now correctly take a ``pd.api.types.CategoricalDtype`` as its dtype (:issue:`18116`)
- Bug in ``Categorical.unique()`` returning read-only ``codes``  array when all categories were ``NaN`` (:issue:`18051`)
- Bug in ``DataFrame.groupby(axis=1)`` with a ``CategoricalIndex`` (:issue:`18432`)

String
^^^^^^

- :meth:`Series.str.split()` will now propagate ``NaN`` values across all expanded columns instead of ``None`` (:issue:`18450`)

.. _whatsnew_060:

### 0.21.0

--------------------------

This is a major release from 0.20.3 and includes a number of API changes, deprecations, new features,
enhancements, and performance improvements along with a large number of bug fixes. We recommend that all
users upgrade to this version.

Highlights include:

- Integration with `Apache Parquet &lt;https://parquet.apache.org/&gt;`__, including a new top-level :func:`read_parquet` function and :meth:`DataFrame.to_parquet` method, see :ref:`here &lt;whatsnew_0210.enhancements.parquet&gt;`.
- New user-facing :class:`pandas.api.types.Categoric

jason-neal / companion_simulations