Scheduled daily dependency update on Thursday

Updates

Here's a list of all the updates bundled in this pull request. I've added some links to make it easier for you to find all the information you need.

celery	4.0.2	»	4.1.0	PyPI | Changelog | Homepage | Docs
earthengine-api	0.1.117	»	0.1.119	PyPI | Changelog | Homepage
numba	0.34.0	»	0.34.0	PyPI | Changelog | Repo
psycopg2	2.7.1	»	2.7.3	PyPI | Changelog | Homepage | Docs
requests	2.18.1	»	2.18.2	PyPI | Changelog | Homepage

Changelogs

celery 4.0.2 -> 4.1.0

4.1.0

Release date: 2017-07-25 00:00 PM PST. Released by: Omer Katz.

Configuration: CELERY_SEND_EVENTS instead of CELERYD_SEND_EVENTS for 3.1.x compatibility (3997)

Contributed by abhinav nilaratna.

App: Restore behavior so Broadcast queues work. (3934)

Contributed by Patrick Cloke.

Sphinx: Make appstr use standard format (4134) (4139)

Contributed by Preston Moore.

App: Make id, name always accessible from logging.Formatter via extra (3994)

Contributed by Yoichi NAKAYAMA.

Worker: Add worker_shutting_down signal (3998)

Contributed by Daniel Huang.

PyPy: Support PyPy version 5.8.0 (4128)

Contributed by Omer Katz.

Results: Elasticsearch: Fix serializing keys (3924)

Contributed by :github_user:staticfox.

Canvas: Deserialize all tasks in a chain (4015)

Contributed by :github_user:fcoelho.

Systemd: Recover loglevel for ExecStart in systemd config (4023)

Contributed by Yoichi NAKAYAMA.

Sphinx: Use the Sphinx add_directive_to_domain API. (4037)

Contributed by Patrick Cloke.

App: Pass properties to before_task_publish signal (4035)

Contributed by Javier Domingo Cansino.

Results: Add SSL option for Redis backends (3831)

Contributed by Chris Kuehl.

Beat: celery.schedule.crontab: fix reduce (3826) (3827)

Contributed by Taylor C. Richberger.

State: Fix celery issues when using flower REST API

Contributed by Thierry RAMORASOAVINA.

Results: Elasticsearch: Fix serializing document id.

Contributed by Acey9.

Beat: Make shallow copy of schedules dictionary

Contributed by Brian May.

Beat: Populate heap when periodic tasks are changed

Contributed by Wojciech Żywno.

Task: Allow class methods to define tasks (3952)

Contributed by georgepsarakis.

Platforms: Always return boolean value when checking if signal is supported (3962).

Contributed by Jian Yu.

Canvas: Avoid duplicating chains in chords (3779)

Contributed by Ryan Hiebert.

Canvas: Lookup task only if list has items (3847)

Contributed by Marc Gibbons.

Results: Allow unicode message for exception raised in task (3903)

Contributed by George Psarakis.

Python3: Support for Python 3.6 (3904, 3903, 3736)

Contributed by Jon Dufresne, George Psarakis, Asif Saifuddin Auvi, Omer Katz.

App: Fix retried tasks with expirations (3790)

Contributed by Brendan MacDonell.

Fixes items format route in docs (3875)

Contributed by Slam.

Utils: Fix maybe_make_aware (3850)

Contributed by Taylor C. Richberger.

Task: Fix task ETA issues when timezone is defined in configuration (3867)

Contributed by George Psarakis.

Concurrency: Consumer does not shut down properly when embedded in gevent application (3746)

Contributed by Arcadiy Ivanov.

Canvas: Fix 3725: Task replaced with group does not complete (3731)

Contributed by Morgan Doocy.

Task: Correct order in chains with replaced tasks (3730)

Contributed by Morgan Doocy.

Result: Enable synchronous execution of sub-tasks (3696)

Contributed by shalev67.

Task: Fix request context for blocking task apply (added hostname) (3716)

Contributed by Marat Sharafutdinov.

Utils: Fix task argument handling (3678) (3693)

Contributed by Roman Sichny.

Beat: Provide a transparent method to update the Scheduler heap (3721)

Contributed by Alejandro Pernin.

Beat: Specify default value for pidfile option of celery beat. (3722)

Contributed by Arnaud Rocher.

Results: Elasticsearch: Stop generating a new field every time when a new result is being put (3708)

Contributed by Mike Chen.
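
The crontab entry above (3826, 3827) concerns how a schedule object is rebuilt after pickling. As a rough illustration of the general mechanism only, and not Celery's actual code, a class can define `__reduce__` so that pickle reconstructs it from its original constructor arguments. The `CronLike` class and its attributes are hypothetical.

```python
import pickle


class CronLike:
    """Hypothetical stand-in for a schedule object; not Celery's crontab."""

    def __init__(self, minute="*", hour="*"):
        # Keep the original specs so the object can be rebuilt exactly.
        self._orig_minute = minute
        self._orig_hour = hour
        # Derived state is recomputed on construction rather than pickled.
        self.minutes = self._expand(minute, 60)
        self.hours = self._expand(hour, 24)

    @staticmethod
    def _expand(spec, limit):
        # This sketch only handles "*" and a single integer.
        return set(range(limit)) if spec == "*" else {int(spec)}

    def __reduce__(self):
        # Tell pickle to call CronLike(minute, hour) when loading.
        return (self.__class__, (self._orig_minute, self._orig_hour))


# Round-trip through pickle: the restored object is rebuilt via __init__.
restored = pickle.loads(pickle.dumps(CronLike(minute="30", hour="*")))
```

Returning the constructor arguments from `__reduce__` keeps the pickled form small and guarantees the derived sets are rebuilt consistently.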

Requirements

Now depends on Kombu 4.1.0.

Results: Elasticsearch now reuses fields when new results are added.

Contributed by Mike Chen.

Results: Fixed MongoDB integration when using binary encodings (Issue 3575).

Contributed by Andrew de Quincey.

Worker: Making missing *args and **kwargs in Task protocol 1 return empty value in protocol 2 (Issue 3687).

Contributed by Roman Sichny.

App: Fixed TypeError in AMQP when using deprecated signal (Issue 3707).

Contributed by :github_user:michael-k.

Beat: Added a transparent method to update the scheduler heap.

Contributed by Alejandro Pernin.

Task: Fixed handling of tasks with keyword arguments on Python 3 (Issue 3657).

Contributed by Roman Sichny.

Task: Fixed request context for blocking task apply by adding missing hostname attribute.

Contributed by Marat Sharafutdinov.

Task: Added option to run subtasks synchronously with disable_sync_subtasks argument.

Contributed by :github_user:shalev67.

App: Fixed chaining of replaced tasks (Issue 3726).

Contributed by Morgan Doocy.

Canvas: Fixed bug where replaced tasks with groups were not completing (Issue 3725).

Contributed by Morgan Doocy.

Worker: Fixed problem where consumer does not shut down properly when embedded in a gevent application (Issue 3745).

Contributed by Arcadiy Ivanov.

Results: Added support for using AWS DynamoDB as a result backend (3736).

Contributed by George Psarakis.

Testing: Added caching on pip installs.

Contributed by :github_user:orf.

Worker: Prevent consuming queue before ready on startup (Issue 3620).

Contributed by Alan Hamlett.

App: Fixed task ETA issues when timezone is defined in configuration (Issue 3753).

Contributed by George Psarakis.

Utils: maybe_make_aware should not modify datetime when it is already timezone-aware (Issue 3849).

Contributed by Taylor C. Richberger.

App: Fixed retrying tasks with expirations (Issue 3734).

Contributed by Brendan MacDonell.

Results: Allow unicode message for exceptions raised in task (Issue 3858).

Contributed by :github_user:staticfox.

Canvas: Fixed IndexError raised when chord has an empty header.

Contributed by Marc Gibbons.

Canvas: Avoid duplicating chains in chords (Issue 3771).

Contributed by Ryan Hiebert and George Psarakis.

Utils: Allow class methods to define tasks (Issue 3863).

Contributed by George Psarakis.

Beat: Populate heap when periodic tasks are changed.

Contributed by :github_user:wzywno and Brian May.

Results: Added support for Elasticsearch backend options settings.

Contributed by :github_user:Acey9.

Events: Ensure Task.as_dict() works when not all information about task is available.

Contributed by :github_user:tramora.

Schedules: Fixed pickled crontab schedules to restore properly (Issue 3826).

Contributed by Taylor C. Richberger.

Results: Added SSL option for redis backends (Issue 3830).

Contributed by Chris Kuehl.
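
The maybe_make_aware entry in the list above (Issue 3849) says an already timezone-aware datetime should be left untouched. A minimal sketch of that idea using only the standard library follows. The helper name `ensure_aware` and its default of UTC are assumptions made for illustration, not Celery's implementation.

```python
from datetime import datetime, timedelta, timezone


def ensure_aware(dt, tz=timezone.utc):
    """Hypothetical helper: attach tz only when dt is naive."""
    if dt.tzinfo is None or dt.tzinfo.utcoffset(dt) is None:
        # Naive datetime: interpret it in the given timezone.
        return dt.replace(tzinfo=tz)
    # Already aware: return the very same object, unmodified.
    return dt


naive = datetime(2017, 7, 25, 12, 0)
aware = datetime(2017, 7, 25, 12, 0, tzinfo=timezone(timedelta(hours=2)))
```

The check on `utcoffset` follows the standard library's own definition of an aware datetime, so a `tzinfo` object that returns no offset is still treated as naive.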

Documentation and examples improvements by:

Bruno Alla

Jamie Alessio

Vivek Anand

Peter Bittner

Kalle Bronsen

Jon Dufresne

James Michael DuPont

Sergey Fursov

Samuel Dion-Girardeau

Daniel Hahler

Mike Helmick

Marc Hörsken

Christopher Hoskin

Daniel Huang

Primož Kerin

Michal Kuffa

Simon Legner

Anthony Lukach

Ed Morley

Jay McGrath

Rico Moorman

Viraj Navkal

Ross Patterson

Dmytro Petruk

Luke Plant

Eric Poelke

Salvatore Rinchiera

Arnaud Rocher

Kirill Romanov

Simon Schmidt

Tamer Sherif

YuLun Shih

Ask Solem

Tom 'Biwaa' Riat

Arthur Vigil

Joey Wilhelm

Jian Yu

:github_user:baixuexue123

:github_user:bronsen

:github_user:michael-k

:github_user:orf

:github_user:3lnc

numba -> 0.34.0

0.34.0

This release adds a significant set of new features arising from combined work with Intel on ParallelAccelerator technology. It also adds list comprehension and closure support, support for Numpy 1.13, and a new, faster CUDA reduction algorithm. For Linux users, this release is the first to be built on CentOS 6, which will be the base platform for future releases. Finally, a number of thread-safety and type-inference bugs have been fixed, along with other smaller fixes and enhancements.

ParallelAccelerator features:

NOTE: The ParallelAccelerator technology is under active development and should be considered experimental.

The ParallelAccelerator technology is accessed via a new "nopython" mode option "parallel". The ParallelAccelerator technology attempts to identify operations which have parallel semantics (for instance adding a scalar to a vector), fuse together adjacent such operations, and then parallelize their execution across a number of CPU cores. This is essentially auto-parallelization.

In addition to the auto-parallelization feature, explicit loop based parallelism is made available through the use of prange in place of range as a loop iterator.

More information and examples on both auto-parallelization and prange are available in the documentation and examples directory respectively.
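
As a small sketch of the explicit loop parallelism described above, the function below uses `prange` inside a function compiled with `parallel=True`. The function name `sum_of_squares` is made up for illustration. The `ImportError` fallback is also an assumption added here so the snippet still runs as plain Python where Numba is not installed.

```python
try:
    from numba import njit, prange
except ImportError:
    # Assumed fallback for illustration: without Numba, run as plain Python.
    prange = range

    def njit(*args, **kwargs):
        # Support both the bare @njit form and @njit(parallel=True).
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit(parallel=True)
def sum_of_squares(n):
    # A float accumulator keeps the reduction type stable across iterations.
    total = 0.0
    # prange marks the loop as safe to split across CPU cores;
    # "total +=" is treated as a reduction.
    for i in prange(n):
        total += i * i
    return total


result = sum_of_squares(10)
```

Replacing `range` with `prange` is only valid when iterations are independent apart from simple reductions such as the sum shown here.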

As part of the necessary work for ParallelAccelerator, support for closures and list comprehensions is added:

PR 2318: Transfer ParallelAccelerator technology to Numba

PR 2379: ParallelAccelerator Core Improvements

PR 2367: Add support for len(range(...))

PR 2369: List comprehension

PR 2391: Explicit Parallel Loop Support (prange)

The ParallelAccelerator features are available on all supported platforms and Python versions, with the following exceptions (support is planned for a future release):

The combination of Windows operating systems with Python 2.7.

Systems running 32 bit Python.

CUDA support enhancements:

PR 2377: New GPU reduction algorithm

CUDA support fixes:

PR 2397: Fix 2393, always set alignment of cuda static memory regions

Misc Fixes:

PR 2373, Issue 2372: 32-bit compatibility fix for parfor related code

PR 2376: Fix 2375 missing stdint.h for py2.7 vc9

PR 2378: Fix deadlock in parallel gufunc when kernel acquires the GIL.

PR 2382: Forbid unsafe casting in bitwise operation

PR 2385: docs: fix Sphinx errors

PR 2396: Use 64-bit RHS operand for shift

PR 2404: Fix threadsafety logic issue in ufunc compilation cache.

PR 2424: Ensure consistent iteration order of blocks for type inference.

PR 2425: Guard code to prevent the use of 'parallel' on win32 + py27

PR 2426: Basic test for Enum member type recovery.

PR 2433: Fix up the parfors tests with respect to windows py2.7

PR 2442: Skip tests that need BLAS/LAPACK if scipy is not available.

PR 2444: Add test for invalid array setitem

PR 2449: Make the runtime initialiser threadsafe

PR 2452: Skip CFG test on 64bit windows

Misc Enhancements:

PR 2366: Improvements to IR utils

PR 2388: Update README.rst to indicate the proper version of LLVM

PR 2394: Upgrade to llvmlite 0.19.*

PR 2395: Update llvmlite version to 0.19

PR 2406: Expose environment object to ufuncs

PR 2407: Expose environment object to target-context inside lowerer

PR 2413: Add flags to pass through to conda build for buildbot

PR 2414: Add cross compile flags to local recipe

PR 2415: A few cleanups for rewrites

PR 2418: Add getitem support for Enum classes

PR 2419: Add support for returning enums in vectorize

PR 2421: Add copyright notice for Intel contributed files.

PR 2422: Patch code base to work with np 1.13 release

PR 2448: Adds in warning message when using 'parallel' if cache=True

PR 2450: Add test for keyword arg on .sum-like and .cumsum-like array methods

0.33.0

This release resolved several performance issues caused by atomic reference counting operations inside loop bodies. New optimization passes have been added to reduce the impact of these operations. We observe speed improvements between 2x-10x in affected programs due to the removal of unnecessary reference counting operations.

There are also several enhancements to the CUDA GPU support:

A GPU random number generator based on the xoroshiro128+ algorithm (http://xoroshiro.di.unimi.it/) is added. See details and examples in the documentation (cuda-random).

cuda.jit CUDA kernels can now call jit and njit CPU functions and they will automatically be compiled as CUDA device functions.

The CUDA IPC memory API is exposed for sharing memory between processes. See usage details in the documentation (cuda-ipc-memory).

Reference counting enhancements:

PR 2346, Issue 2345, 2248: Add extra refcount pruning after inlining

PR 2349: Fix refct pruning not removing refct op with tail call.

PR 2352, Issue 2350: Add refcount pruning pass for function that does not need refcount

CUDA support enhancements:

PR 2023: Supports CUDA IPC for device array

PR 2343, Issue 2335: Allow CPU jit decorated function to be used as cuda device function

PR 2347: Add random number generator support for CUDA device code

PR 2361: Update autotune table for CC: 5.3, 6.0, 6.1, 6.2

Misc fixes:

PR 2362: Avoid test failure due to typing to int32 on 32-bit platforms

PR 2359: Fixed nogil example that threw a TypeError when executed.

PR 2357, Issue 2356: Fix fragile test that depends on how the script is executed.

PR 2355: Fix cpu dispatcher referenced as attribute of another module

PR 2354: Fixes an issue with caching when function needs NRT and refcount pruning

PR 2342, Issue 2339: Add warnings to inspection when it is used on unserialized cached code

PR 2329, Issue 2250: Better handling of missing op codes

Misc enhancements:

PR 2360: Adds missing values in error message interp.

PR 2353: Handle when get_host_cpu_features() raises RuntimeError

PR 2351: Enable SVML for erf/erfc/gamma/lgamma/log2

PR 2344: Expose error_model setting in jit decorator

PR 2337: Align blocking terminate support for fork() with new TBB version

PR 2336: Bump llvmlite version to 0.18

PR 2330: Core changes in PR 2318

0.32.0

In this release, we are upgrading to LLVM 4.0. A lot of work has been done to fix many race-condition issues inside LLVM when the compiler is used concurrently, which is likely when Numba is used with Dask.

Improvements:

PR 2322: Suppress test error due to unknown but consistent error with tgamma

PR 2320: Update llvmlite dependency to 0.17

PR 2308: Add details to error message on why cuda support is disabled.

PR 2302: Add os x to travis

PR 2294: Disable remove_module on MCJIT due to memory leak inside LLVM

PR 2291: Split parallel tests and recycle workers to tame memory usage

PR 2253: Remove the pointer-stuffing hack for storing meminfos in lists

Fixes:

PR 2331: Fix a bug in the GPU array indexing

PR 2326: Fix 2321 docs referring to non-existing function.

PR 2316: Fixing more race-condition problems

PR 2315: Fix 2314. Relax strict type check to allow optional type.

PR 2310: Fix race condition due to concurrent compilation and cache loading

PR 2304: Fix intrinsic 1st arg not a typing.Context as stated by the docs.

PR 2287: Fix int64 atomic min-max

PR 2286: Fix 2285 overload_method not linking dependent libs

PR 2303: Missing import statements to interval-example.rst

0.31.0

In this release, we added preliminary support for debugging with GDB version >= 7.0. The feature is enabled by setting the debug=True compiler option, which causes GDB-compatible debug info to be generated. The CUDA backend also gained limited debugging support so that source locations are shown in memory-checking and profiling tools. For details, see the numba-troubleshooting section of the documentation.

Also, we added the fastmath=True compiler option to enable unsafe floating-point transformations, which allows LLVM to auto-vectorize more code.

Other important changes include upgrading to LLVM 3.9.1 and adding support for Numpy 1.12.
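
For orientation, the fastmath=True option mentioned above is passed to the jit decorator like any other compiler option. The sketch below is an assumption-laden example: the function `sum_of_roots` is hypothetical, and the `ImportError` fallback exists only so the snippet runs without Numba. Because fastmath permits reordering of floating-point operations, results should be compared with a tolerance rather than exactly.

```python
import math

try:
    from numba import njit
except ImportError:
    # Assumed fallback for illustration: without Numba, run as plain Python.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit(fastmath=True)
def sum_of_roots(n):
    # With fastmath, LLVM may reassociate these additions to vectorize the loop.
    acc = 0.0
    for i in range(n):
        acc += math.sqrt(float(i))
    return acc


# sqrt(0) + sqrt(1) + sqrt(2) + sqrt(3) + sqrt(4)
expected = 0.0 + 1.0 + math.sqrt(2.0) + math.sqrt(3.0) + 2.0
value = sum_of_roots(5)
```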

Improvements:

PR 2281: Update for numpy1.12

PR 2278: Add CUDA atomic.{max, min, compare_and_swap}

PR 2277: Add about section to conda recipes to identify license and other metadata in Anaconda Cloud

PR 2271: Adopt itanium C++-style mangling for CPU and CUDA targets

PR 2267: Add fastmath flags

PR 2261: Support dtype.type

PR 2249: Changes for llvm3.9

PR 2234: Bump llvmlite requirement to 0.16 and add install_name_tool_fixer to mviewbuf for OS X

PR 2230: Add python3.6 to TravisCi

PR 2227: Enable caching for gufunc wrapper

PR 2170: Add debugging support

PR 2037: inspect_cfg() for easier visualization of the function operation

Fixes:

PR 2274: Fix nvvm ir patch in mishandling "load"

PR 2272: Fix breakage to cuda7.5

PR 2269: Fix caching of copy_strides kernel in cuda.reduce

PR 2265: Fix 2263: error when linking two modules with dynamic globals

PR 2252: Fix path separator in test

PR 2246: Fix overuse of memory in some system with fork

PR 2241: Fix 2240: module in dynamically created function not a str

PR 2239: Fix fingerprint computation failure preventing fallback

0.30.1

This is a bug-fix release to enable Python 3.6 support. In addition, there is now early Intel TBB support for parallel ufuncs when building from source with TBBROOT defined. The TBB feature is not enabled in our official builds.

Fixes:

PR 2232: Fix name clashes with _Py_hashtable_xxx in Python 3.6.

Improvements:

PR 2217: Add Intel TBB threadpool implementation for parallel ufunc.

0.30.0

This release adds preliminary support for Python 3.6, but no official build is available yet. A new system reporting tool (numba --sysinfo) is added to provide system information to help core developers in replication and debugging. See below for other improvements and bug fixes.

Improvements:

PR 2209: Support Python 3.6.

PR 2175: Support np.trace(), np.outer() and np.kron().

PR 2197: Support np.nanprod().

PR 2190: Support caching for ufunc.

PR 2186: Add system reporting tool.

Fixes:

PR 2214, Issue 2212: Fix memory error with ndenumerate and flat iterators.

PR 2206, Issue 2163: Fix zip() consuming extra elements in early exhaustion.

PR 2185, Issue 2159, 2169: Fix rewrite pass affecting objmode fallback.

PR 2204, Issue 2178: Fix annotation for liftedloop.

PR 2203: Fix Appveyor segfault with Python 3.5.

PR 2202, Issue 2198: Fix target context not initialized when loading from ufunc cache.

PR 2172, Issue 2171: Fix optional type unpacking.

PR 2189, Issue 2188: Disable freezing of big (>1MB) global arrays.

PR 2180, Issue 2179: Fix invalid variable version in looplifting.

PR 2156, Issue 2155: Fix divmod, floordiv segfault on CUDA.

0.29.0

This release extends the support of recursive functions to include direct and indirect recursion without explicit function type annotations. See new example in examples/mergesort.py. Newly supported numpy features include array stacking functions, np.linalg.eig* functions, np.linalg.matrix_power, np.roots and array to array broadcasting in assignments.

This release depends on llvmlite 0.14.0 and supports CUDA 8 but it is not required.

Improvements:

PR 2130, 2137: Add type-inferred recursion with docs and examples.

PR 2134: Add np.linalg.matrix_power.

PR 2125: Add np.roots.

PR 2129: Add np.linalg.{eigvals,eigh,eigvalsh}.

PR 2126: Add array-to-array broadcasting.

PR 2069: Add hstack and related functions.

PR 2128: Allow for vectorizing a jitted function. (thanks to dhirschfeld)

PR 2117: Update examples and make them test-able.

PR 2127: Refactor interpreter class and its results.

Fixes:

PR 2149: Workaround MSVC9.0 SP1 fmod bug kb982107.

PR 2145, Issue 2009: Fixes kwargs for jitclass __init__ method.

PR 2150: Fix slowdown in objmode fallback.

PR 2050, Issue 1259: Fix liveness problem with some generator loops.

PR 2072, Issue 1995: Right shift of unsigned LHS should be logical.

PR 2115, Issue 1466: Fix inspect_types() error due to mangled variable name.

PR 2119, Issue 2118: Fix array type created from record-dtype.

PR 2122, Issue 1808: Fix returning a generator due to datamodel error.
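
The right-shift fix listed above (PR 2072) rests on the difference between logical and arithmetic shifts. The pure-Python sketch below models both on 64-bit patterns to show why the distinction matters for unsigned values. All names are hypothetical and this is not Numba's code.

```python
MASK64 = (1 << 64) - 1


def to_signed64(bits):
    # Reinterpret a 64-bit pattern as a two's-complement signed integer.
    bits &= MASK64
    return bits - (1 << 64) if bits >> 63 else bits


def arithmetic_shift_right(bits, n):
    # Signed shift: the sign bit is copied into the vacated high bits.
    return (to_signed64(bits) >> n) & MASK64


def logical_shift_right(bits, n):
    # Unsigned shift: the vacated high bits are filled with zeros.
    return (bits & MASK64) >> n


pattern = 0x8000_0000_0000_0000  # only the top bit is set
```

For an unsigned left-hand side the top bit is part of the magnitude, so only the logical shift divides the value by a power of two.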

0.28.1

This is a bug-fix release to resolve packaging issues with setuptools dependency.

0.28.0

Amongst other improvements, this version again improves the level of support for linear algebra functions from the numpy.linalg module. Also, our random generator is now guaranteed to be thread-safe and fork-safe.

Improvements:

PR 2019: Add the intrinsic decorator to define low-level subroutines callable from JIT functions (this is considered a private API for now).

PR 2059: Implement np.concatenate and np.stack.

PR 2048: Make random generation fork-safe and thread-safe, producing independent streams of random numbers for each thread or process.

PR 2031: Add documentation of floating-point pitfalls.

Issue 2053: Avoid polling in parallel CPU target (fixes severe performance regression on Windows).

Issue 2029: Make default arguments fast.

PR 2052: Add logging to the CUDA driver.

PR 2049: Implement the built-in divmod() function.

PR 2036: Implement the argsort() method on arrays.

PR 2046: Improving CUDA memory management by deferring deallocations until certain thresholds are reached, so as to avoid breaking asynchronous execution.

PR 2040: Switch the CUDA driver implementation to use CUDA's "primary context" API.

PR 2017: Allow min(tuple) and max(tuple).

PR 2039: Reduce fork() detection overhead in CUDA.

PR 2021: Handle structured dtypes with titles.

PR 1996: Rewrite looplifting as a transformation on Numba IR.

PR 2014: Implement np.linalg.matrix_rank.

PR 2012: Implement np.linalg.cond.

PR 1985: Rewrite even trivial array expressions, which opens the door for other optimizations (for example, array ** 2 can be converted into array * array).

PR 1950: Have typeof() always raise ValueError on failure. Previously, it would either raise or return None, depending on the input.

PR 1994: Implement np.linalg.norm.

PR 1987: Implement np.linalg.det and np.linalg.slogdet.

Issue 1979: Document integer width inference and how to work around it.

PR 1938: Numba is now compatible with LLVM 3.8.

PR 1967: Restrict np.linalg functions to homogenous dtypes. Users wanting to pass mixed-typed inputs have to convert explicitly, which makes the performance implications more obvious.

Fixes:

PR 2006: array(float32) ** int should return array(float32).

PR 2044: Allow reshaping empty arrays.

Issue 2051: Fix refcounting issue when concatenating tuples.

Issue 2000: Make Numpy optional for setup.py, to allow pip install to work without Numpy pre-installed.

PR 1989: Fix assertion in Dispatcher.disable_compile().

Issue 2028: Ignore filesystem errors when caching from multiple processes.

Issue 2003: Allow unicode variable and function names (on Python 3).

Issue 1998: Fix deadlock in parallel ufuncs that reacquire the GIL.

PR 1997: Fix random crashes when AOT compiling on certain Windows platforms.

Issue 1988: Propagate jitclass docstring.

Issue 1933: Ensure array constants are emitted with the right alignment.

0.27.0

Improvements:

Issue 1976: improve error message when non-integral dimensions are given to a CUDA kernel.

PR 1970: Optimize the power operator with a static exponent.

PR 1710: Improve contextual information for compiler errors.

PR 1961: Support printing constant strings.

PR 1959: Support more types in the print() function.

PR 1823: Support compute_50 in CUDA backend.

PR 1955: Support np.linalg.pinv.

PR 1896: Improve the SmartArray API.

PR 1947: Support np.linalg.solve.

Issue 1943: Improve error message when an argument fails typing.

PR 1927: Support np.linalg.lstsq.

PR 1934: Use system functions for hypot() where possible, instead of our own implementation.

PR 1929: Add cffi support to cfunc objects.

PR 1932: Add user-controllable thread pool limits for parallel CPU target.

PR 1928: Support self-recursion when the signature is explicit.

PR 1890: List all lowering implementations in the developer docs.

Issue 1884: Support np.lib.stride_tricks.as_strided().

Fixes:

Issue 1960: Fix sliced assignment when source and destination areas are overlapping.

PR 1963: Make CUDA print() atomic.

PR 1956: Allow 0d array constants.

Issue 1945: Allow using Numpy ufuncs in AOT compiled code.

Issue 1916: Fix documentation example for generated_jit.

Issue 1926: Fix regression when caching functions in an IPython session.

Issue 1923: Allow non-intp integer arguments to carray() and farray().

Issue 1908: Accept non-ASCII unicode docstrings on Python 2.

Issue 1874: Allow del container[key] in object mode.

Issue 1913: Fix set insertion bug when the lookup chain contains deleted entries.

Issue 1911: Allow function annotations on jitclass methods.
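
The sliced-assignment fix above (Issue 1960) belongs to a well-known class of bugs: copying element by element within one buffer goes wrong when source and destination overlap. The sketch below reproduces the general problem in plain Python. The function names are hypothetical and it does not reflect how Numba implements the fix.

```python
def naive_forward_copy(buf, src, dst, n):
    # Copies in place, front to back; overwrites source elements
    # before they are read when dst lies inside the source range.
    for i in range(n):
        buf[dst + i] = buf[src + i]
    return buf


def overlap_safe_copy(buf, src, dst, n):
    # Snapshot the source first so later writes cannot corrupt it.
    tmp = buf[src:src + n]
    buf[dst:dst + n] = tmp
    return buf


broken = naive_forward_copy([0, 1, 2, 3, 4], src=0, dst=1, n=4)
correct = overlap_safe_copy([0, 1, 2, 3, 4], src=0, dst=1, n=4)
```

A temporary copy is the simplest remedy; copying backwards when the destination follows the source avoids the extra allocation.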

0.26.0

This release adds support for the cfunc decorator for exporting Numba jitted functions to 3rd party APIs that take C callbacks. Most of the overhead of using jitclasses inside the interpreter is eliminated. Support for decompositions in numpy.linalg is added. Finally, Numpy 1.11 is supported.

Improvements:

PR 1889: Export BLAS and LAPACK wrappers for pycc.

PR 1888: Faster array power.

Issue 1867: Allow "out" keyword arg for dufuncs.

PR 1871: carray() and farray() for creating arrays from pointers.

PR 1855: cfunc decorator for exporting as ctypes function.

PR 1862: Add support for numpy.linalg.qr.

PR 1851: jitclass support for '_' and '__' prefixed attributes.

PR 1842: Optimize jitclass in Python interpreter.

Issue 1837: Fix CUDA simulator issues with device function.

PR 1839: Add support for decompositions from numpy.linalg.

PR 1829: Support Python enums.

PR 1828: Add support for numpy.random.rand() and numpy.random.randn()

Issue 1825: Use of 0-darray in place of scalar index.

Issue 1824: Scalar arguments to object mode gufuncs.

Issue 1813: Let bitwise bool operators return booleans, not integers.

Issue 1760: Optional arguments in generators.

PR 1780: Numpy 1.11 support.

0.25.0

This release adds support for set objects in nopython mode. It also adds support for many missing Numpy features and functions. It improves Numba's compatibility and performance when using a distributed execution framework such as dask, distributed or Spark. Finally, it removes compatibility with Python 2.6, Python 3.3 and Numpy 1.6.

Improvements:

Issue 1800: Add erf(), erfc(), gamma() and lgamma() to CUDA targets.

PR 1793: Implement more Numpy functions: np.bincount(), np.diff(), np.digitize(), np.histogram(), np.searchsorted() as well as NaN-aware reduction functions (np.nansum(), np.nanmedian(), etc.)

PR 1789: Optimize some reduction functions such as np.sum(), np.prod(), np.median(), etc.

PR 1752: Make CUDA features work in dask, distributed and Spark.

PR 1787: Support np.nditer() for fast multi-array indexing with broadcasting.

PR 1799: Report JIT-compiled functions as regular Python functions when profiling (allowing you to see the filename and line number where a function is defined).

PR 1782: Support np.any() and np.all().

Issue 1788: Support the iter() and next() built-in functions.

PR 1778: Support array.astype().

Issue 1775: Allow the user to set the target CPU model for AOT compilation.

PR 1758: Support creating random arrays using the size parameter to the np.random APIs.

PR 1757: Support len() on array.flat objects.

PR 1749: Remove Numpy 1.6 compatibility.

PR 1748: Remove Python 2.6 and 3.3 compatibility.

PR 1735: Support the not in operator as well as operator.contains().

PR 1724: Support homogenous sets in nopython mode.

Issue 875: make compilation of array constants faster.

Fixes:

PR 1795: Fix a massive performance issue when calling Numba functions with distributed, Spark or a similar mechanism using serialization.

Issue 1784: Make jitclasses usable with NUMBA_DISABLE_JIT=1.

Issue 1786: Allow using linear algebra functions when profiling.

Issue 1796: Fix np.dot() memory leak on non-contiguous inputs.

PR 1792: Fix static negative indexing of tuples.

Issue 1771: Use fallback cache directory when pycache isn't writable, such as when user code is installed in a system location.

Issue 1223: Use Numpy error model in array expressions (e.g. division by zero returns inf or nan instead of raising an error).

Issue 1640: Fix np.random.binomial() for large n values.

Issue 1643: Improve error reporting when passing an invalid spec to jitclass.

PR 1756: Fix slicing with a negative step and an omitted start.
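
The error-model fix above (Issue 1223) contrasts two behaviours for division by zero: Python raises an exception, while the Numpy model returns inf or nan. The sketch below imitates that contrast for scalar floats using only the standard library. Both function names are hypothetical and the code is an illustration of the semantics, not Numba's implementation.

```python
import math


def python_model_divide(a, b):
    # Plain Python semantics: raises ZeroDivisionError when b == 0.
    return a / b


def numpy_model_divide(a, b):
    # IEEE 754 style semantics: never raise, return inf or nan instead.
    try:
        return a / b
    except ZeroDivisionError:
        if a == 0 or math.isnan(a):
            return math.nan  # 0/0 and nan/0 are undefined
        # The sign of the result follows the signs of both operands.
        return math.copysign(math.inf, a) * math.copysign(1.0, b)


positive = numpy_model_divide(1.0, 0.0)
negative = numpy_model_divide(-1.0, 0.0)
undefined = numpy_model_divide(0.0, 0.0)
```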

0.24.0

This release introduces several major changes, including the generated_jit decorator for flexible specializations (as with Julia's "generated" macro) and the SmartArray array wrapper type, which allows seamless transfer of array data between the CPU and the GPU.

This will be the last version to support Python 2.6, Python 3.3 and Numpy 1.6.

Improvements:

PR 1723: Improve compatibility of JIT functions with the Python profiler.

PR 1509: Support array.ravel() and array.flatten().

PR 1676: Add SmartArray type to support transparent data management in multiple address spaces (host & GPU).

PR 1689: Reduce startup overhead of importing Numba.

PR 1705: Support registration of CFFI types as corresponding to known Numba types.

PR 1686: Document the extension API.

PR 1698: Improve warnings raised during type inference.

PR 1697: Support np.dot() and friends on non-contiguous arrays.

PR 1692: cffi.from_buffer() improvements (allow more pointer types, allow non-Numpy buffer objects).

PR 1648: Add the generated_jit decorator.

PR 1651: Implementation of np.linalg.inv using LAPACK. Thanks to Matthieu Dartiailh.

PR 1674: Support np.diag().

PR 1673: Improve error message when looking up an attribute on an unknown global.

Issue 1569: Implement runtime check for the LLVM locale bug.

PR 1612: Switch to LLVM 3.7 in sync with llvmlite.

PR 1624: Allow slice assignment of sequence to array.

PR 1622: Support slicing tuples with a constant slice.

Fixes:

Issue 1722: Fix returning an optional boolean (bool or None).

Issue 1734: NRT decref bug when variable is del'ed before being defined, leading to a possible memory leak.

PR 1732: Fix tuple getitem regression for CUDA target.

PR 1718: Mishandling of optional to optional casting.

PR 1714: Fix .compile() on a JIT function not respecting ._can_compile.

Issue 1667: Fix np.angle() on arrays.

Issue 1690: Fix slicing with an omitted stop and a negative step value.

PR 1693: Fix gufunc bug in handling scalar formal arg with non-scalar input value.

PR 1683: Fix parallel testing under Windows.

Issue 1616: Use system-provided versions of C99 math where possible.

Issue 1652: Reductions of bool arrays (e.g. sum() or mean()) should return integers or floats, not bools.

Issue 1664: Fix regression when indexing a record array with a constant index.

PR 1661: Disable AVX on old Linux kernels.

Issue 1636: Allow raising an exception looked up on a module.

0.23.1

This is a bug-fix release to address several regressions introduced in the 0.23.0 release, and a couple other issues.

Fixes:

Issue 1645: CUDA ufuncs were broken in 0.23.0.

Issue 1638: Check tuple sizes when passing a list of tuples.

Issue 1630: Parallel ufunc would keep eating CPU even after finishing under Windows.

Issue 1628: Fix ctypes and cffi tests under Windows with Python 3.5.

Issue 1627: Fix xrange() support.

PR 1611: Rewrite variable liveness analysis.

Issue 1610: Allow nested calls between explicitly-typed ufuncs.

Issue 1593: Fix *args in object mode.

0.23.0

This release introduces JIT classes using the new jitclass decorator, allowing user-defined structures for nopython mode. Other improvements and bug fixes are listed below.

Improvements:

PR 1609: Speed up some simple math functions by inlining them in their caller

PR 1571: Implement JIT classes

PR 1584: Improve typing of array indexing

PR 1583: Allow printing booleans

PR 1542: Allow negative values in np.reshape()

PR 1560: Support vector and matrix dot product, including np.dot() and the @ operator in Python 3.5

PR 1546: Support field lookup on record arrays and scalars (i.e. array['field'] in addition to array.field)

PR 1440: Support the HSA wavebarrier() and activelanepermute_wavewidth() intrinsics

PR 1540: Support np.angle()

PR 1543: Implement CPU multithreaded gufuncs (target="parallel")

PR 1551: Allow scalar arguments in np.where(), np.empty_like().

PR 1516: Add some more examples from NumbaPro

PR 1517: Support np.sinc()

Fixes:

Issue 1603: Fix calling a non-cached function from a cached function

Issue 1594: Ensure a list is homogeneous when unboxing

Issue 1595: Replace deprecated use of get_pointer_to_function()

Issue 1586: Allow tests to be run by different users on the same machine

Issue 1587: Make CudaAPIError picklable

Issue 1568: Fix using Numba from inside Visual Studio 2015

Issue 1559: Fix serializing a jit function referring a renamed module

PR 1508: Let reshape() accept integer argument(s), not just a tuple

Issue 1545: Improve error checking when unboxing list objects

Issue 1538: Fix array broadcasting in CUDA gufuncs

Issue 1526: Fix a reference count handling bug

0.22.1

This is a bug-fix release to resolve some packaging issues and other problems found in the 0.22.0 release.

Fixes:

PR 1515: Include MANIFEST.in in MANIFEST.in so that sdist still works from source tar files.

PR 1518: Fix reference counting bug caused by hidden alias

PR 1519: Fix erroneous assert when passing nopython=True to guvectorize.

PR 1521: Fix cuda.test()

0.22.0

This release features several highlights: Python 3.5 support, Numpy 1.10 support, Ahead-of-Time compilation of extension modules, additional vectorization features that were previously only available with the proprietary extension NumbaPro, improvements in array indexing.

Improvements:

PR 1497: Allow scalar input type instead of size-1 array to guvectorize

PR 1480: Add distutils support for AOT compilation

PR 1460: Create a new API for Ahead-of-Time (AOT) compilation

PR 1451: Allow passing Python lists to JIT-compiled functions, and reflect mutations on function return

PR 1387: Numpy 1.10 support

PR 1464: Support cffi.FFI.from_buffer()

PR 1437: Propagate errors raised from Numba-compiled ufuncs; also, let "division by zero" and other math errors produce a warning instead of exiting the function early

PR 1445: Support a subset of fancy indexing

PR 1454: Support "out-of-line" CFFI modules

PR 1442: Improve array indexing to support more kinds of basic slicing

PR 1409: Support explicit CUDA memory fences

PR 1435: Add support for vectorize() and guvectorize() with HSA

PR 1432: Implement numpy.nonzero() and numpy.where()

PR 1416: Add support for vectorize() and guvectorize() with CUDA, as originally provided in NumbaPro

PR 1424: Support in-place array operators

PR 1414: Python 3.5 support

PR 1404: Add the parallel ufunc functionality originally provided in NumbaPro

PR 1393: Implement sorting on arrays and lists

PR 1415: Add functions to estimate the occupancy of a CUDA kernel

PR 1360: The JIT cache now stores the compiled object code, yielding even larger speedups.

PR 1402: Fixes for the ARMv7 (armv7l) architecture under Linux

PR 1400: Add the cuda.reduce() decorator originally provided in NumbaPro

Fixes:

PR 1483: Allow np.empty_like() and friends on non-contiguous arrays

Issue 1471: Allow caching JIT functions defined in IPython

PR 1457: Fix flat indexing of boolean arrays

PR 1421: Allow calling Numpy ufuncs, without an explicit output, on non-contiguous arrays

Issue 1411: Fix crash when unpacking a tuple containing a Numba-allocated array

Issue 1394: Allow unifying range_state32 and range_state64

Issue 1373: Fix code generation error on lists of bools

0.21.0

This release introduces support for AMD's Heterogeneous System Architecture, which allows memory to be shared directly between the CPU and the GPU. Other major enhancements are support for lists and the introduction of an opt-in compilation cache.

Improvements:

PR 1391: Implement print() for CUDA code

PR 1366: Implement integer typing enhancement proposal (NBEP 1)

PR 1380: Support the one-argument type() builtin

PR 1375: Allow boolean evaluation of lists and tuples

PR 1371: Support array.view() in CUDA mode

PR 1369: Support named tuples in nopython mode

PR 1250: Implement numpy.median().

PR 1289: Make dispatching faster when calling a JIT-compiled function from regular Python

Issue 1226: Improve performance of integer power

PR 1321: Document features supported with CUDA

PR 1345: HSA support

PR 1343: Support lists in nopython mode

PR 1356: Make Numba-allocated memory visible to tracemalloc

PR 1363: Add an environment variable NUMBA_DEBUG_TYPEINFER

PR 1051: Add an opt-in, per-function compilation cache

Fixes:

Issue 1372: Some array expressions would fail rewriting when they involved the same variable more than once, or a unary operator

Issue 1385: Allow CUDA local arrays to be declared anywhere in a function

Issue 1285: Support datetime64 and timedelta64 in Numpy reduction functions

Issue 1332: Handle the EXTENDED_ARG opcode.

PR 1329: Handle the in operator in object mode

Issue 1322: Fix augmented slice assignment on Python 2

PR 1357: Fix slicing with some negative bounds or step values.

0.20.0

This release updates Numba to use LLVM 3.6 and CUDA 7 for CUDA support. Following the platform deprecation in CUDA 7, Numba's CUDA feature is no longer supported on 32-bit platforms. The oldest supported version of Windows is Windows 7.

Improvements:

Issue 1203: Support indexing ndarray.flat

PR 1200: Migrate cgutils to llvmlite

PR 1190: Support more array methods: .transpose(), .T, .copy(), .reshape(), .view()

PR 1214: Simplify setup.py and avoid manual maintenance

PR 1217: Support datetime64 and timedelta64 constants

PR 1236: Reload environment variables when compiling

PR 1225: Various speed improvements in generated code

PR 1252: Support cmath module in CUDA

PR 1238: Use 32-byte aligned allocator to optimize for AVX

PR 1258: Support numpy.frombuffer()

PR 1274: Use TravisCI container infrastructure for lower wait time

PR 1279: Micro-optimize overload resolution in call dispatch

Issue 1248: Improve error message when return type unification fails

Fixes:

Issue 1131: Handling of negative zeros in np.conjugate() and np.arccos()

Issue 1188: Fix slow array return

Issue 1164: Avoid warnings from CUDA context at shutdown

Issue 1229: Respect the writeable flag in arrays

Issue 1244: Fix bug in refcount pruning pass

Issue 1251: Fix partial left-indexing of Fortran contiguous array

Issue 1264: Fix compilation error in array expression

Issue 1254: Fix error when yielding array objects

Issue 1276: Fix nested generator use

0.19.2

This release fixes the source distribution on pypi. The only change is in the setup.py file. We do not plan to provide a conda package as this release is essentially the same as 0.19.1 for conda users.

0.19.1

Issue 1196:

fix double-free segfault due to redundant variable deletion in the Numba IR (1195)

fix use-after-delete in array expression rewrite pass

0.19.0

This version introduces memory management in the Numba runtime, which makes it possible to allocate new arrays inside Numba-compiled functions. There is also a rework of the ufunc infrastructure, and an optimization pass to collapse cascading array operations into a single efficient loop.

Warning: Support for Windows XP and Vista with all compiler targets and support for 32-bit platforms (Win/Mac/Linux) with the CUDA compiler target are deprecated. In the next release of Numba, the oldest version of Windows supported will be Windows 7. CPU compilation will remain supported on 32-bit Linux and Windows platforms.

Known issues:

There are some performance regressions in very short running nopython functions due to the additional overhead incurred by memory management. We will work to reduce this overhead in future releases.

Features:

Issue 1181: Add a Frequently Asked Questions section to the documentation.

Issue 1162: Support the cumsum() and cumprod() methods on Numpy arrays.

Issue 1152: Support the *args argument-passing style.

Issue 1147: Allow passing character sequences as arguments to JIT-compiled functions.

Issue 1110: Shortcut deforestation and loop fusion for array expressions.

Issue 1136: Support various Numpy array constructors, for example numpy.zeros() and numpy.zeros_like().

Issue 1127: Add a CUDA simulator running on the CPU, enabled with the NUMBA_ENABLE_CUDASIM environment variable.

Issue 1086: Allow calling standard Numpy ufuncs without an explicit output array from nopython functions.

Issue 1113: Support keyword arguments when calling numpy.empty() and related functions.

Issue 1108: Support the ctypes.data attribute of Numpy arrays.

Issue 1077: Memory management for array allocations in nopython mode.

Issue 1105: Support calling a ctypes function that takes ctypes.py_object parameters.

Issue 1084: Environment variable NUMBA_DISABLE_JIT disables compilation of jit functions, instead calling into the Python interpreter when called. This allows easier debugging of multiple jitted functions.

Issue 927: Allow gufuncs with no output array.

Issue 1097: Support comparisons between tuples.

Issue 1075: Numba-generated ufuncs can now be called from nopython functions.

Issue 1062: vectorize now allows omitting the signatures, and will compile the required specializations on the fly (like jit does).

Issue 1027: Support numpy.round().

Issue 1085: Allow returning a character sequence (as fetched from a structured array) from a JIT-compiled function.
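The NUMBA_DISABLE_JIT variable from Issue 1084 in the list above could be used along the following lines. This is a minimal sketch: the function triangular is hypothetical, and setting the variable before Numba is first imported is an assumption made here to be on the safe side. A no-op decorator stands in for jit when Numba is not installed.

```python
import os

# Set the variable before Numba is imported so the setting is picked up.
os.environ["NUMBA_DISABLE_JIT"] = "1"

try:
    from numba import jit
except ImportError:
    # Numba is not installed: use a no-op decorator so the sketch runs.
    def jit(*args, **kwargs):
        def decorate(func):
            return func
        return decorate


@jit(nopython=True)
def triangular(n):  # hypothetical example function
    total = 0
    for i in range(n + 1):
        total += i
    return total


# With JIT disabled, the call runs in the Python interpreter, so ordinary
# debugging tools such as pdb or print() can be used inside the function.
print(triangular(4))
```

In practice the variable is more often set in the shell when launching the program, which avoids touching the source at all.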

Fixes:

Issue 1170: Ensure ndindex(), ndenumerate() and ndarray.flat work properly inside generators.

Issue 1151: Disallow unpacking of tuples with the wrong size.

Issue 1141: Specify install dependencies in setup.py.

Issue 1106: Loop-lifting would fail when the lifted loop does not produce any output values for the function tail.

Issue 1103: Fix mishandling of some inputs when a JIT-compiled function is called with multiple array layouts.

Issue 1089: Fix range() with large unsigned integers.

Issue 1088: Install entry-point scripts (numba, pycc) from the conda build recipe.

Issue 1081: Constant structured scalars now work properly.

Issue 1080: Fix automatic promotion of booleans to integers.

0.18.2

Bug fixes:

Issue 1073: Fixes missing template file for HTML annotation

Issue 1074: Fixes CUDA support on Windows machine due to NVVM API mismatch

0.18.1

Version 0.18.0 is not officially released.

This version removes the old deprecated and undocumented argtypes and restype arguments to the jit decorator. Function signatures should always be passed as the first argument to jit.

Features:

Issue 960: Add inspect_llvm() and inspect_asm() methods to JIT-compiled functions: they output the LLVM IR and the native assembler source of the compiled function, respectively.

Issue 990: Allow passing tuples as arguments to JIT-compiled functions in nopython mode.

Issue 774: Support two-argument round() in nopython mode.

Issue 987: Support missing functions from the math module in nopython mode: frexp(), ldexp(), gamma(), lgamma(), erf(), erfc().

Issue 995: Improve code generation for round() on Python 3.

Issue 981: Support functions from the random and numpy.random modules in nopython mode.

Issue 979: Add cuda.atomic.max().

Issue 1006: Improve exception raising and reporting. It is now allowed to raise an exception with an error message in nopython mode.

Issue 821: Allow ctypes- and cffi-defined functions as arguments to nopython functions.

Issue 901: Allow multiple explicit signatures with jit. The signatures must be passed in a list, as with vectorize.

Issue 884: Better error message when a JIT-compiled function is called with the wrong types.

Issue 1010: Simpler and faster CUDA argument marshalling thanks to a refactoring of the data model.

Issue 1018: Support arrays of scalars inside Numpy structured types.

Issue 808: Reduce Numba import time by half.

Issue 1021: Support the buffer protocol in nopython mode. Buffer-providing objects, such as bytearray, array.array or memoryview support array-like operations such as indexing and iterating. Furthermore, some standard attributes on the memoryview object are supported.

Issue 1030: Support nested arrays in Numpy structured arrays.

Issue 1033: Implement the inspect_types(), inspect_llvm() and inspect_asm() methods for CUDA kernels.

Issue 1029: Support Numpy structured arrays with CUDA as well.

Issue 1034: Support for generators in nopython and object mode.

Issue 1044: Support default argument values when calling Numba-compiled functions.

Issue 1048: Allow calling Numpy scalar constructors from CUDA functions.

Issue 1047: Allow indexing a multi-dimensional array with a single integer, to take a view.

Issue 1050: Support len() on tuples.

Issue 1011: Revive HTML annotation.
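Issue 1006 in the list above allows raising an exception with an error message in nopython mode. A minimal sketch of what that looks like follows; the function checked_sqrt and its message are hypothetical, and a no-op decorator replaces jit when Numba is not installed.

```python
try:
    from numba import jit
except ImportError:
    # Numba is not installed: use a no-op decorator so the sketch runs.
    def jit(*args, **kwargs):
        def decorate(func):
            return func
        return decorate


@jit(nopython=True)
def checked_sqrt(x):  # hypothetical example function
    if x < 0.0:
        # The message is a constant string attached to the exception.
        raise ValueError("x must be non-negative")
    return x ** 0.5


try:
    checked_sqrt(-1.0)
except ValueError as exc:
    message = str(exc)

print(message)
```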

Fixes:

Issue 977: Assignment optimization was too aggressive.

Issue 561: One-argument round() now returns an int on Python 3.

Issue 1001: Fix an unlikely bug where two closures with the same name and id() would compile to the same LLVM function name, despite different closure values.

Issue 1006: Fix reference leak when a JIT-compiled function is disposed of.

Issue 1017: Update instructions for CUDA in the README.

Issue 1008: Generate shorter LLVM type names to avoid segfaults with CUDA.

Issue 1005: Properly clean up references when raising an exception from object mode.

Issue 1041: Fix incompatibility between Numba and the third-party library "future".

Issue 1053: Fix the size attribute of CUDA shared arrays.

0.18.0

This is a minor release that fixes several issues (263, 262, 258, 237) with the wheel build. In addition, we have minor fixes for running on PPC64LE platforms (261). And, we added CI testing against PyPy (253).

0.17.1

This is a bugfix release that addresses issue 258 that our LLVM binding shared library is missing from the wheel builds.

0.17.0

The major focus in this release has been a rewrite of the documentation. The new documentation is better structured and has more detailed coverage of Numba features and APIs. It can be found online at http://numba.pydata.org/numba-doc/dev/index.html

Features:

Issue 895: LLVM can now inline nested function calls in nopython mode.

Issue 863: CUDA kernels can now infer the types of their arguments ("autojit"-like).

Issue 833: Support numpy.{min,max,argmin,argmax,sum,mean,var,std} in nopython mode.

Issue 905: Add a nogil argument to the jit decorator, to release the GIL in nopython mode.

Issue 829: Add an identity argument to vectorize and guvectorize, to set the identity value of the ufunc.

Issue 843: Allow indexing 0-d arrays with the empty tuple.

Issue 933: Allow named arguments, not only positional arguments, when calling a Numba-compiled function.

Issue 902: Support numpy.ndenumerate() in nopython mode.

Issue 950: AVX is now enabled by default except on Sandy Bridge and Ivy Bridge CPUs, where it can produce slower code than SSE.

Issue 956: Support constant arrays of structured type.

Issue 959: Indexing arrays with floating-point numbers isn't allowed anymore.

Issue 955: Add support for 3D CUDA grids and thread blocks.

Issue 902: Support numpy.ndindex() in nopython mode.

Issue 951: Numpy number types (numpy.int8, etc.) can be used as constructors for type conversion in nopython mode.
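The nogil argument from Issue 905 in the list above is typically combined with threads. The sketch below is one way this might look; the function sum_of_squares and the thread pool size are illustrative choices, not taken from the changelog. Without Numba installed, the no-op fallback still runs the code, only without releasing the GIL.

```python
from concurrent.futures import ThreadPoolExecutor

try:
    from numba import jit
except ImportError:
    # Numba is not installed: use a no-op decorator so the sketch runs.
    def jit(*args, **kwargs):
        def decorate(func):
            return func
        return decorate


@jit(nopython=True, nogil=True)
def sum_of_squares(n):  # hypothetical example function
    total = 0
    for i in range(n):
        total += i * i
    return total


# Each call can run in its own thread; with nogil=True the compiled code
# does not hold the GIL while it executes.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(sum_of_squares, [10, 100]))

print(results)
```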

Fixes:

Issue 889: Fix NUMBA_DUMP_ASSEMBLY for the CUDA backend.

Issue 903: Fix calling of stdcall functions with ctypes under Windows.

Issue 908: Allow lazy-compiling from several threads at once.

Issue 868: Wrong error message when multiplying a scalar by a non-scalar.

Issue 917: Allow vectorizing with datetime64 and timedelta64 in the signature (only with unit-less values, though, because of a Numpy limitation).

Issue 431: Allow overloading of cuda device function.

Issue 917: Print out errors that occurred in object mode ufuncs.

Issue 923: Numba-compiled ufuncs now inherit the name and doc of the original Python function.

Issue 928: Fix boolean return value in nested calls.

Issue 915: jit called with an explicit signature with a mismatching type of arguments now raises an error.

Issue 784: Fix the truth value of NaNs.

Issue 953: Fix using shared memory in more than one function (kernel or device).

Issue 970: Fix an uncommon double to uint64 conversion bug on CentOS5 32-bit (C compiler issue).

0.16.0

This release contains a major refactor to switch from llvmpy to llvmlite (https://github.com/numba/llvmlite) as our code generation backend. The switch is necessary to reconcile different compiler requirements for LLVM 3.5 (needs C++11) and Python extensions (need specific compiler versions on Windows). As a bonus, we have found the use of llvmlite speeds up compilation by a factor of 2!

Other Major Changes:

Faster dispatch for numpy structured arrays

Optimized array.flat()

Improved CPU feature selection

Fix constant tuple regression in macro expansion code

Known Issues:

AVX code generation is still disabled by default due to performance regressions when operating on misaligned NumPy arrays. We hope to have a workaround in the future.

In extremely rare circumstances, a known issue with LLVM 3.5 code generation (http://llvm.org/bugs/show_bug.cgi?id=21423) can cause an ELF relocation error on 64-bit Linux systems.

0.15.1

(This was a bug-fix release that superseded version 0.15 before it was announced.)

Fixes:

Workaround for missing __ftol2 on Windows XP.

Do not lift loops for compilation that contain break statements.

Fix a bug in loop-lifting when multiple values need to be returned to the enclosing scope.

Handle the loop-lifting case where an accumulator needs to be updated when the loop count is zero.

0.15

Features:

Support for the Python cmath module. (NumPy complex functions were already supported.)

Support for .real, .imag, and .conjugate() on non-complex numbers.

Add support for math.isfinite() and math.copysign().

Compatibility mode: If enabled (off by default), a failure to compile in object mode will fall back to using the pure Python implementation of the function.

Experimental support for serializing JIT functions with cloudpickle.

Loop-jitting in object mode now works with loops that modify scalars that are accessed after the loop, such as accumulators.

vectorize functions can be compiled in object mode.

Numba can now be built using the Visual C++ Compiler for Python 2.7 (http://aka.ms/vcpython27) on Windows platforms.

CUDA JIT functions can be returned by factory functions with variables in the closure frozen as constants.

Support for "optional" types in nopython mode, which allow None to be a valid value.

Fixes:

If nopython mode compilation fails for any reason, automatically fall back to object mode (unless nopython=True is passed to jit) rather than raise an exception.

Allow function objects to be returned from a function compiled in object mode.

Fix a linking problem that caused slower platform math functions (such as exp()) to be used on Windows, leading to performance regressions against NumPy.

min() and max() no longer accept scalar arguments in nopython mode.

Fix handling of ambiguous type promotion among several compiled versions of a JIT function. The dispatcher will now compile a new version to resolve the problem. (issue 776)

Fix float32 to uint64 casting bug on 32-bit Linux.

Fix type inference to allow forced casting of return types.

Allow the shape of a 1D cuda.shared.array and cuda.local.array to be a one-element tuple.

More correct handling of signed zeros.

Add custom implementation of atan2() on Windows to handle special cases properly.

Eliminated race condition in the handling of the pagelocked staging area used when transferring CUDA arrays.

Fix non-deterministic type unification leading to varying performance. (issue 797)

0.15.0

Enhancements:

PR 213: Add partial LLVM bindings for ObjectFile.

PR 215: Add inline assembly helpers in the builder.

PR 216: Allow specifying alignment in alloca instructions.

PR 219: Remove unnecessary verify in module linkage.

Fixes:

PR 209, Issue 208: Fix overly restrictive test for library filenames.

0.14

Features:

Support for nearly all the Numpy math functions (including comparison, logical, bitwise and some previously missing float functions) in nopython mode.

The Numpy datetime64 and timedelta64 dtypes are supported in nopython mode with Numpy 1.7 and later.

Support for Numpy math functions on complex numbers in nopython mode.

ndarray.sum() is supported in nopython mode.

Better error messages when unsupported types are used in Numpy math functions.

Set NUMBA_WARNINGS=1 in the environment to see which functions are compiled in object mode vs. nopython mode.

Add support for the two-argument pow() builtin function in nopython mode.

New developer documentation describing how Numba works, and how to add new types.

Support for Numpy record arrays on the GPU. (Note: Improper alignment of dtype fields will cause an exception to be raised.)

Slices on GPU device arrays.

GPU objects can be used as Python context managers to select the active device in a block.

GPU device arrays can be bound to a CUDA stream. All subsequent operations (such as memory copies) will be queued on that stream instead of the default. This can prevent unnecessary synchronization with other streams.

Fixes:

Generation of AVX instructions has been disabled to avoid performance bugs when calling external math functions that may use SSE instructions, especially on OS X.

JIT functions can be removed by the garbage collector when they are no longer accessible.

Various other reference counting fixes to prevent memory leaks.

Fixed handling of exception when input argument is out of range.

Prevent autojit functions from making unsafe numeric conversions when called with different numeric types.

Fix a compilation error when an unhashable global value is accessed.

Gracefully handle failure to enable faulthandler in the IPython Notebook.

Fix a bug that caused loop lifting to fail if the loop was inside an else block.

Fixed a problem with selecting CUDA devices in multithreaded programs on Linux.

The pow() function (and ** operation) applied to two integers now returns an integer rather than a float.

Numpy arrays using the object dtype no longer cause an exception in the autojit.

Attempts to write to a global array will cause compilation to fall back to object mode, rather than attempt and fail at nopython mode.

range() works with all negative arguments (ex: range(-10, -12, -1))
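Two of the fixes above concern matching plain Python semantics: integer pow() returning an integer, and range() with all-negative arguments. The sketch below shows the behaviour being matched; the function names are hypothetical, and a no-op decorator replaces jit when Numba is not installed, in which case the code simply exercises the interpreter's own semantics.

```python
try:
    from numba import jit
except ImportError:
    # Numba is not installed: use a no-op decorator so the sketch runs.
    def jit(*args, **kwargs):
        def decorate(func):
            return func
        return decorate


@jit(nopython=True)
def int_power(base, exponent):  # hypothetical example function
    # Two integer operands give an integer result, not a float.
    return base ** exponent


@jit(nopython=True)
def count_down(start, stop):  # hypothetical example function
    # Mirrors the changelog's example, range(-10, -12, -1).
    total = 0
    for i in range(start, stop, -1):
        total += i
    return total


print(int_power(2, 10))
print(count_down(-10, -12))
```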

0.14.0

Enhancements:

PR 104: Add binding to get and view function control-flow graph.

PR 210: Improve llvmdev recipe.

PR 212: Add initializer for the native assembly parser.

0.13.4

Features:

Setting and deleting attributes in object mode

Added documentation of supported and currently unsupported numpy ufuncs

Assignment to 1-D numpy array slices

Closure variables and functions can be used in object mode

All numeric global values in modules can be used as constants in JIT compiled code

Support for the start argument in enumerate()

Inplace arithmetic operations (+=, -=, etc.)

Direct iteration over a 1D numpy array (e.g. "for x in array: ...") in nopython mode

Fixes:

Support for NVIDIA compute capability 5.0 devices (such as the GTX 750)

Vectorize no longer crashes/gives an error when bool_ is used as return type

Return the correct dictionary when globals() is used in JIT functions

Fix crash bug when creating dictionary literals in object mode

Report more informative error message on import if llvmpy is too old

Temporarily disable pycc --header, which generates incorrect function signatures.

0.13.3

Features:

Support for enumerate() and zip() in nopython mode

Increased LLVM optimization of JIT functions to -O1, enabling automatic vectorization of compiled code in some cases

Iteration over tuples and unpacking of tuples in nopython mode

Support for dict and set (Python >= 2.7) literals in object mode

Fixes:

JIT functions have the same name and doc as the original function.

Numerous improvements to better match the data types and behavior of Python math functions in JIT compiled code on different platforms.

Importing Numba will no longer throw an exception if the CUDA driver is present, but cannot be initialized.

guvectorize now properly supports functions with scalar arguments.

CUDA driver is lazily initialized

0.13.2

Features:

vectorize ufunc now can generate SIMD fast path for unit strided array

Added cuda.gridsize

Added preliminary exception handling (raise exception class)

Fixes:

UNARY_POSITIVE

Handling of closures and dynamically generated functions

Global None value

0.13.1

Features:

Initial support for CUDA array slicing

Fixes:

Indirectly fixes numbapro when the system has an incompatible CUDA driver

Fix numba.cuda.detect

Export numba.intp and numba.intc

0.13

Features:

Opensourcing NumbaPro CUDA python support in numba.cuda

Add support for ufunc array broadcasting

Add support for mixed input types for ufuncs

Add support for returning tuple from jitted function

Fixes:

Fix store slice bytecode handling for Python2

Fix inplace subtract

Fix pycc so that correct header is emitted

Allow vectorize to work on functions with jit decorator

0.13.0

Enhancements:

PR 176: Switch from LLVM 3.7 to LLVM 3.8.

PR 191: Allow setting the alignment of a global variable.

PR 198: Add missing function attributes.

PR 160: Escape the contents of metadata strings, to allow embedding any characters.

PR 162: Add support for creating debug information nodes.

PR 200: Improve the usability of metadata emission APIs.

PR 200: Allow calling functions with metadata arguments (such as llvm.dbg.declare).

Fixes:

PR 190: Suppress optimization remarks printed out in some cases by LLVM.

PR 200: Allow attaching metadata to a ret instruction.

0.12.2

Fixes:

Improved NumPy ufunc support in nopython mode

Misc bug fixes

0.12.1

This version fixed many regressions reported by users for the 0.12 release. This release contains a new loop-lifting mechanism that specializes certain loop patterns for nopython mode compilation. This avoids direct support for heap-allocating and other very dynamic operations.

Improvements:

Add loop-lifting: JIT-compiling loops in nopython mode for object mode code. This allows functions to allocate NumPy arrays and use Python objects, while the tight loops in the function can still be compiled in nopython mode. Any arrays that the tight loop uses should be created before the loop is entered.

Fixes:

Add support for majority of "math" module functions

Fix for...else handling

Add support for builtin round()

Fix ternary if...else support

Revive "numba" script

Fix problems with some boolean expressions

Add support for more NumPy ufuncs

0.12

this refactor was to simplify the code base to create a better foundation for further work. A secondary objective was to improve the worst case performance to ensure that compiled functions in object mode never run slower than pure Python code (this was a problem in several cases with the old c
