jealous / stockstats

Supply a wrapper ``StockDataFrame`` based on the ``pandas.DataFrame`` with inline stock statistics/indicators support.

Stock Statistics/Indicators Calculation Helper


VERSION: 0.6.2

Introduction

Supply a wrapper StockDataFrame for pandas.DataFrame with inline stock statistics/indicators support.

Supported statistics/indicators are:

Installation

pip install stockstats

Compatibility

The build checks compatibility with the last two major releases of Python 3 and the last release of Python 2.

License

BSD-3-Clause License

Tutorial

Initialization

StockDataFrame works as a wrapper for the pandas.DataFrame. You need to initialize the StockDataFrame with wrap or StockDataFrame.retype.

import pandas as pd
from stockstats import wrap

data = pd.read_csv('stock.csv')
df = wrap(data)

Prepare your data. This package assumes that your data is sorted by timestamp and contains certain columns. Please rename your columns to match.

Note these column names are case-insensitive. They are converted to lower case when you wrap the data frame.

By default, the date column is used as the index. Users can also specify the index column name in the wrap or retype function.

Example: DataFrame loaded from CSV.

          Date      Amount  Close   High    Low   Volume
0     20040817  90923240.0  11.20  12.21  11.03  7877900
1     20040818  52955668.0  10.29  10.90  10.29  5043200
2     20040819  32614676.0  10.53  10.65  10.30  3116800
...        ...         ...    ...    ...    ...      ...
2810  20160815  56416636.0  39.58  39.79  38.38  1436706
2811  20160816  68030472.0  39.66  40.86  39.00  1703600
2812  20160817  62536480.0  40.45  40.59  39.12  1567600

After conversion to StockDataFrame

              amount  close   high    low   volume
date
20040817  90923240.0  11.20  12.21  11.03  7877900
20040818  52955668.0  10.29  10.90  10.29  5043200
20040819  32614676.0  10.53  10.65  10.30  3116800
...              ...    ...    ...    ...      ...
20160815  56416636.0  39.58  39.79  38.38  1436706
20160816  68030472.0  39.66  40.86  39.00  1703600
20160817  62536480.0  40.45  40.59  39.12  1567600 

Use unwrap to convert it back to a pandas.DataFrame. Note that unwrap won't reset the columns and the index.

Access the Data

StockDataFrame is a subclass of pandas.DataFrame. All the functions of pandas.DataFrame should work the same as before.

Retrieve the data with symbol

You can access the statistics directly through specific column names, such as kdjk, macd, rsi.

The values of these columns are calculated the first time you access them from the data frame. Please delete those columns first if you want the lib to re-evaluate them.

Retrieve the Series

Use macd = df['macd'] or rsi = df.get('rsi') to retrieve the Series.

Retrieve the symbol with 2 arguments

Some statistics need the column name and the window size, such as delta, shift, simple moving average, etc. Use this pattern to retrieve them: <columnName>_<windowSize>_<statistics>

Examples:

Retrieve the symbol with 1 argument

Some statistics require the window size but not the column name. Use this pattern to specify your window: <statistics>_<windowSize>

Examples:

Some of them have default windows. Check their documentation for details.

Initialize all indicators with shortcuts

Some indicators, such as KDJ, BOLL, MFI, have shortcuts. Use df.init_all() to initialize all these indicators.

This operation generates lots of columns. Please use it with caution.

Statistics/Indicators

Some statistics have configurable parameters. They are class-level fields, so changing them has a global effect. Changes do not affect results that have already been calculated. Remove the existing columns so that they are re-evaluated the next time you access them.

Delta of Periods

Use the pattern <column>_<window>_d to retrieve the delta between different periods.

You can also use <column>_delta as a shortcut to <column>_-1_d

Examples:

Shift Periods

Shift the column backward or forward. It takes 2 parameters:

We fill the head and tail with the nearest data.

See the example below:

In [15]: df[['close', 'close_-1_s', 'close_2_s']]
Out[15]:
          close  close_-1_s  close_2_s
date
20040817  11.20       11.20      10.53
20040818  10.29       11.20      10.55
20040819  10.53       10.29      10.10
20040820  10.55       10.53      10.25
...         ...         ...        ...
20160812  39.10       38.70      39.66
20160815  39.58       39.10      40.45
20160816  39.66       39.58      40.45
20160817  40.45       39.66      40.45

[2813 rows x 3 columns]
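
The fill behaviour in the output above can be reproduced with a small index-clamping helper. This is a minimal sketch in plain Python, not the library's implementation; `shift_fill` is a hypothetical name, and the input is the first five close values that appear in the tables of this document.

```python
def shift_fill(values, periods):
    """Shift a sequence by `periods`, filling gaps with the nearest value.

    A negative `periods` reads from earlier rows and a positive one reads
    from later rows, mirroring the close_-1_s / close_2_s columns above.
    """
    last = len(values) - 1
    # Clamp the source index so the head and tail reuse the nearest row.
    return [values[min(max(i + periods, 0), last)] for i in range(len(values))]


close = [11.20, 10.29, 10.53, 10.55, 10.10]
print(shift_fill(close, -1))  # [11.2, 11.2, 10.29, 10.53, 10.55]
print(shift_fill(close, 2))   # [10.53, 10.55, 10.1, 10.1, 10.1]
```

The first four values of each result match the close_-1_s and close_2_s columns shown above; the trailing values differ from the table only because this sample stops after five rows.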

RSI - Relative Strength Index

RSI has a configurable window. The default window size is 14, which can be changed with set_dft_window('rsi', n).

Log Return of the Close

Logarithmic return = ln( close / last close)

From wiki:

For example, if a stock is priced at 3.570 USD per share at the close on one day, and at 3.575 USD per share at the close the next day, then the logarithmic return is: ln(3.575/3.570) = 0.0014, or 0.14%.

Use df['log-ret'] to access this column.
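
The arithmetic of the quoted example can be checked with the standard library alone. This is a sketch of the formula only; `log_return` is a hypothetical helper name and not part of the package.

```python
import math


def log_return(close, last_close):
    """Logarithmic return between two consecutive closing prices."""
    return math.log(close / last_close)


r = log_return(3.575, 3.570)
print(round(r, 4))   # 0.0014
print(f"{r:.2%}")    # 0.14%
```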

Count of Non-Zero Value

Count non-zero values of a specific range. It requires a column and a window.

Examples:

In [22]: tp = df['middle']                             

In [23]: df['res'] = df['middle'] > df['close']        

In [24]: df[['middle', 'close', 'res', 'res_10_c']]    
Out[24]:                                               
             middle  close    res  res_10_c            
date                                                   
20040817  11.480000  11.20   True       1.0            
20040818  10.493333  10.29   True       2.0            
20040819  10.493333  10.53  False       2.0            
20040820  10.486667  10.55  False       2.0            
20040823  10.163333  10.10   True       3.0            
...             ...    ...    ...       ...            
20160811  38.703333  38.70   True       5.0            
20160812  38.916667  39.10  False       5.0            
20160815  39.250000  39.58  False       4.0            
20160816  39.840000  39.66   True       5.0            
20160817  40.053333  40.45  False       5.0            

[2813 rows x 4 columns]

In [26]: df['ups'], df['downs'] = df['change'] > 0, df['change'] < 0

In [27]: df[['ups', 'ups_10_c', 'downs', 'downs_10_c']]              
Out[27]:                                                             
            ups  ups_10_c  downs  downs_10_c                         
date                                                                 
20040817  False       0.0  False         0.0                         
20040818  False       0.0   True         1.0                         
20040819   True       1.0  False         1.0                         
20040820   True       2.0  False         1.0                         
20040823  False       2.0   True         2.0                         
...         ...       ...    ...         ...                         
20160811  False       3.0   True         7.0                         
20160812   True       3.0  False         7.0                         
20160815   True       4.0  False         6.0                         
20160816   True       5.0  False         5.0                         
20160817   True       5.0  False         5.0                         

[2813 rows x 4 columns]                                              
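
The counts in the outputs above behave like a trailing-window count of truthy values, where the first rows use however many rows are available. The following plain-Python sketch reproduces the first five rows of res_10_c; `rolling_count` is a hypothetical name, and the partial-window behaviour is inferred from the table rather than taken from the library's source.

```python
def rolling_count(values, window):
    """Count truthy values in the trailing `window` rows, including the current one."""
    counts = []
    for i in range(len(values)):
        start = max(0, i - window + 1)  # partial window at the head
        counts.append(sum(1 for v in values[start:i + 1] if v))
    return counts


res = [True, True, False, False, True]  # first five rows of the 'res' column
print(rolling_count(res, 10))  # [1, 2, 2, 2, 3]
print(rolling_count(res, 2))   # [1, 2, 1, 0, 1]
```

The library output shows the counts as floats (1.0, 2.0, ...); this sketch returns integers.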

Max and Min of the Periods

Retrieve the max/min value of specified periods. They require column and window.
Note the window does NOT simply stand for the rolling window.

Examples:

RSV - Raw Stochastic Value

RSV is essential for calculating KDJ. It takes a window parameter. Use df['rsv'] or df['rsv_6'] to access it.

RSI - Relative Strength Index

RSI charts the current and historical strength or weakness of a stock. It takes a window parameter.

The default window is 14. Use set_dft_window('rsi', n) to tune it.

Examples:

Stochastic RSI

Stochastic RSI gives traders an idea of whether the current RSI value is overbought or oversold. It takes a window parameter.

The default window is 14. Use set_dft_window('stochrsi', n) to tune it.

Examples:

WT - Wave Trend

Retrieve LazyBear's Wave Trend with df['wt1'] and df['wt2'].

Wave trend uses two parameters. You can tune them with set_dft_window('wt', (10, 21)).

SMMA - Smoothed Moving Average

It requires column and window.

For example, use df['close_7_smma'] to retrieve the 7-period smoothed moving average of the close price.

ROC - Rate of Change

The Price Rate of Change (ROC) is a momentum-based technical indicator that measures the percentage change in price between the current price and the price a certain number of periods ago.

Formula:

ROC = (PriceP - PricePn) / PricePn * 100

Where:

You need a column name and a period to calculate ROC.

Examples:
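
As an illustration of the ROC formula, here is a minimal plain-Python sketch. It is not the library's code; `roc` is a hypothetical helper, and returning None for rows without enough history is an assumption made for this example.

```python
def roc(prices, n):
    """Rate of change in percent versus the price n periods ago."""
    out = []
    for i, price in enumerate(prices):
        if i < n:
            out.append(None)  # assumption: no value until n prior periods exist
        else:
            previous = prices[i - n]
            out.append((price - previous) / previous * 100)
    return out


prices = [100.0, 110.0, 121.0, 99.0]
one_period = roc(prices, 1)
print([None if v is None else round(v, 2) for v in one_period])
# [None, 10.0, 10.0, -18.18]
```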

MAD - Mean Absolute Deviation

The mean absolute deviation of a dataset is the average distance between each data point and the mean. It gives us an idea about the variability in a dataset.

Formula:

  1. Calculate the mean.
  2. Calculate how far away each data point is from the mean using positive distances. These are called absolute deviations.
  3. Add those deviations together.
  4. Divide the sum by the number of data points.

Example:
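
The four steps above can be sketched in plain Python for a single window of data. `mean_absolute_deviation` is a hypothetical name; the package applies the statistic per column and window, which this sketch does not attempt to reproduce.

```python
def mean_absolute_deviation(data):
    """Average absolute distance of each data point from the mean."""
    mean = sum(data) / len(data)                 # step 1: the mean
    deviations = [abs(x - mean) for x in data]   # step 2: absolute deviations
    return sum(deviations) / len(data)           # steps 3 and 4: sum, then divide


print(mean_absolute_deviation([2, 4, 6, 8]))  # 2.0
```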

TRIX - Triple Exponential Average

The triple exponential average is used to identify oversold and overbought markets.

The algorithm is:

TRIX = (TripleEMA - LastTripleEMA) * 100 / LastTripleEMA
TripleEMA = EMA of EMA of EMA
LastTripleEMA =  TripleEMA of the last period

It requires column and window. By default, the column is close, the window is 12.

Use set_dft_window('trix', n) to change the default window.

Examples:

TEMA - Another Triple Exponential Average

TEMA is another implementation of the triple exponential moving average.

TEMA=(3 x EMA) - (3 x EMA of EMA) + (EMA of EMA of EMA)

It takes two parameters, column and window. By default, the column is close, the window is 5.

Use set_dft_window('tema', n) to change the default window.

Examples:
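
To make the TEMA formula concrete, here is a minimal plain-Python sketch. It assumes a simple recursive EMA with smoothing factor 2 / (window + 1), seeded with the first value; the library's own EMA may be configured differently, so the numbers need not match its output exactly. `ema` and `tema` are hypothetical helper names.

```python
def ema(values, window):
    """Recursive EMA (assumed form): alpha = 2 / (window + 1), seeded with values[0]."""
    alpha = 2.0 / (window + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def tema(values, window):
    """TEMA = 3 * EMA - 3 * (EMA of EMA) + (EMA of EMA of EMA)."""
    e1 = ema(values, window)
    e2 = ema(e1, window)
    e3 = ema(e2, window)
    return [3 * a - 3 * b + c for a, b, c in zip(e1, e2, e3)]


close = [11.20, 10.29, 10.53, 10.55, 10.10]
print([round(v, 4) for v in tema(close, 5)])
```

Subtracting the doubly smoothed series and adding back the triply smoothed one reduces the lag that a single EMA shows on trending data.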

VR - Volume Variation Index

It is the strength index of the trading volume.

It has a default window of 26. Change it with set_dft_window('vr', n).

Examples:

WR - Williams Overbought/Oversold Index

Williams Overbought/Oversold index is a type of momentum indicator that moves between 0 and -100 and measures overbought and oversold levels.

It takes a window parameter. The default window is 14. Use set_dft_window('wr', n) to change the default window.

Examples:

CCI - Commodity Channel Index

CCI stands for Commodity Channel Index.

It requires a window parameter. The default window is 14. Use set_dft_window('cci', n) to change it.

Examples:

TR - True Range of Trading

TR is a measure of the volatility of a High-Low-Close series. It is used for calculating the ATR.

ATR - Average True Range

The Average True Range is an N-period smoothed moving average (SMMA) of the true range value.
It defaults to 14 periods.

Users can modify the default window with set_dft_window('atr', n).

Example:

Supertrend

Supertrend indicates the current trend.
We use the algorithm described here. It includes 3 lines:

It has 2 parameters:

DMA - Difference of Moving Average

df['dma'] retrieves the difference between the 10-period SMA and the 50-period SMA of the close price.

DMI - Directional Movement Index

The directional movement index (DMI) identifies in which direction the price of an asset is moving.

It has several lines:

It has several parameters.

KDJ Indicator

The stochastic oscillator is a momentum indicator that uses support and resistance levels.

It includes three lines:

The default window is 9. Use set_dft_window('kdjk', n) to change it. Use df['kdjk_6'] to retrieve the K series of 6 periods.

KDJ also has two configurable parameters named StockDataFrame.KDJ_PARAM. The default value is (2.0/3.0, 1.0/3.0)

CR - Energy Index

The Energy Index (Intermediate Willingness Index) uses the relationship between the highest price, the lowest price and yesterday's middle price to reflect the market's willingness to buy and sell.

It contains 4 lines:

Typical Price

It's the average of high, low and close. Use df['middle'] to access this value.

When amount is available, middle = amount / volume. This should be more accurate because amount represents the total cash flow.
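
The high/low/close average can be checked by hand against the sample rows shown earlier in this document. This plain-Python sketch covers only that average, not the amount / volume variant; `typical_price` is a hypothetical helper name.

```python
def typical_price(high, low, close):
    """Average of the high, low and close prices of one period."""
    return (high + low + close) / 3


# First two rows of the sample data (20040817 and 20040818).
print(round(typical_price(12.21, 11.03, 11.20), 6))  # 11.48
print(round(typical_price(10.90, 10.29, 10.29), 6))  # 10.493333
```

Both results agree with the middle column printed in the "Count of Non-Zero Value" example.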

Bollinger Bands

The Bollinger Bands include three lines:

The default window of boll is 20. You can also supply your window with df['boll_10']. It will also generate the boll_ub_10 and boll_lb_10 columns.

The default period of the Bollinger Bands can be changed with set_dft_window('boll', n). The width of the bands can be tuned with StockDataFrame.BOLL_STD_TIMES. The default value is 2.

MACD - Moving Average Convergence Divergence

We use the close price to calculate the MACD lines.

The periods of the short EMA, the long EMA and the signal line can be tuned with set_dft_window('macd', (short, long, signal)). The default windows are 12, 26 and 9.

PPO - Percentage Price Oscillator

The Percentage Price Oscillator includes three lines.

The periods of the short EMA, the long EMA and the signal line can be tuned with set_dft_window('ppo', (short, long, signal)). The default windows are 12, 26 and 9.

Simple Moving Average

Follow the pattern <columnName>_<window>_sma to retrieve a simple moving average.

Moving Standard Deviation

Follow the pattern <columnName>_<window>_mstd to retrieve the moving STD.

Moving Variance

Follow the pattern <columnName>_<window>_mvar to retrieve the moving VAR.

Volume Weighted Moving Average

It's the moving average weighted by volume.

It has a parameter for window size. The default window is 14. Change it with set_dft_window('vwma', n).

Examples:

CHOP - Choppiness Index

The Choppiness Index determines if the market is choppy.

It has a parameter for window size. The default window is 14. Change it with set_dft_window('chop', n).

Examples:

MFI - Money Flow Index

The Money Flow Index identifies overbought or oversold signals in an asset.

It has a parameter for window size. The default window is 14. Change it with set_dft_window('mfi', n).

Examples:

ERI - Elder-Ray Index

The Elder-Ray Index contains the bull and the bear power. Both are calculated based on the EMA of the close price.

The default window is 13.

Formula:

Examples:

KER - Kaufman's efficiency ratio

The Efficiency Ratio (ER) is calculated by dividing the price change over a period by the absolute sum of the price movements that occurred to achieve that change.

The resulting ratio ranges between 0 and 1 with higher values representing a more efficient or trending market.

The default column is close.

The default window is 10.

Formula:

Examples:
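
The efficiency ratio described above can be sketched in plain Python as follows. This is an illustration of the definition only, not the library's code; `efficiency_ratio` is a hypothetical name, and returning None for rows without a full window and 0.0 for a flat window are assumptions.

```python
def efficiency_ratio(prices, window):
    """Net price change over `window` periods divided by the sum of absolute moves."""
    out = []
    for i in range(len(prices)):
        if i < window:
            out.append(None)  # assumption: undefined until a full window exists
            continue
        change = abs(prices[i] - prices[i - window])
        volatility = sum(abs(prices[k] - prices[k - 1])
                         for k in range(i - window + 1, i + 1))
        out.append(change / volatility if volatility else 0.0)
    return out


trending = [1.0, 2.0, 3.0, 4.0, 5.0]
choppy = [1.0, 2.0, 1.0, 2.0, 1.0]
print(efficiency_ratio(trending, 4)[-1])  # 1.0
print(efficiency_ratio(choppy, 4)[-1])    # 0.0
```

A steady trend gives a ratio of 1 because every move contributes to the net change, while a series that oscillates back to its starting point gives 0.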

KAMA - Kaufman's Adaptive Moving Average

Kaufman's Adaptive Moving Average is designed to account for market noise or volatility.

It has 2 optional parameters and 2 required parameters

The default value for window, fast and slow can be configured with set_dft_window('kama', (10, 5, 34))

Examples:

Cross Upwards and Cross Downwards

Use the pattern <A>_xu_<B> to check when A crosses up B.

Use the pattern <A>_xd_<B> to check when A crosses down B.

Use the pattern <A>_x_<B> to check when A crosses B.

Examples:
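
One way to read the three patterns is as changes in the state "A is above B" between consecutive rows. The sketch below expresses that reading in plain Python. The helper names are hypothetical, and details such as how ties (A equal to B) are handled are assumptions rather than documented library behaviour.

```python
def _above(a, b):
    """True where A is strictly above B."""
    return [x > y for x, y in zip(a, b)]


def cross_up(a, b):
    """True where A was not above B on the previous row and is above B now."""
    above = _above(a, b)
    if not above:
        return []
    return [False] + [now and not before for before, now in zip(above, above[1:])]


def cross_down(a, b):
    """True where A was above B on the previous row and is no longer above B."""
    above = _above(a, b)
    if not above:
        return []
    return [False] + [before and not now for before, now in zip(above, above[1:])]


def cross(a, b):
    """True where A crosses B in either direction."""
    return [u or d for u, d in zip(cross_up(a, b), cross_down(a, b))]


a = [1, 2, 4, 2, 1]
b = [3, 3, 3, 3, 3]
print(cross_up(a, b))    # [False, False, True, False, False]
print(cross_down(a, b))  # [False, False, False, True, False]
print(cross(a, b))       # [False, False, True, True, False]
```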

Aroon Oscillator

The Aroon Oscillator measures the strength of a trend and the likelihood that it will continue.

The default window is 25.

Examples:

Z-Score

Z-score is a statistical measurement that describes a value's relationship to the mean of a group of values.

There is no default column name or window for Z-Score.

The statistical formula for a value's z-score is calculated using the following formula:

z = ( x - μ ) / σ

Where:

Examples:
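
As an illustration of the z-score formula, here is a sketch over a single group of values using the standard library. It uses the population standard deviation, which is an assumption; the package computes the score per column and window and may use a different estimator. `z_scores` is a hypothetical name.

```python
import statistics


def z_scores(values):
    """z = (x - mean) / standard deviation for each x in one group of values."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # assumption: population standard deviation
    return [(x - mu) / sigma for x in values]


scores = z_scores([2.0, 4.0, 6.0])
print([round(z, 4) for z in scores])  # [-1.2247, 0.0, 1.2247]
```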

Awesome Oscillator

The Awesome Oscillator (AO) measures market dynamics. It reflects specific changes in the driving force of the market, which helps to identify the strength of a trend, including the points of its formation and reversal.

Awesome Oscillator Formula

Examples:

Balance of Power

Balance of Power (BOP) measures the strength of the bulls vs. bears.

Formula:

BOP = (close - open) / (high - low)

Example:
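
The BOP formula applied to a single bar looks like this in plain Python. `balance_of_power` is a hypothetical helper, and returning 0.0 when high equals low is an assumption made to avoid a division by zero.

```python
def balance_of_power(open_, high, low, close):
    """BOP = (close - open) / (high - low) for one bar."""
    if high == low:
        return 0.0  # assumption: treat a zero-range bar as neutral
    return (close - open_) / (high - low)


print(balance_of_power(open_=10.0, high=12.0, low=8.0, close=11.0))  # 0.25
print(balance_of_power(open_=11.0, high=12.0, low=8.0, close=10.0))  # -0.25
```

A positive value means the bar closed above its open (bulls in control), and a negative value means it closed below.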

Chande Momentum Oscillator

https://www.investopedia.com/terms/c/chandemomentumoscillator.asp

The Chande Momentum Oscillator (CMO) is a technical momentum indicator developed by Tushar Chande.

The formula calculates the difference between the sum of recent gains and the sum of recent losses and then divides the result by the sum of all price movements over the same period.

The default window is 14.

Formula:

CMO = 100 * ((sH - sL) / (sH + sL))

where:

Examples:
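
The CMO formula can be sketched in plain Python for one window of closing prices. It assumes the usual definitions, with sH as the sum of gains and sL as the sum of losses (as positive numbers) between consecutive closes. `cmo` is a hypothetical name, and returning 0.0 for a flat window is an assumption.

```python
def cmo(closes):
    """CMO = 100 * (sH - sL) / (sH + sL) over one window of closes."""
    changes = [b - a for a, b in zip(closes, closes[1:])]
    s_h = sum(c for c in changes if c > 0)    # sum of gains
    s_l = sum(-c for c in changes if c < 0)   # sum of losses, as positive numbers
    if s_h + s_l == 0:
        return 0.0  # assumption: no price movement means no momentum
    return 100 * (s_h - s_l) / (s_h + s_l)


print(round(cmo([10.0, 11.0, 10.5, 12.0]), 2))  # 66.67
print(cmo([1.0, 2.0, 3.0]))                     # 100.0
print(cmo([3.0, 2.0, 1.0]))                     # -100.0
```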

Coppock Curve

Coppock Curve is a momentum indicator that signals long-term trend reversals.

Formula:

Coppock Curve = 10-period WMA of (14-period RoC + 11-period RoC)
WMA = Weighted Moving Average
RoC = Rate-of-Change

Examples:

Ichimoku Cloud

The Ichimoku Cloud is a collection of technical indicators that show support and resistance levels, as well as momentum and trend direction.

In this implementation, we only calculate the delta between lead A and lead B (which is the width of the cloud).

It contains three windows:

Formula:

Where:

Examples:

Linear Regression Moving Average

Linear regression works by taking various data points in a sample and providing a “best fit” line to match the general trend in the data.

Implementation reference:

https://github.com/twopirllc/pandas-ta/blob/main/pandas_ta/overlap/linreg.py

Examples:

Correlation Trend Indicator

Correlation Trend Indicator is a study that estimates the current direction and strength of a trend.

Implementation is based on the following code:

https://github.com/twopirllc/pandas-ta/blob/main/pandas_ta/momentum/cti.py

Examples:

Gaussian Fisher Transform Price Reversals Indicator (FTR)

The Gaussian Fisher Transform Price Reversals indicator, dubbed FTR for short, is a statistics-based price reversal detection indicator inspired by and based on the work of John F. Ehlers, an electrical engineer turned private trader.

https://www.tradingview.com/script/ajZT2tZo-Gaussian-Fisher-Transform-Price-Reversals-FTR/

Implementation reference:

https://github.com/twopirllc/pandas-ta/blob/084dbe1c4b76082f383fa3029270ea9ac35e4dc7/pandas_ta/momentum/fisher.py#L9

Formula:

Examples:

Relative Vigor Index (RVGI)

The Relative Vigor Index (RVI) is a momentum indicator used in technical analysis that measures the strength of a trend by comparing a security's closing price to its trading range while smoothing the results using a simple moving average (SMA).

Formula:

where:

Examples:

Inertia Indicator

In financial markets, the concept of inertia was given by Donald Dorsey in the 1995 issue of Technical Analysis of Stocks and Commodities through the Inertia Indicator. The Inertia Indicator is moment-based and is an extension of Dorsey’s Relative Volatility Index (RVI).

Formula:

Examples:

Know Sure Thing (KST)

The Know Sure Thing (KST) is a momentum oscillator developed by Martin Pring to make rate-of-change readings easier for traders to interpret.

Formula:

Where:

Example:

Pretty Good Oscillator (PGO)

The Pretty Good Oscillator indicator by Mark Johnson measures the distance of the current close from its N-day simple moving average, expressed in terms of an average true range over a similar period.

Formula:

Example:

Psychological Line (PSL)

The Psychological Line indicator is the ratio of the number of rising periods over the total number of periods.

Formula:

Example:

Percentage Volume Oscillator (PVO)

The Percentage Volume Oscillator (PVO) is a momentum oscillator for volume. The PVO measures the difference between two volume-based moving averages as a percentage of the larger moving average.

Formula:

Example:

The periods of the short EMA, the long EMA and the signal line can be tuned with set_dft_window('pvo', (short, long, signal)). The default windows are 12, 26 and 9.

Quantitative Qualitative Estimation (QQE)

The Quantitative Qualitative Estimation (QQE) indicator works like a smoother version of the popular Relative Strength Index (RSI) indicator. QQE expands on RSI by adding two volatility-based trailing stop lines. These trailing stop lines are composed of a fast and a slow moving Average True Range (ATR). These ATR lines are smoothed, making this indicator less susceptible to short-term volatility.

Implementation reference: https://github.com/twopirllc/pandas-ta/blob/main/pandas_ta/momentum/qqe.py

Example:

The RSI period and the RSI moving-average period can be tuned with set_dft_window('qqe', (rsi, rsi_ma)). The default windows are 14 and 5.

Issues

We use GitHub Issues to track issues and bugs.

Others

MACDH Note:

In July 2017 the code for MACDH was changed to drop an extra 2x multiplier on the final value to align better with calculation methods used in tools like cryptowatch, tradingview, etc.

Contact author: