Heerozh / spectre

GPU-accelerated Factors analysis library and Backtester
GNU General Public License v3.0

Updating library to be compatible with latest pandas #18

Closed FilipBolt closed 9 months ago

FilipBolt commented 9 months ago
  1. Forward-filling (ffill) the entire dataset up front instead of filling at fetch time, because of a pandas interface change.
  2. No longer slicing with sets; casting to a list instead.
  3. Minor typo fixes.
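
A minimal sketch of item 2, assuming the problem was a set being used as a column indexer: recent pandas releases warn about or reject this, so converting the set to a list before slicing is a simple workaround. The frame and column names here are made up for illustration and are not taken from spectre's code.

```python
import pandas as pd

# Hypothetical frame; the column names are illustrative only.
df = pd.DataFrame({
    "open": [1.0, 2.0],
    "close": [1.5, 2.5],
    "volume": [100, 200],
})

wanted = {"open", "close"}  # a set has no defined order

# Convert the set to a list before indexing.
# sorted() returns a list and makes the column order deterministic.
columns = sorted(wanted)
subset = df[columns]
```

Using `list(wanted)` also works; `sorted` is used here only so the resulting column order does not depend on set iteration order.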

This should help resolve issues:

  1. https://github.com/Heerozh/spectre/issues/15#issue-1001114730
  2. https://github.com/Heerozh/spectre/issues/14

Confirming that the following dataset, downloaded with yfinance, now works with this library's backtesting examples:

import yfinance as yf

def download_stock_data(
    ticker, start_date, end_date, granularity
):
    # Download daily OHLC, volume, dividends, stock splits, and other data
    stock_data = yf.download(
        ticker, start=start_date, end=end_date, actions=True,
        interval=granularity
    )

    # Calculate the 5-day moving average
    stock_data['MA5'] = stock_data['Close'].rolling(window=5).mean()

    # Keep the relevant columns; .copy() avoids SettingWithCopyWarning
    # when new columns are assigned below
    stock_data = stock_data[[
        'Open', 'High', 'Low', 'Close',
        'Volume', 'Dividends', 'Stock Splits', 'MA5'
    ]].copy()

    # Dividend amount shifted forward by one row, stored as the ex-dividend column
    stock_data['Ex-Dividend'] = stock_data['Dividends'].shift(1)

    # Extract split ratio from the stock splits column
    stock_data['Split Ratio'] = stock_data['Stock Splits'].apply(
        lambda x: 1 / x if x != 0 else 1
    )

    # Drop rows with NaN values
    stock_data = stock_data.dropna()

    # Add a column for the asset (use the ticker symbol as an example)
    stock_data['Asset'] = ticker

    # rename columns to lowercase
    stock_data.columns = stock_data.columns.str.lower()
    # rename to snake case
    stock_data.columns = stock_data.columns.str.replace(' ', '_')

    stock_data = stock_data[[
        'open', 'high', 'low', 'close',
        'volume', 'ex-dividend', 'split_ratio', 'ma5',
        'asset'
    ]]
    stock_data = stock_data.resample(granularity.upper()).ffill()
    stock_data['date'] = stock_data.index
    return stock_data

Heerozh commented 9 months ago

Great work. Would index.get_indexer([start], 'bfill')[0] be better? That way the start and end dates can fall on a holiday.

FilipBolt commented 9 months ago

Sure, index.get_indexer works as well.

Confirming that I've tried with specifying both start and end date as holidays.
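
For readers following along, here is a small sketch of the lookup discussed above, not the code from the pull request. It assumes a sorted, unique DatetimeIndex of trading days. The helper names are hypothetical, and using 'ffill' for the end date is an assumption; the thread only mentions 'bfill'.

```python
import pandas as pd


def first_pos_on_or_after(index: pd.DatetimeIndex, date) -> int:
    """Position of the first entry >= date, or -1 if there is none."""
    return int(index.get_indexer([pd.Timestamp(date)], method="bfill")[0])


def last_pos_on_or_before(index: pd.DatetimeIndex, date) -> int:
    """Position of the last entry <= date, or -1 if there is none."""
    return int(index.get_indexer([pd.Timestamp(date)], method="ffill")[0])


# Trading days only: the weekend of 2023-01-07/08 is missing.
trading_days = pd.DatetimeIndex(
    ["2023-01-05", "2023-01-06", "2023-01-09", "2023-01-10"]
)

start_pos = first_pos_on_or_after(trading_days, "2023-01-07")  # Saturday
end_pos = last_pos_on_or_before(trading_days, "2023-01-08")    # Sunday
```

Here the start date snaps to the following Monday and the end date to the preceding Friday. A result of -1 means no suitable row exists, so callers would need to check for it before slicing.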

Heerozh commented 9 months ago

Confirmed, thanks for your pull request.

About the YahooDownloader class: it would indeed be better to switch to yfinance. Your code above is a good start; if you want, you can just rewrite that class.

Regarding #15: I don't know where to get a free ticker list now. This may be a problem; alternatively, just remove it and let the user pass in the ticker list.