JerBouma / FinanceToolkit

Transparent and Efficient Financial Analysis
https://www.jeroenbouma.com/projects/financetoolkit
MIT License

get_historical_data(fill_nan=False) still triggers forward fill #111

Closed sword134 closed 5 months ago

sword134 commented 5 months ago

Consider the following code:

from financetoolkit import Toolkit

symbols = ["ACIW", "ACVA", "BBBY"]
companies = Toolkit(symbols, api_key=API_KEY, quarterly=True, start_date=start_date, sleep_timer=True, progress_bar=True, remove_invalid_tickers=False)
df_stock_price = companies.get_historical_data(fill_nan=False, period="daily")
print(df_stock_price)

This code results in the BBBY OHLC data being forward filled despite fill_nan=False. Adding a print("Forward filled") to the code that handles forward filling in get_historical_data() also causes the message to be printed in the terminal.

JerBouma commented 5 months ago

Will look into it this week!

JerBouma commented 5 months ago

Hi, I cannot reproduce this issue. You get the print statement because it acquired the risk-free rate, for which I forgot to set the fill_nan parameter; that is now fixed. Can you show me where in the data it clearly filled days in both cases? My comparison here shows that it filled the NaN values in one case and did not in the other.

Perhaps you tried to acquire the data again and didn't set overwrite=True?

image

sword134 commented 5 months ago

The only "extra" I have in my code is this:

symbols = ["ACIW", "ACVA", "BBBY"]

companies = Toolkit(symbols, api_key=API_KEY, quarterly=True, start_date=start_date, sleep_timer=True, progress_bar=True, remove_invalid_tickers=False)

custom_ratios = {
    "WC / Net Income as %": "(Working Capital / Net Income) * 100",
    "Working Capital Ratio": "Working Capital / Revenue",
}

df_ratios = companies.ratios.collect_all_ratios()
df_custom = companies.ratios.collect_custom_ratios(custom_ratios_dict=custom_ratios)
df_stats = companies.get_statistics_statement()

df_stock_price = companies.get_historical_data(fill_nan=False, period="daily")
df_ratios_growth_TTM = companies.ratios.collect_all_ratios(growth=True, lag=[1, 2, 3, 4, 5], trailing=4)

If I copy and paste this code into my IDE and run it, the BBBY data from May 2023 until today is filled with the latest observed closing price.
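One way to show the symptom in the data is to count how many rows at the end of a price series merely repeat the previous value. The sketch below uses plain pandas on synthetic numbers; the helper name trailing_repeat_length and the sample prices are made up for illustration and are not part of the FinanceToolkit.

```python
import numpy as np
import pandas as pd


def trailing_repeat_length(prices: pd.Series) -> int:
    """Count the rows at the end of the series that repeat the prior value."""
    values = prices.to_numpy()
    count = 0
    # Walk backwards while each value equals the one before it.
    for i in range(len(values) - 1, 0, -1):
        if np.isnan(values[i]) or values[i] != values[i - 1]:
            break
        count += 1
    return count


# Synthetic example: no observations exist after the third business day.
dates = pd.date_range("2023-05-01", periods=6, freq="B")
raw = pd.Series([0.20, 0.18, 0.08, np.nan, np.nan, np.nan], index=dates)
filled = raw.ffill()

print(trailing_repeat_length(raw))     # 0
print(trailing_repeat_length(filled))  # 3
```

A long trailing run on a ticker that has stopped trading is a strong hint that the values were forward filled rather than observed.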

JerBouma commented 5 months ago

If you don't want this behaviour, you'll have to call get_historical_data(period="quarterly", fill_nan=False) beforehand. The primary purpose of fill_nan is to fix holes in the data on non-trading days. I'll probably make it a bit stricter, though, so that it does not forward fill a value if the next value is also NaN.
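To illustrate the difference between a plain forward fill and the stricter rule described above, here is a minimal pandas sketch. It is an assumption about how such a rule could look, not the FinanceToolkit's actual code; the helper name fill_isolated_gaps and the sample values are hypothetical.

```python
import numpy as np
import pandas as pd


def fill_isolated_gaps(prices: pd.Series) -> pd.Series:
    """Forward fill a NaN only when the very next observation is valid."""
    followed_by_value = prices.isna() & prices.shift(-1).notna()
    # Keep the original value everywhere except at the selected NaNs.
    return prices.where(~followed_by_value, prices.ffill())


prices = pd.Series([10.0, np.nan, 11.0, 12.0, np.nan, np.nan, np.nan])

plain = prices.ffill()
strict = fill_isolated_gaps(prices)

print(plain.tolist())   # [10.0, 10.0, 11.0, 12.0, 12.0, 12.0, 12.0]
print(strict.tolist())  # [10.0, 10.0, 11.0, 12.0, nan, nan, nan]
```

The single missing day in the middle is filled in both cases, while the trailing run of NaN values is only filled by the plain forward fill.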

JerBouma commented 5 months ago

This issue has been fixed in v1.8.2; it now interpolates and doesn't forward fill.

image

As you can see, BBBY now stops early, indicating that no forward filling is happening.

image
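
For readers who want to see why interpolation leaves the end of a delisted series empty, the sketch below uses pandas' Series.interpolate with limit_area="inside", which only fills NaN values that sit between two valid observations. This is an assumption about the general technique, not a copy of what v1.8.2 does internally, and the prices are synthetic.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2023-04-24", periods=7, freq="D")
close = pd.Series([0.30, np.nan, 0.20, 0.10, np.nan, np.nan, np.nan], index=dates)

# limit_area="inside" fills gaps surrounded by valid values and leaves
# leading or trailing NaN values untouched.
interpolated = close.interpolate(method="linear", limit_area="inside")

print(interpolated)
print(interpolated.last_valid_index())  # the last date with an actual price
```

The gap on the second day is filled with a value between its neighbours, and the series still ends at the last observed price instead of repeating it until today.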