JerBouma / FinanceToolkit

Transparent and Efficient Financial Analysis
https://www.jeroenbouma.com/projects/financetoolkit
MIT License
2.99k stars, 360 forks

[BUG] Error using convert_currency=True for old financial data #134

Closed LucaColombi closed 6 months ago

LucaColombi commented 7 months ago

What's the feature that should be improved? When I download data for an old ticker (for instance PBR or SONY) starting from a very early date (for instance, 1970-01-01), it raises an error if I leave convert_currency set to True.

My hypothesis is that the available currency data does not go back far enough to cover such old reports, and that the system does not handle this case.

This case is quite frequent; for instance, all EU stocks before 2000 would fall into it.

Moreover, although this is not a bug but a feature I would appreciate, I'd like to know which currency the data in the dataframe is in. That is currently hidden and could change over the years, so it cannot simply be deduced from the ticker's current data.

Describe how you would like the feature improved A simpler but effective solution for both problems would be to add a column to the dataframe indicating the currency of the values, such that:

Possibly describe the ideal way to improve this A more sophisticated approach could be to give the user some options for what to do when the currency of a data row cannot be converted, for instance a parameter currency_convertion_strategy with options:

Additional information Here is the error stack for SONY from 1970-01-01 with convert_currency=True:

File C:/Users/l_c/Documents/Python/algos5/Crawler.py:311
-> 311 cf = toolkit.get_cash_flow_statement(trailing=1)

File c:\Users\l_c\Documents\Python\algos5\.venv\Lib\site-packages\financetoolkit\toolkit_controller.py:3160, in Toolkit.get_cash_flow_statement(self, overwrite, rounding, growth, lag, trailing, progress_bar)
   3152 if convert_currency:
   3153     self.get_exchange_rates(
   3154         period="quarterly" if self._quarterly else "yearly",
   3155         progress_bar=progress_bar
   3156         if progress_bar is not None
   3157         else self._progress_bar,
   3158     )
-> 3160     cash_flow_statement = helpers.convert_currencies(
   3161         financial_statement_data=cash_flow_statement,
   3162         financial_statement_currencies=self._statement_currencies,
   3163         exchange_rate_data=self.get_exchange_rates(
   3164             period="quarterly" if self._quarterly else "yearly"
   3165         )["Adj Close"],
   3166         financial_statement_name="cash flow statement",
   3167     )
   3169 cash_flow_statement = cash_flow_statement.round(
   3170     rounding if rounding else self._rounding
   3171 )
   3173 if len(self._tickers) == 1 and not self._cash_flow_statement.empty:

File c:\Users\l_c\Documents\Python\algos5\.venv\Lib\site-packages\financetoolkit\helpers.py:190, in convert_currencies(financial_statement_data, financial_statement_currencies, exchange_rate_data, items_not_to_adjust, financial_statement_name)
    183         else:
    184             items_to_adjust = (
    185                 financial_statement_data.index.get_level_values(level=1)
    186             )
    188         financial_statement_data.loc[(ticker, items_to_adjust), :] = (
    189             financial_statement_data.loc[(ticker, items_to_adjust), :].mul(
--> 190                 exchange_rate_data.loc[periods, currency], axis=1
    191             )
    192         ).to_numpy()
    194         currencies[currency].append(ticker)
    195 else:

File c:\Users\l_c\Documents\Python\algos5\.venv\Lib\site-packages\pandas\core\indexing.py:1184, in _LocationIndexer.__getitem__(self, key)
   1182     if self._is_scalar_access(key):
   1183         return self.obj._get_value(*key, takeable=self._takeable)
-> 1184     return self._getitem_tuple(key)
   1185 else:
   1186     # we by definition only have the 0th axis
   1187     axis = self.axis or 0

File c:\Users\l_c\Documents\Python\algos5\.venv\Lib\site-packages\pandas\core\indexing.py:1371, in _LocIndexer._getitem_tuple(self, tup)
   1368     return self._getitem_lowerdim(tup)
   1370 # no multi-index, so validate all of the indexers
-> 1371 tup = self._validate_tuple_indexer(tup)
   1373 # ugly hack for GH #836
   1374 if self._multi_take_opportunity(tup):

File c:\Users\l_c\Documents\Python\algos5\.venv\Lib\site-packages\pandas\core\indexing.py:962, in _LocationIndexer._validate_tuple_indexer(self, key)
    957 @final
    958 def _validate_tuple_indexer(self, key: tuple) -> tuple:
    959     """
    960     Check the key for valid keys across my indexer.
    961     """
--> 962     key = self._validate_key_length(key)
    963     key = self._expand_ellipsis(key)
    964     for i, k in enumerate(key):

File c:\Users\l_c\Documents\Python\algos5\.venv\Lib\site-packages\pandas\core\indexing.py:1001, in _LocationIndexer._validate_key_length(self, key)
    999             raise IndexingError(_one_ellipsis_message)
   1000         return self._validate_key_length(key)
-> 1001     raise IndexingError("Too many indexers")
   1002 return key
IndexingError: Too many indexers
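
For readers following the traceback: pandas raises this particular error when .loc receives more indexers than the object has dimensions. The sketch below reproduces it with plain pandas, assuming (this is not confirmed anywhere in the thread) that the exchange rate object ends up as a one-dimensional Series instead of a two-dimensional DataFrame. The column name and the rate values are made-up placeholders.

```python
import pandas as pd

periods = [1998, 1999]

# Two-dimensional case: rows are periods, columns are currency pairs.
# The rate values are placeholders, not real exchange rates.
rates_frame = pd.DataFrame({"JPYUSD=X": [1.0, 2.0]}, index=periods)
selected = rates_frame.loc[periods, "JPYUSD=X"]  # two indexers, two dimensions: fine

# One-dimensional case: selecting a single column yields a Series.
rates_series = rates_frame["JPYUSD=X"]
try:
    rates_series.loc[periods, "JPYUSD=X"]  # two indexers, one dimension
except Exception as error:
    error_name = type(error).__name__
    error_message = str(error)
    print(error_name, error_message)
```

Whether this is what actually happens inside convert_currencies for SONY would need to be checked against the library itself.
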
JerBouma commented 7 months ago

Hi! Thank you for reporting this.

Unfortunately, it is quite complicated to track a company's individual currency changes, given that data sources tend to auto-adjust past prices based on the current currency.

For example, ASML has been listed since 1995 and is reported in euros even though the euro itself only came into existence in 1999. This means that if you used a 1995 financial statement denominated in Dutch guilders, it wouldn't match up with the historical data, and you couldn't convert it to euros given that there is no historical data available.

For the time being, it would work to split the data collection into before 2000 and after 2000, so that one subset has its currency converted and the other does not, and to pay extra attention to the data before 2000. This will remain a tricky area, even in the financial industry itself.
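
A minimal pandas-only sketch of that split, applied to a statement that was already collected without conversion. The function name convert_from_cutoff, the integer year columns, and the rate values are all hypothetical and chosen for illustration; this is not FinanceToolkit's API.

```python
import pandas as pd


def convert_from_cutoff(statement, rates, cutoff_year=2000):
    """Convert columns from cutoff_year onward and leave older columns untouched.

    statement: rows are line items, columns are integer years (assumed layout).
    rates: Series of conversion multipliers indexed by year (assumed layout).
    """
    older = [year for year in statement.columns if year < cutoff_year]
    recent = [year for year in statement.columns if year >= cutoff_year]
    # Multiply each recent column by the rate of the same year.
    converted = statement[recent].mul(rates.reindex(recent), axis=1)
    return pd.concat([statement[older], converted], axis=1)


# Placeholder figures and multipliers, not real data.
statement = pd.DataFrame(
    {1998: [100.0], 1999: [110.0], 2000: [120.0], 2001: [130.0]},
    index=["Revenue"],
)
rates = pd.Series({2000: 2.0, 2001: 0.5})
result = convert_from_cutoff(statement, rates)
print(result)
```

The older columns stay in their reported currency, so the two halves of the result are not directly comparable without further care.
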

LucaColombi commented 7 months ago

Hello,

that workaround works only for EU financial statements; for SONY, PBR or others it doesn't work at all.

The system breaks when requesting older data with automatic currency conversion enabled.

An important detail I omitted: this happens when requesting data from FMP.

But this also suggests a solution, because FMP gives explicit data about the currency of every single statement:

https://site.financialmodelingprep.com/developer/docs#balance-sheet-statements-financial-statements

[
    {
        "date": "2022-09-24",
        "symbol": "AAPL",
        "reportedCurrency": "USD",
        "cik": "0000320193",

Could you just copy that reportedCurrency column into the dataframe as is? I can retrieve the currency data separately and handle the rest of the workaround myself.

Users in my situation could disable the automatic conversion and handle it manually with just this column added. It could also be useful for users to know the original currency of the financial statements.
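
As a rough illustration of the request, the sketch below turns records shaped like the FMP snippet above into a per-year currency lookup. Only the date, symbol and reportedCurrency fields come from that snippet; the second record and the use of the calendar year as a key are assumptions made for the example.

```python
import pandas as pd

# Records shaped like the FMP response above, reduced to the relevant fields.
statements = [
    {"date": "2022-09-24", "symbol": "AAPL", "reportedCurrency": "USD"},
    # Illustrative second record, added only to show more than one row.
    {"date": "2021-09-25", "symbol": "AAPL", "reportedCurrency": "USD"},
]

frame = pd.DataFrame(statements)
frame["year"] = pd.to_datetime(frame["date"]).dt.year

# One currency per (symbol, year), ready to sit next to the statement data.
currency_by_year = frame.set_index(["symbol", "year"])["reportedCurrency"]
print(currency_by_year)
```
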

JerBouma commented 7 months ago

Hi @LucaColombi,

I actually collect this information automatically through get_statistics_statement. You could have a look if this works for you. See: https://www.jeroenbouma.com/projects/financetoolkit/docs#get_statistics_statement

I'll most likely build in a warning when currency transformation fails given that it should work fine for most companies after 2000.
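
One possible shape for such a warning, as a sketch only: the helper convert_or_warn, its arguments, and the data layout are hypothetical and do not describe how the library implements it.

```python
import warnings

import pandas as pd


def convert_or_warn(statement, rates, ticker):
    """Convert a statement, or warn and return it unchanged if rates are missing.

    statement: rows are line items, columns are periods (assumed layout).
    rates: Series of conversion multipliers indexed by period (assumed layout).
    """
    aligned = rates.reindex(statement.columns)
    missing = aligned[aligned.isna()].index.tolist()
    if missing:
        warnings.warn(
            f"No exchange rate for {ticker} in periods {missing}; "
            "returning unconverted data.",
            stacklevel=2,
        )
        return statement
    return statement.mul(aligned, axis=1)


# Placeholder figures: a rate exists for 2005 but not for 1995.
statement = pd.DataFrame({1995: [10.0], 2005: [20.0]}, index=["Net Income"])
rates = pd.Series({2005: 1.5})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = convert_or_warn(statement, rates, "SONY")

print(len(caught), "warning(s) raised")
```

Returning the unconverted data is one design choice among several; converting only the periods that do have a rate would be another.
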

Please let me know if this helps and what you come up with.

JerBouma commented 7 months ago

Hi @LucaColombi,

I wasn't able to grab my laptop for a while but I was looking into this and everything seems to work fine from my side. Are you on the latest release (v1.8.5) and are you on a Premium FMP plan?

For example this is SONY with convert_currency=True:

image

When I look at ASML, this also works just fine:

image

And the Statistics Statement for both:

image

Unless you experience different results, this issue seems to be resolved.

LucaColombi commented 7 months ago

Hello,

I upgraded from 1.8.4 to 1.8.5.

It seems to work. Out of curiosity, what changed from .4 to .5? I installed it just one month ago and it downloaded .4 automatically.

Also, could I ask about one detail? In FMP the statement currency is declared statement by statement, so when the library converts currencies, does it use each statement's own currency, or always the stock's latest registered currency? Non-US stocks could have changed currency over time, and using the latest one could distort older reports. I just want to be sure.

JerBouma commented 7 months ago

I keep track of my releases in the "Releases" section, see: https://github.com/JerBouma/FinanceToolkit/releases/tag/v1.8.5

Regarding your comment: it matches the currency per year that is in the Statistics Statement. However, this tends to be the same currency most of the time, and in some cases the financial statements are already translated to USD (or EUR in the case of ASML).
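
To make "matching the currency per year" concrete, here is a small pandas sketch. The data layout, the currency change in 2021, and the multipliers are invented for illustration and say nothing about the library's internals.

```python
import pandas as pd

# Hypothetical per-year reporting currency, in the spirit of the Statistics Statement.
currency_by_year = pd.Series({2019: "JPY", 2020: "JPY", 2021: "USD"})

# Placeholder multipliers into USD per year and currency (not real rates).
rates_to_usd = pd.DataFrame(
    {"JPY": [0.5, 0.25, 0.25], "USD": [1.0, 1.0, 1.0]},
    index=[2019, 2020, 2021],
)

# For each year, pick the rate that belongs to that year's own currency.
rate_per_year = pd.Series(
    {year: rates_to_usd.at[year, currency] for year, currency in currency_by_year.items()}
)

statement = pd.DataFrame(
    {2019: [100.0], 2020: [100.0], 2021: [100.0]}, index=["Revenue"]
)
converted = statement.mul(rate_per_year, axis=1)
print(converted)
```

Using a single, latest currency for all years would instead apply the USD column everywhere and leave the 2019 and 2020 figures unconverted.
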