portfolio-performance / portfolio

Track and evaluate the performance of your investment portfolio across stocks, cryptocurrencies, and other assets.
http://www.portfolio-performance.info
Eclipse Public License 1.0
2.75k stars 576 forks source link

Feature Request: Extract ETF Asset Allocation (Countries, Sectors, Currencies) from ETF Holdings List #1665

Open fbuchinger opened 3 years ago

fbuchinger commented 3 years ago

In the sales prospectuses of ETFs there is never a comprehensive country / currency / sector classification, mostly it reads: "55% USA, 7% Japan, ..., 12% other". If you have several ETFs, these inaccuracies can build up and you start a rebalancing, although there is no need to do so. By parsing the fund holdings reports provided by the ETF emittents, we can provide an accurate and automatically updateable ETF/fund classification.

As the manager of an ETF portfolio using PP, I want an automatically updated asset classification so that I can rebalance my portfolio properly and make informed investment decisions

How could this feature work?

For each fund, the user has to configure the following attributes in the security properties (Ctrl+E):

In addition, the user needs to define in the global settings which classifications should be auto-generated for the portfolio:

After confirming these changes, the configured classifications are auto-generated and updated at the defined intervals.

Open Questions:

fbuchinger commented 3 years ago

Update: I wrote a small python script that automatically classifies ETFs by their invested currencies as a small proof-of-concept. I plan to extend the auto-classification to industries (Branchen), Countries and maybe even the Top 10 shares held by an ETF portfolio. Data source for the classification is the list of etf assets as provided by the ETF issuer (iShares, xtrackers etc)

How it works:

  1. edit each ETF security, an add a "Provider" (Anbieter) attribute on the attributes tab. The value of the attribute is the URL of the holdings list that can be obtained from the ETF issuer website, e.g. https://www.ishares.com/de/privatanleger/de/produkte/270051/ishares-msci-world-momentum-factor-ucits-etf -> Link to "Fondspositionen und Kennzahlen")
  2. make a copy of your Portfolio performance file as unencrypted XML (e.g. pp.xml). Keep the original
  3. Run the script python etf-classifier.py <path to pp.xml> > classified.pp.xml Important: NEVER overwrite the original pp.xml, the classification is experimental and can lead to data loss!!!!
  4. open classified.pp.xml in Portfolio Performance and check out the additional "currency" conversion.

UPDATE: here is my experimental Python script: https://github.com/fbuchinger/etf-classification-pp and two before/after-autoclassification PP files autoclassifier.zip

NEW - Top 10 Holdings of ETF Portfolio and their share compared to the overall portfolio
image

Here is a screenshot of the auto-classified ETF currencies:

image

and the aggregated country exposure (TODO: group by regions): image

and finally the industries (Branchen): image

CC: #1775, #503

fbuchinger commented 3 years ago

@buchen @Ragas13 any feedback?

bosix commented 3 years ago

I would love to see this feature been implemented.

Maybe an api should be used for fetching etf details instead of parsing a webside. Currently, I'm working with https://etf-data.com for another project but they only provide sector and region data.

bosix commented 3 years ago

Well, since it's a task I do not very often (only a couple of times a year or if I add a new etf to my portfolio), I built a cli tool that fetches the etfs from portfolio performance, extract the data needed from onvista.de and add everything to the portfolio xml.

Unfortunately, I can't get my ide up ready to implement the code directly into portfolio performance but it's working for me very well with the workaround since I also implemented an option to update existing classifications and a mapping mechanism for equal but different labeled countries/industries/etc. (e.g. USA vs. United States vs, United States of America). If anyone is interested, I can share the tool or the source code as well.

Here a screen of the result with nested countries: image

Thanks to @fbuchinger for the inspiration and for sharing the proof of concept.

fbuchinger commented 3 years ago

@bosix nice screenshot! Does your tool modify the portfolio.xml in-place? I was also thinking about updating the generated classifications, but this is a multi-faceted problem as you would need to discern them from manually created classifications. Furthermore, the user shouldn't modify auto-generated classifications.

In that regard, I like your "(auto-generated)" title approach. Simple, but efficient. I would even go one step further and add the generation date to the title - this allows the user to spot outdated classifications quickly.

How do you identify the ISINs that belong to ETFs/funds (vs those belonging to shares)? Does onvista provide an official API or does you solution rely on screenscraping? I am definitely interested in the source code of your tool.

bosix commented 3 years ago

@fbuchinger thank you :).

Yes, the tool modifies the data in place. It creates a backup before that just to be sure and you should not have portfolio performance open when updating the file since I cannot say what actually happens when you do this. Also, as you said: if the user modifies the auto-generated classifications, all changes are overwritten the next time the tool goes over it.

While developing the tool there were two ideas in my mind to detect the created classifications:

  1. give them a special label (the method I did)
  2. give their ids a prefix You could prefix the ids of the taxonomies to identify them and add the date to the label so the user sees when the classification's last update was.

About detecting etfs: well...I just don't :smile:. If you ask onvista towards an isin which is no etf/fond, it just returns no data so I go through all isins and if you have e.g. a share directly in your depot (amazon, google, ...) it will be unclassified in the auto-generated classifications. It would be an improvement in the future to fetch data for them as well but at the moment that is no priority for me. The api I use is semi-official I think. Onvista draws nice charts on their page via javascript and this javascript just calls a backend endpoint together with the isin to get the displayed data and I do the same in my tool. Here an example with curl:

curl --request POST \
    --url https://www.onvista.de/etf/ajax/etfBreakdownList \
    --header 'Content-Type: multipart/form-data' \
    --form isin=DE0009848119 \
    --form type=countries

Don't be confused by the path. My short tests prove that the endpoint works with all kinds of fonds (afaik).

I've uploaded the source code here. It's a bit messed up yet but I hope you can get the idea. Just for a short overview:

A short notice on locales: currently, the way I fetch and process data, the output for labels will always be german. Another api would be needed to support different languages.

fmms commented 3 years ago

For me this would be a super awesome feature. Just as an idea there is the Finnhub API https://finnhub.io/docs/api#etfs-industry-exposure which might bring some of the needed classifications without a dependency to the issuer.

layoutanalysis commented 3 years ago

@fmms thanks for the input! I already pointed out in #1775 that the Finnhub API would be a better choice if the auto-classification feature was implemented in PP directly. PP already uses the Finnhub API in a few places and would just need to implement some additional calls.

The only downside is that you will loose the "Top 10 Holdings across all ETFs" chart (since it requires a complete holding list of all ETFs which the Finnhub API doesn't provide yet).

fbuchinger commented 3 years ago

@fmms I gave the Finnhub API a try. The good news: you can even access the fund holding list, with would allow a "Top 10 Holdings" chart (not sure if this is possible in the free tier forever).

The bad news: ETF data is rather incomplete, I was unable to get industry/country info for the following ETFs: IUSN, EUNL, IS3N, EUNK and CEMR.

Another think i noticed is that the industry classification doesn't follow any methodology - categories like "Specialty Retail" neither appear in the GICS methodology used by MSCI nor in the ICB framework used by FTSE.

ghost commented 3 years ago

The bad news: ETF data is rather incomplete, I was unable to get industry/country info for the following ETFs: IUSN, EUNL, IS3N, EUNK and CEMR

I've just tried with the Ishares ENUK one, which is ISIN IE00B4K48X80. And gathering the ISIN via finnhhb API with the API key I get the following response:

https://finnhub.io/api/v1/etf/country?isin=IE00B4K48X80&token=xxxxxxxxxxx

{
   "countryExposure":[
      {
         "country":"United Kingdom of Great Britain and Northern Ireland (the)",
         "exposure":20.72
      },
      {
         "country":"France",
         "exposure":16.56
      },
      {
         "country":"Switzerland",
         "exposure":15.84
      },
      {
         "country":"Germany",
         "exposure":14.81
      },
      {
         "country":"Netherlands (the)",
         "exposure":8.51
      },
      {
         "country":"Sweden",
         "exposure":5.08
      },
      {
         "country":"Denmark",
         "exposure":3.95
      },
      {
         "country":"Spain",
         "exposure":3.91
      },
      {
         "country":"Italy",
         "exposure":3.32
      },
      {
         "country":"Finland",
         "exposure":1.95
      },
      {
         "country":"Ireland",
         "exposure":1.55
      },
      {
         "country":"Belgium",
         "exposure":1.47
      },
      {
         "country":"Norway",
         "exposure":0.95
      },
      {
         "country":"Luxembourg",
         "exposure":0.44
      },
      {
         "country":"Austria",
         "exposure":0.28
      },
      {
         "country":"Portugal",
         "exposure":0.27
      },
      {
         "country":"Isle of Man",
         "exposure":0.09
      },
      {
         "country":"Mexico",
         "exposure":0.03
      }
   ],
   "symbol":"SMEA.L"
}

https://finnhub.io/api/v1/etf/country?isin=IE00B4K48X80&token=xxxxxxxxxxx

{
   "sectorExposure":[
      {
         "exposure":10.45,
         "industry":"Pharmaceuticals"
      },
      {
         "exposure":6.66,
         "industry":"Banks"
      },
      {
         "exposure":5.27,
         "industry":"Insurance"
      },
      {
         "exposure":4.77,
         "industry":"Textiles, Apparel \u0026 Luxury Goods"
      },
      {
         "exposure":4.47,
         "industry":"Food Products"
      },
      {
         "exposure":4.14,
         "industry":"Chemicals"
      },
      {
         "exposure":3.78,
         "industry":"Oil, Gas \u0026 Consumable Fuels"
      },
      {
         "exposure":2.97,
         "industry":"Electric Utilities"
      },
      {
         "exposure":2.79,
         "industry":"Semiconductors \u0026 Semiconductor Equipment"
      },
      {
         "exposure":2.76,
         "industry":"Beverages"
      },
      {
         "exposure":2.72,
         "industry":"Capital Markets"
      },
      {
         "exposure":2.69,
         "industry":"Metals \u0026 Mining"
      },
      {
         "exposure":2.63,
         "industry":"Personal Products"
      },
      {
         "exposure":2.37,
         "industry":"Electrical Equipment"
      },
      {
         "exposure":2.36,
         "industry":"Machinery"
      },
      {
         "exposure":2.25,
         "industry":"Diversified Telecommunication Services"
      },
      {
         "exposure":2.12,
         "industry":"Automobiles"
      },
      {
         "exposure":1.98,
         "industry":"Software"
      },
      {
         "exposure":1.97,
         "industry":"Health Care Equipment \u0026 Supplies"
      },
      {
         "exposure":1.82,
         "industry":"Professional Services"
      },
      {
         "exposure":1.74,
         "industry":"Aerospace \u0026 Defense"
      },
      {
         "exposure":1.72,
         "industry":"Other"
      },
      {
         "exposure":1.6,
         "industry":"IT Services"
      },
      {
         "exposure":1.44,
         "industry":"Multi-Utilities"
      },
      {
         "exposure":1.32,
         "industry":"Hotels, Restaurants \u0026 Leisure"
      },
      {
         "exposure":1.3,
         "industry":"Internet \u0026 Direct Marketing Retail"
      },
      {
         "exposure":1.3,
         "industry":"Industrial Conglomerates"
      },
      {
         "exposure":1.21,
         "industry":"Tobacco"
      },
      {
         "exposure":1.06,
         "industry":"Construction \u0026 Engineering"
      },
      {
         "exposure":1.04,
         "industry":"Food \u0026 Staples Retailing"
      },
      {
         "exposure":1.03,
         "industry":"Household Products"
      },
      {
         "exposure":0.97,
         "industry":"Building Products"
      },
      {
         "exposure":0.85,
         "industry":"Air Freight \u0026 Logistics"
      },
      {
         "exposure":0.8,
         "industry":"Diversified Financial Services"
      },
      {
         "exposure":0.8,
         "industry":"Life Sciences Tools \u0026 Services"
      },
      {
         "exposure":0.79,
         "industry":"Real Estate Management \u0026 Development"
      },
      {
         "exposure":0.73,
         "industry":"Construction Materials"
      },
      {
         "exposure":0.71,
         "industry":"Trading Companies \u0026 Distributors"
      },
      {
         "exposure":0.67,
         "industry":"Specialty Retail"
      },
      {
         "exposure":0.59,
         "industry":"Communications Equipment"
      },
      {
         "exposure":0.57,
         "industry":"Equity Real Estate Investment Trusts (REITs)"
      },
      {
         "exposure":0.54,
         "industry":"Paper \u0026 Forest Products"
      },
      {
         "exposure":0.53,
         "industry":"Auto Components"
      },
      {
         "exposure":0.53,
         "industry":"Wireless Telecommunication Services"
      },
      {
         "exposure":0.53,
         "industry":"Biotechnology"
      },
      {
         "exposure":0.51,
         "industry":"Household Durables"
      },
      {
         "exposure":0.51,
         "industry":"Health Care Providers \u0026 Services"
      },
      {
         "exposure":0.5,
         "industry":"Media"
      },
      {
         "exposure":0.4,
         "industry":"Entertainment"
      },
      {
         "exposure":0.38,
         "industry":"Electronic Equipment, Instruments \u0026 Components"
      },
      {
         "exposure":0.33,
         "industry":"Transportation Infrastructure"
      },
      {
         "exposure":0.26,
         "industry":"Marine"
      },
      {
         "exposure":0.21,
         "industry":"Gas Utilities"
      },
      {
         "exposure":0.21,
         "industry":"Interactive Media \u0026 Services"
      },
      {
         "exposure":0.18,
         "industry":"Commercial Services \u0026 Supplies"
      },
      {
         "exposure":0.17,
         "industry":"Water Utilities"
      },
      {
         "exposure":0.16,
         "industry":"Technology Hardware, Storage \u0026 Peripherals"
      },
      {
         "exposure":0.13,
         "industry":"Multiline Retail"
      },
      {
         "exposure":0.12,
         "industry":"Containers \u0026 Packaging"
      },
      {
         "exposure":0.08,
         "industry":"Independent Power and Renewable Electricity Producers"
      },
      {
         "exposure":0.04,
         "industry":"Airlines"
      },
      {
         "exposure":0.04,
         "industry":"Energy Equipment \u0026 Services"
      }
   ],
   "symbol":"SMEA.L"
}
fbuchinger commented 3 years ago

@Ragas13 thanks for pointing this out! I just tried the symbol/ticker, since this is the only query method provided in the official finnhub python API. Seems not all ticker symbols of a security are linked to the isin...

Another question: is this feature already being implemented for pp or do we need further prototypes?

metal450 commented 3 years ago

I was so excited when the updated was posted with the ability to pull Sector & Country allocations...but...then I reaized that etf-data.com only includes data for European-listed equities, not US. Bummer. Fingers crossed that another API might be added for us US users as well :)

maddhin commented 1 year ago

maybe a stupid question, but. has this feature been implemented? Would be super-awesome!

fbuchinger commented 1 year ago

Hi @maddhin, This feature was implemented and used etf-data.com as a data provider. Unfortunately etf-data.com no longer delivers holding data for funds. I was actually playing with the idea to publish a data repo containing the holding data of ETFs here on github, since most ETF vendors publish csv/excel files with holding information of each ETF on their website. But then I realized that this data is copyright-protected. Furthermore MSCI and FTSE use different classifications for industries and regions that would have to be harmonized.

hmonteiro commented 1 month ago

Sorry for the necropost, but just realised that Yahoo Finance seems to hold the data on the sector weightings/top holdings. Could this be revived to make use of that? Even if it turns out incomplete, it beats having to manually get them for all tickers in the portfolio. Example: https://finance.yahoo.com/quote/IMEA.SW/holdings/

Sn1kk3r5 commented 5 days ago

@hmonteiro

you could make use of this: https://forum.portfolio-performance.info/t/automatic-import-of-classifications/14672/101

metal450 commented 4 days ago

Seems like the forum is broken :( 504 Gateway Time-out every time.

Sn1kk3r5 commented 4 days ago

The forum works properly, must be something on your side

metal450 commented 4 days ago

Looks like it's back up now. It was definitely down yesterday - maybe just a temporary fluke. Thanks! :)