alvarobartt / investpy

Financial Data Extraction from Investing.com with Python
https://investpy.readthedocs.io/
MIT License
1.64k stars, 377 forks

allow different input type when calling data retrieval functions #147

Open alvarobartt opened 4 years ago

alvarobartt commented 4 years ago

As @ymyke reported in #128, some financial products have ambiguous names: several products indexed in investpy's static data share the same name within the same country. As a result, data retrieval can return the wrong product, and some indexed products cannot be retrieved at all.

To solve this, a report will be created describing how well investpy covers these products and how to retrieve complete information for each of them, with no overlaps and no missing data.

This issue proposes letting the user specify, via a parameter (e.g. input_type), the name of the field that, combined with the country (if applicable), identifies the financial product to retrieve.
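
A minimal sketch of how such a parameter could behave. Everything in it is hypothetical: `find_stock`, the `STOCKS` records, and the allowed `input_type` values are made up for illustration and are not investpy's actual API or data.

```python
# Hypothetical sketch: not investpy's actual API or data.
STOCKS = [
    {"name": "Acme", "symbol": "ACM", "isin": "XX0000000001", "country": "spain"},
    {"name": "Acme", "symbol": "ACMB", "isin": "XX0000000002", "country": "spain"},
    {"name": "Bolt", "symbol": "BLT", "isin": "XX0000000003", "country": "germany"},
]

ALLOWED_INPUT_TYPES = ("name", "symbol", "isin")


def find_stock(value, country=None, input_type="symbol", records=STOCKS):
    """Return the single record whose `input_type` field equals `value`."""
    if input_type not in ALLOWED_INPUT_TYPES:
        raise ValueError(f"input_type must be one of {ALLOWED_INPUT_TYPES}")
    # Case-insensitive match on the chosen field, optionally narrowed by country.
    matches = [
        r for r in records
        if r[input_type].lower() == value.lower()
        and (country is None or r["country"] == country.lower())
    ]
    if not matches:
        raise LookupError(f"no stock with {input_type}={value!r}")
    if len(matches) > 1:
        # Refuse to guess when the chosen field is ambiguous.
        raise LookupError(
            f"{len(matches)} stocks share {input_type}={value!r}; "
            "try another input_type"
        )
    return matches[0]
```

With this toy data, `find_stock("ACMB", country="spain")` resolves to one record, while `find_stock("Acme", country="spain", input_type="name")` raises because two records share that name. Failing loudly on an ambiguous match, rather than returning the first hit, is the design choice being illustrated.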

For example, as presented by @ymyke, stocks indexed in investpy have the following coverage table:

# Stocks:
Unambiguous coverage for name: 97.65%
Unambiguous coverage for symbol: 98.05%
Unambiguous coverage for isin: 97.70%

So the symbol, combined with the country, is the most suitable (and the current) parameter for finding a given stock. However, some stocks still share the same symbol within the same country, so the report should aim not only to identify the most suitable input parameter but also to work out how to improve data extraction and add more parameters where needed, as was done for ETFs with the stock_exchange parameter.
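
For reference, a rough sketch of how a coverage figure like the ones above could be computed. It assumes "unambiguous coverage" means the share of records whose (country, field) pair occurs exactly once; the exact method behind the reported numbers is not stated in this issue, and `SAMPLE` is made-up data.

```python
from collections import Counter


def unambiguous_coverage(records, field):
    """Percentage of records whose (country, field) pair occurs exactly once."""
    if not records:
        return 0.0
    counts = Counter((r["country"], r[field]) for r in records)
    unique = sum(1 for r in records if counts[(r["country"], r[field])] == 1)
    return 100.0 * unique / len(records)


# Made-up records, only to exercise the function.
SAMPLE = [
    {"name": "Acme", "symbol": "ACM", "isin": "XX0000000001", "country": "spain"},
    {"name": "Acme", "symbol": "ACMB", "isin": "XX0000000002", "country": "spain"},
    {"name": "Bolt", "symbol": "BLT", "isin": "XX0000000003", "country": "germany"},
    {"name": "Core", "symbol": "COR", "isin": "XX0000000004", "country": "germany"},
]

for field in ("name", "symbol", "isin"):
    pct = unambiguous_coverage(SAMPLE, field)
    print(f"Unambiguous coverage for {field}: {pct:.2f}%")
```

In this sample, two of the four records share a name within the same country, so name coverage is lower than symbol or isin coverage.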

markus080402 commented 4 years ago

At the request of @alvarobartt, the "sub-issue" below is copied here from #116.


Hi @alvarobartt , great, don't work too hard :)

Regarding all "get" functions, for example

investpy.stocks.get_stock_information()
investpy.stocks.get_stock_company_profile()

etc

It would be convenient if they always returned the isin, since it is guaranteed to be unique globally.

However, there seem to be 6424 symbols with the same name:

>>> len(investpy.stocks.get_stocks_list(country=None)) - len(set(investpy.stocks.get_stocks_list(country=None)))
6424

Of course we can include the country as an additional selector, but the isin is quite useful since it is the same on all platforms in the world.

Thanks and have a nice weekend, Markus

markus080402 commented 4 years ago

Hi @alvarobartt

It seems isin is actually not unique in investpy either; there are also 3333 duplicate isin values:

>>> all_stocks = investpy.stocks.get_stocks(country=None)
>>> len(list(all_stocks['isin'])) - len(set(all_stocks['isin']))
3333
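
To see which values repeat rather than only how many, something like the following could help. It is a sketch on made-up ISIN strings; `duplicated_values` and `surplus_count` are hypothetical helpers, not investpy functions. With real data, the input would be a list such as `list(all_stocks['isin'])`.

```python
from collections import Counter


def duplicated_values(values):
    """Map each value that occurs more than once to its number of occurrences."""
    return {v: n for v, n in Counter(values).items() if n > 1}


def surplus_count(values):
    """Number of extra occurrences; equals len(values) - len(set(values))."""
    return sum(n - 1 for n in duplicated_values(values).values())


# Made-up ISIN strings, for illustration only.
isins = [
    "XX0000000001", "XX0000000001",
    "XX0000000002",
    "XX0000000003", "XX0000000003", "XX0000000003",
]

print(duplicated_values(isins))
print(surplus_count(isins), len(isins) - len(set(isins)))
```

Looking at the rows behind each repeated value (for example their country columns) may show whether the duplicates are the same security listed more than once or genuine data errors.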

What is the reason for the duplicate isin values, and can we fix it so there are none?