RomelTorres / alpha_vantage

A python wrapper for Alpha Vantage API for financial data.
MIT License

fundamentaldata.py ignoring output_format? #328

Closed thedanhub closed 1 year ago

thedanhub commented 3 years ago

Looking at the code of fundamentaldata.py, it seems that, apart from raising an error when output_format is set to csv, the constructor ignores the output_format preference altogether and returns the results only as a tuple.

Example FundamentalData (not working)

from alpha_vantage.fundamentaldata import FundamentalData
fd = FundamentalData(key='YOUR_API_KEY', output_format='pandas')
data = fd.get_company_overview(symbol='IBM')
print(type(data))

This returns <class 'tuple'> instead of a DataFrame.

Example TimeSeries (working)

In contrast, the TimeSeries object is working fine, and returns the correct data format:

from alpha_vantage.timeseries import TimeSeries
ts = TimeSeries(key='YOUR_API_KEY', output_format='pandas', indexing_type='date')
data, meta_data = ts.get_daily(symbol='IBM', outputsize='full')
print(type(data))

This correctly returns <class 'pandas.core.frame.DataFrame'>.

Is there a way to make FundamentalData respect the output_format preference like TimeSeries does?

ddavness commented 2 years ago

This is because all APIs actually return tuples with two elements. In some methods (like your first example) the second element will always be None and can safely be ignored. But on the second example you're destructuring the tuple, so Python makes it look like there's no tuple at all!

For the first example you want to do this instead:

>>> data, _ = fd.get_company_overview(symbol='IBM')
>>> print(type(data))
<class 'pandas.core.frame.DataFrame'>

Likewise:

>>> data = fd.get_company_overview(symbol='IBM')
>>> print(type(data[0]))
<class 'pandas.core.frame.DataFrame'>

thedanhub commented 2 years ago

@ddavness:

"In some methods (like your first example) the second element will always be None and can safely be ignored."

This helps for this specific example, thank you! I will use the unnamed throwaway variable name from your first example.

"But on the second example you're destructuring the tuple, so Python makes it look like there's no tuple at all!"

I think in general this is exactly the point: the developer here is not really destructuring anything; we are just getting a response from the API, which has already destructured the tuple. I think the API should be consistent in its response format: either make the destructuring transparent everywhere, or return a tuple everywhere.
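For readers following along, the behaviour being debated can be reproduced without the library or an API key. The sketch below uses a hypothetical stand-in, fake_get_company_overview, which is not part of alpha_vantage and returns made-up data; it only assumes, as described above, that the method returns a two-element tuple whose second element is None.

```python
def fake_get_company_overview(symbol):
    """Hypothetical stand-in for a wrapper method that returns (data, meta_data)."""
    data = {"Symbol": symbol, "Name": "Example Corp"}  # illustrative values only
    return data, None  # second element is None, as described in the thread


# Assigning to a single name keeps the whole two-element tuple.
result = fake_get_company_overview("IBM")
print(type(result))  # <class 'tuple'>

# Unpacking on the caller's side yields the first element directly.
data, _ = fake_get_company_overview("IBM")
print(type(data))  # <class 'dict'>
```

In both calls the function returns the same kind of object; only the assignment on the caller's side differs.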

bfoz commented 2 years ago

Calling get_income_statement_annual() from that module with output_format = 'json' returns a Pandas dataframe.

Running

from alpha_vantage.fundamentaldata import FundamentalData
fd = FundamentalData(key='YOUR_API_KEY', output_format='json')
income_statement, symbol = fd.get_income_statement_annual(symbol='IBM')
print(type(income_statement))

prints

<class 'pandas.core.frame.DataFrame'>

On the plus side, I now know that I have accidentally installed Pandas as a dependency of something else.

bfoz commented 2 years ago

If I forcibly remove the Pandas package and try again, I get:

  File "/Users/bfoz/.local/share/virtualenvs/project04-bNStgEKY/lib/python3.10/site-packages/alpha_vantage/alphavantage.py", line 250, in _format_wrapper
    data_pandas = pandas.DataFrame(data_array, columns=[
NameError: name 'pandas' is not defined

With the caveat that I'm not familiar with the code, the problem appears to be around line 238 of alphavantage.py

                if output_format == 'json':
                    if isinstance(data, list):
                        # If the call returns a list, then we will append them
                        # in the resulting data frame. If in the future
                        # alphavantage decides to do more with returning arrays
                        # this might become buggy. For now will do the trick.
                        if not data:
                            data_pandas = pandas.DataFrame()
                        else:
                            data_array = []
                            for val in data:
                                data_array.append([v for _, v in val.items()])
                            data_pandas = pandas.DataFrame(data_array, columns=[
                                k for k, _ in data[0].items()])
                        return data_pandas, meta_data
                    else:
                        return data, meta_data

That looks to me like it returns a DataFrame even when output_format is set to json.
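One way such a branch could be arranged so that the 'json' path never touches pandas is sketched below. This is an illustration under assumptions only: format_response is a hypothetical helper, not a function in alpha_vantage, and it is not claimed to match any change proposed for the library. The sample rows are made up.

```python
def format_response(data, meta_data, output_format):
    """Hypothetical helper: shape (data, meta_data) according to output_format."""
    fmt = output_format.lower()
    if fmt == "json":
        # Return the parsed JSON untouched; pandas is not needed on this path.
        return data, meta_data
    if fmt == "pandas":
        import pandas  # imported lazily, only when a DataFrame is requested

        # A list of dicts becomes one row per dict; a single dict becomes one row.
        records = data if isinstance(data, list) else [data]
        return pandas.DataFrame(records), meta_data
    raise ValueError(f"unsupported output_format: {output_format!r}")


# Illustrative input resembling a list of report rows (values are invented).
rows = [{"fiscalDateEnding": "2021-12-31", "totalRevenue": "100"}]
result, meta = format_response(rows, None, "json")
print(type(result))  # <class 'list'>
```

Importing pandas inside the 'pandas' branch means a missing pandas installation would surface as an ImportError only when a DataFrame is actually requested.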

ddavness commented 2 years ago

I opened a PR that addresses this at the end of last year; see #329.

It was probably a result of copy-pasting the same thing twice.

bfoz commented 2 years ago

I eventually found your PR, but not until after I figured it out myself. I guess I should have looked first.

So, is this project dead then? It hasn't seen any updates in 13 months and your PR is almost 8 months old.

AlphaVantageSupport commented 1 year ago

Closing for now. Will reopen if there is further community response. Thanks!