Closed thedanhub closed 1 year ago
This is because all API methods actually return tuples with two elements. In some methods (like your first example) the second element will always be None
and can safely be ignored. But on the second example you're destructuring the tuple, so Python makes it look like there's no tuple at all!
For the first example you want to do this instead:
>>> data, _ = fd.get_company_overview(symbol='IBM')
>>> print(type(data))
<class 'pandas.core.frame.DataFrame'>
Likewise:
>>> data = fd.get_company_overview(symbol='IBM')
>>> print(type(data[0]))
<class 'pandas.core.frame.DataFrame'>
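To see the difference without calling the API, here is a minimal sketch. `fake_get_company_overview` is a hypothetical stand-in (it is not part of alpha_vantage) that returns a two-element tuple as described above, with a plain dict standing in for the DataFrame:

```python
def fake_get_company_overview(symbol):
    """Hypothetical stand-in for an API method: returns (data, meta_data)."""
    data = {"Symbol": symbol}  # stands in for the DataFrame
    return data, None          # the second element is always None here


# Assigning to a single name keeps the whole tuple.
result = fake_get_company_overview(symbol="IBM")
print(type(result).__name__)

# Unpacking into two names splits the tuple, so `data` is the payload itself.
data, _ = fake_get_company_overview(symbol="IBM")
print(type(data).__name__)
```

The call is the same in both cases; only the left-hand side of the assignment decides whether you end up holding the tuple or its first element.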
@ddavness:
In some methods (like your first example) the second element will always be None and can safely be ignored.
This helps for this specific example, thank you! I will use the unnamed throwaway variable name from your first example.
But on the second example you're destructuring the tuple, so Python makes it look like there's no tuple at all!
I think in general this is exactly the point: the developer is not really destructuring anything here; we are just getting a response from the API, which has already destructured the tuple. I think the API should be consistent in its response format: either make the destructuring transparent everywhere, or return a tuple everywhere.
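As a rough illustration of the "return a tuple everywhere" option, the sketch below uses a hypothetical decorator, `returns_pair`, that pads a bare payload with None so every call yields a (data, meta_data) pair. Nothing here comes from alpha_vantage; the decorated functions and their return values are made up for the example:

```python
import functools


def returns_pair(func):
    """Hypothetical decorator: make func always return (data, meta_data)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        if isinstance(result, tuple) and len(result) == 2:
            return result    # already a (data, meta_data) pair
        return result, None  # bare payload: pad with None
    return wrapper


@returns_pair
def bare_call():
    # Made-up method that returns only the payload.
    return {"Symbol": "IBM"}


@returns_pair
def pair_call():
    # Made-up method that already returns a pair.
    return {"Symbol": "IBM"}, {"source": "example"}


data, meta = bare_call()
print(data, meta)
data, meta = pair_call()
print(data, meta)
```

One caveat of this approach: a payload that is itself a two-element tuple would be mistaken for a pair, so a real fix would probably belong inside the methods themselves rather than in a wrapper.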
Calling get_income_statement_annual() from that module with output_format = 'json' returns a Pandas DataFrame.
Running
from alpha_vantage.fundamentaldata import FundamentalData
fd = FundamentalData(key='YOUR_API_KEY', output_format='json')
income_statement, symbol = fd.get_income_statement_annual(symbol='IBM')
print(type(income_statement))
prints
<class 'pandas.core.frame.DataFrame'>
On the plus side, I now know that I have accidentally installed Pandas as a dependency of something else.
If I forcibly remove the Pandas package and try again, I get:
File "/Users/bfoz/.local/share/virtualenvs/project04-bNStgEKY/lib/python3.10/site-packages/alpha_vantage/alphavantage.py", line 250, in _format_wrapper
data_pandas = pandas.DataFrame(data_array, columns=[
NameError: name 'pandas' is not defined
With the caveat that I'm not familiar with the code, the problem appears to be around line 238 of alphavantage.py
if output_format == 'json':
    if isinstance(data, list):
        # If the call returns a list, then we will append them
        # in the resulting data frame. If in the future
        # alphavantage decides to do more with returning arrays
        # this might become buggy. For now will do the trick.
        if not data:
            data_pandas = pandas.DataFrame()
        else:
            data_array = []
            for val in data:
                data_array.append([v for _, v in val.items()])
            data_pandas = pandas.DataFrame(data_array, columns=[
                k for k, _ in data[0].items()])
        return data_pandas, meta_data
    else:
        return data, meta_data
That looks to me like it returns a DataFrame even when output_format is set to json.
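For comparison, here is a minimal sketch of a branch that honors output_format for list responses. `format_records` is a hypothetical helper written for this comment; it is not the library's code and not necessarily what any PR does. The sample record uses placeholder values:

```python
def format_records(records, meta_data, output_format):
    """Hypothetical helper: only build a DataFrame when pandas is requested."""
    if output_format == "json":
        # Hand back the parsed JSON untouched, even when it is a list.
        return records, meta_data
    if output_format == "pandas":
        import pandas  # imported lazily so the json path does not need it

        # A list of dicts becomes one row per dict, with columns taken
        # from the keys; an empty list yields an empty frame.
        return pandas.DataFrame(records), meta_data
    raise ValueError(f"Unsupported output_format: {output_format!r}")


# Placeholder record, not real API output.
reports = [{"fiscalDateEnding": "YYYY-MM-DD", "netIncome": "0"}]
data, meta = format_records(reports, "IBM", "json")
print(type(data).__name__)
```

Because pandas is only imported inside the pandas branch, the json path works even when pandas is not installed, which is the case that raised the NameError above.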
I opened a PR that addresses this at the end of last year; see #329.
It was probably a result of copy-pasting the same thing twice.
I eventually found your PR, but not until after I figured it out myself. I guess I should have looked first.
So, is this project dead then? It hasn't seen any updates in 13 months and your PR is almost 8 months old.
Closing for now. Will reopen if there is further community response. Thanks!
Looking at the code of fundamentaldata.py, it seems that, other than raising an error in case output_format is set to csv, the constructor is ignoring the output format preference altogether and returning the results in tuple format only.
Example FundamentalData (not working)
This returns <class 'tuple'> instead of a DataFrame.
Example TimeSeries (working)
In contrast, the TimeSeries object is working fine, and returns the correct data format:
This correctly returns <class 'pandas.core.frame.DataFrame'>.
Is there a way to make FundamentalData respect the output_format preference like TimeSeries does?