matthewgilbert / pdblp

pandas wrapper for Bloomberg Open API
MIT License

bdh() gives value error when no data is returned #41

Closed richardtbenade closed 6 years ago

richardtbenade commented 6 years ago
import pdblp

con = pdblp.BCon(debug=False, port=8194)
con.start()

df = con.bdh(tickers=['AAPL US Equity', '1437355D US Equity', '1630887D US Equity'],
             flds=['PX_LAST'],
             start_date='19970101',
             end_date='20180511',
             elms = [('nonTradingDayFillMethod','PREVIOUS_VALUE'),('nonTradingDayFillOption','ALL_CALENDAR_DAYS')],
             ovrds=[],
             longdata=True)
# Failing case: the same request without the active ticker AAPL US Equity
import pdblp

con = pdblp.BCon(debug=False, port=8194)
con.start()

# historical:
df = con.bdh(tickers=['1437355D US Equity', '1630887D US Equity'],
             flds=['PX_LAST'],
             start_date='19970101',
             end_date='20180511',
             elms = [('nonTradingDayFillMethod','PREVIOUS_VALUE'),('nonTradingDayFillOption','ALL_CALENDAR_DAYS')],
             ovrds=[],
             longdata=True)

Problem description

Pulling prices for the Bloomberg tickers in the example (when including the active stock AAPL US Equity) yields a dataframe as expected. However, when I remove AAPL US Equity from the list and rerun, I get the error below. The two remaining tickers are delisted and have no data for this request. Would it be possible to return an empty dataframe instead of raising an error in this case?
Great library though, thanks.

Expected Output

Empty dataframe

Version Information

pdblp==0.1.5

Output

05JUL2018_17:58:49.698 4188:5996 WARN blpapi_platformcontroller.cpp:347 blpapi.session.platformcontroller.{1} Connectivity restored. 
Traceback (most recent call last):
  File "C:/repos/de_003/scratchpad_bloomberg_api_raw.py", line 40, in <module>
    longdata=True)
  File "C:\venv\de_003\lib\site-packages\pdblp\pdblp.py", line 168, in bdh
    df.columns = ["date", "ticker", "field", "value"]
  File "C:\venv\de_003\lib\site-packages\pandas\core\generic.py", line 4385, in __setattr__
    return object.__setattr__(self, name, value)
  File "pandas\_libs\properties.pyx", line 69, in pandas._libs.properties.AxisProperty.__set__
  File "C:\venv\de_003\lib\site-packages\pandas\core\generic.py", line 645, in _set_axis
    self._data.set_axis(axis, labels)
  File "C:\venv\de_003\lib\site-packages\pandas\core\internals.py", line 3323, in set_axis
    'values have {new} elements'.format(old=old_len, new=new_len))
ValueError: Length mismatch: Expected axis has 0 elements, new values have 4 elements

matthewgilbert commented 6 years ago

This is a use case I have never come across. It looks like the response is parsed correctly but the error occurs when the DataFrame is instantiated. I'm marking this as a bug since bdh() should handle this scenario.
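The failure can be reproduced with pandas alone, which may help when reasoning about a fix. The sketch below is not pdblp's code: `rows` and `COLUMNS` are illustrative names, and `rows` stands in for whatever the parser collects when every fieldData[] in the response is empty.

```python
import pandas as pd

# Column labels taken from the traceback line `df.columns = [...]`.
COLUMNS = ["date", "ticker", "field", "value"]

# Hypothetical stand-in for the parsed response when no security has data.
rows = []

# A frame built from no rows has zero columns, so assigning four labels
# afterwards raises the ValueError seen in the report.
try:
    df = pd.DataFrame(rows)
    df.columns = COLUMNS
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Passing the labels to the constructor works for empty and non-empty input
# and yields an empty frame that still has the four expected columns.
empty = pd.DataFrame(rows, columns=COLUMNS)
```

One possible approach, under these assumptions, is to pass the column labels at construction time so the empty case needs no special handling.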

richardtbenade commented 6 years ago

Many thanks. Any rough indication how long such an update will take? Kind regards

Richardt Benade

matthewgilbert commented 6 years ago

The fix is relatively straightforward, so it is mostly a matter of when I will have a chance to work on it. Likely sometime next week.

matthewgilbert commented 6 years ago

Here are some examples further detailing the issue

import pdblp

con = pdblp.BCon(debug=True, port=8194)
con.start()

df = con.bdh(
    tickers=['AAPL US Equity', '1437355D US Equity'],
    flds=['PX_LAST', 'VOLUME'],
    start_date='20180510',
    end_date='20180511',
    longdata=False
)
df
pdblp.pdblp:INFO:Sending Request:
 HistoricalDataRequest = {
    securities[] = {
        "AAPL US Equity", "1437355D US Equity"
    }
    fields[] = {
        "PX_LAST", "VOLUME"
    }
    startDate = "20180510"
    endDate = "20180511"
    overrides[] = {
    }
}

pdblp.pdblp:INFO:Message Received:
 HistoricalDataResponse = {
    securityData = {
        security = "AAPL US Equity"
        eidData[] = {
        }
        sequenceNumber = 0
        fieldExceptions[] = {
        }
        fieldData[] = {
            fieldData = {
                date = 2018-05-10
                PX_LAST = 190.040000
                VOLUME = 27989289.000000
            }
            fieldData = {
                date = 2018-05-11
                PX_LAST = 188.590000
                VOLUME = 26212221.000000
            }
        }
    }
}

pdblp.pdblp:INFO:Message Received:
 HistoricalDataResponse = {
    securityData = {
        security = "1437355D US Equity"
        eidData[] = {
        }
        sequenceNumber = 1
        fieldExceptions[] = {
        }
        fieldData[] = {
        }
    }
}

ticker     AAPL US Equity            
field             PX_LAST      VOLUME
date                                 
2018-05-10         190.04  27989289.0
2018-05-11         188.59  26212221.0
con.debug = False
df = con.bdh(
    tickers=['AAPL US Equity', '1437355D US Equity'],
    flds=['PX_LAST', 'VOLUME'],
    start_date='20180510',
    end_date='20180511',
    longdata=True
)
df
        date          ticker    field        value
0 2018-05-10  AAPL US Equity  PX_LAST       190.04
1 2018-05-10  AAPL US Equity   VOLUME  27989289.00
2 2018-05-11  AAPL US Equity  PX_LAST       188.59
3 2018-05-11  AAPL US Equity   VOLUME  26212221.00

Example of something that stopped trading

df = con.bdh(
    tickers=['XIV US Equity'],
    flds=['PX_LAST'],
    start_date='20180216',
    end_date='20180216',
    longdata=False
)
ValueError: Length mismatch: Expected axis has 0 elements, new values have 4 elements

vs.

df = con.bdh(
    tickers=['XIV US Equity', 'AAPL US Equity'],
    flds=['PX_LAST'],
    start_date='20180214',
    end_date='20180216',
    longdata=False
)
df
ticker     XIV US Equity AAPL US Equity
field            PX_LAST        PX_LAST
date                                   
2018-02-14          5.93         167.37
2018-02-15          6.04         172.99
2018-02-16           NaN         172.43
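
The NaN for XIV US Equity on 2018-02-16 is what a long-to-wide reshape produces when one ticker has no row for a date that another ticker supplies. A minimal pandas sketch with the values above hard-coded (an illustration only, not necessarily how pdblp builds the wide frame):

```python
import pandas as pd

# Long-format rows copied from the output above; XIV has no row for 2018-02-16.
long_df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2018-02-14", "2018-02-15", "2018-02-14", "2018-02-15", "2018-02-16"]
        ),
        "ticker": [
            "XIV US Equity",
            "XIV US Equity",
            "AAPL US Equity",
            "AAPL US Equity",
            "AAPL US Equity",
        ],
        "field": ["PX_LAST"] * 5,
        "value": [5.93, 6.04, 167.37, 172.99, 172.43],
    }
)

# Move ticker and field into the columns; missing (date, ticker, field)
# combinations are filled with NaN.
wide_df = long_df.set_index(["date", "ticker", "field"])["value"].unstack(
    ["ticker", "field"]
)
```

With only XIV US Equity and a date range that has no data, there are no long rows at all, which is the empty case that triggered the error.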

richardtbenade commented 6 years ago

That would be great, thanks Matthew.

matthewgilbert commented 6 years ago

@richardtbenade You can get this from the above commit. This is not yet in a release but installing from GitHub should resolve your issue.

richardtbenade commented 6 years ago

Many thanks @matthewgilbert

richardtbenade commented 6 years ago

@matthewgilbert I've installed from GitHub in a new virtual environment using "pip install git+https://github.com/matthewgilbert/pdblp.git".

This ran successfully (terminal output: "successfully installed numpy-1.14.5 pandas-0.23.3 pdblp-0.1.5 python-dateutil-2.7.3 pytz-2018.5 six-1.11.0").

However, when I run the code I originally posted as an example, I now get:

C:\venv\test2\Scripts\python.exe C:/repos/de_003/scratchpad_bloomberg_api_raw.py
Traceback (most recent call last):
  File "C:/repos/de_003/scratchpad_bloomberg_api_raw.py", line 66, in <module>
    con.start()
  File "C:\venv\test2\lib\site-packages\pdblp\pdblp.py", line 96, in start
    logger = _get_logger(self.debug)
  File "C:\venv\test2\lib\site-packages\pdblp\pdblp.py", line 11, in _get_logger
    if (logger.parent is not None) and logger.parent.hasHandlers():
AttributeError: 'RootLogger' object has no attribute 'hasHandlers'

Please advise?

matthewgilbert commented 6 years ago

It appears the logging module in Python 2.7 does not have a hasHandlers() method (it was added in Python 3.2). This should work if you upgrade to Python 3.x, and the plan going forward for pdblp is to only support 3.x versions of Python.
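
For anyone who has to stay on Python 2.7 for now, a check along these lines could stand in for hasHandlers(). This is only a sketch: `has_handlers` is a hypothetical helper, not part of pdblp, and the fallback mimics the documented behaviour by walking up the logger hierarchy.

```python
import logging


def has_handlers(logger):
    """Return True if the logger, or an ancestor it propagates to, has a handler."""
    # Logger.hasHandlers() exists on Python 3.2+; use it when available.
    if hasattr(logger, "hasHandlers"):
        return logger.hasHandlers()
    # Fallback for older interpreters: walk up the parent chain manually.
    current = logger
    while current is not None:
        if current.handlers:
            return True
        if not current.propagate:
            break
        current = current.parent
    return False
```

Upgrading to Python 3.x remains the simpler route, since that is the only line pdblp plans to support.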

richardtbenade commented 6 years ago

Thanks Matthew

djusteiner commented 5 years ago

Hi Matthew - first of all thanks a lot for your work, I think it's awesome and very useful.

I was wondering if it would be possible - along the lines of what you've done above for de-listed securities - to have an option when doing a bdh query across multiple tickers that fills NaN for any errors rather than stopping the whole query?

I'm running a query across a large set of securities where a few are invalid. This stops the query as a whole, and I'd love to get NaN for those securities rather than have the whole query fail. The way I have worked around it at the moment is by looping over each security with an error handler, but this is considerably slower.
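
The workaround described here can be sketched roughly as follows. `bdh_skip_invalid` and `LONG_COLUMNS` are hypothetical names, `con` is assumed to be a started pdblp.BCon, and catching ValueError by default is an assumption about what an invalid security raises, so the caught exception types are left configurable.

```python
import pandas as pd

# Long-format column labels as shown in the bdh(longdata=True) output.
LONG_COLUMNS = ["date", "ticker", "field", "value"]


def bdh_skip_invalid(con, tickers, flds, start_date, end_date, errors=(ValueError,)):
    """Query one ticker at a time and collect the tickers that fail.

    `errors` is an assumption about which exceptions signal a bad security.
    Returns (long DataFrame of successful results, list of failed tickers).
    """
    frames, failed = [], []
    for ticker in tickers:
        try:
            frames.append(
                con.bdh(
                    tickers=[ticker],
                    flds=flds,
                    start_date=start_date,
                    end_date=end_date,
                    longdata=True,
                )
            )
        except errors:
            failed.append(ticker)
    if not frames:
        # Nothing succeeded: return an empty frame with the expected columns.
        return pd.DataFrame(columns=LONG_COLUMNS), failed
    return pd.concat(frames, ignore_index=True), failed
```

Each ticker costs a separate request, which is why this pattern is slower than a single bdh call over the whole list.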

matthewgilbert commented 5 years ago

@djusteiner I've generally avoided filling in bad user data with NaNs since I have not found a compelling use case.

https://github.com/matthewgilbert/pdblp/issues/31 has discussed a similar use case for when such a need would arise. But as the reporter remarks, this can be handled by first making a query to get the valid instrument list.

https://github.com/matthewgilbert/pdblp/issues/13 discusses the rationale for when fields will return NaN vs when they will raise an error.

I have avoided blanket handling of bad user data by always returning NaNs because it starts to muddle the interpretation of what NaN represents.

What is the use case you have in mind? In the use cases I have come across where the ticker list is unknown, such as futures or options, you can usually first query for the list of instruments (e.g. con.bulkref('CL1 Comdty', 'FUT_CHAIN')) and then use that list to query appropriately.
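
A rough sketch of that two-step pattern is below. It rests on assumptions that should be checked against the actual bulkref output: that the result is a long DataFrame with `name` and `value` columns, and that chain members appear under the name "Security Description". `chain_tickers` is a hypothetical helper, not a pdblp function.

```python
import pandas as pd


def chain_tickers(chain, name="Security Description"):
    """Pull instrument identifiers out of a bulk reference result.

    Assumes `chain` has `name` and `value` columns; both the column names
    and the default `name` label are assumptions, so inspect the real
    output of con.bulkref(...) before relying on them.
    """
    mask = chain["name"] == name
    return chain.loc[mask, "value"].tolist()


# Intended usage with a started connection (not run here):
# chain = con.bulkref('CL1 Comdty', 'FUT_CHAIN')
# df = con.bdh(chain_tickers(chain), ['PX_LAST'], '20180510', '20180511')
```

Querying only instruments that the first call returned keeps invalid tickers out of the bdh request, at the cost of one extra request.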

djusteiner commented 5 years ago

@matthewgilbert my use case is securities that are not necessarily set up in Bloomberg, or cases where I don't know beforehand whether the ISIN should be followed by Corp or Mtge. The former happens, for instance, with some claims on a famous bank that defaulted in October 2008: you can find the ISINs through a simple Google search, but nothing is set up in Bloomberg. #31 can indeed work; however, as you mentioned, it is not the best from a data cap point of view.

matthewgilbert commented 5 years ago

@djusteiner Interesting, this is a use case I have not seen before. It would be helpful to open a new issue describing the problem and use case, also maybe add a minimal example. Not sure what the best way to handle this would be but seeing a clear example would be helpful.

djusteiner commented 5 years ago

@matthewgilbert sure will do. Thanks a lot for your quick responsiveness in any case.