matthewgilbert / pdblp

pandas wrapper for Bloomberg Open API
MIT License

ReferenceDataRequest with various return sizes #35

Closed matthewgilbert closed 6 years ago

matthewgilbert commented 6 years ago

Related to https://github.com/matthewgilbert/pdblp/issues/5: when an array of reference data is returned, it is ambiguous which values are grouped together.

import pdblp

con = pdblp.BCon(debug=True)
con.start()
df = con.ref("BCOM Index", "INDX_MWEIGHT_HIST")
DEBUG:root:Sending Request:
 ReferenceDataRequest = {
    securities[] = {
        "BCOM Index"
    }
    fields[] = {
        "INDX_MWEIGHT_HIST"
    }
    overrides[] = {
    }
}

DEBUG:root:Message Received:
 ReferenceDataResponse = {
    securityData[] = {
        securityData = {
            security = "BCOM Index"
            eidData[] = {
            }
            fieldExceptions[] = {
            }
            sequenceNumber = 0
            fieldData = {
                INDX_MWEIGHT_HIST[] = {
                    INDX_MWEIGHT_HIST = {
                        Index Member = "BON8"
                        Percent Weight = 2.430000
                    }
                    INDX_MWEIGHT_HIST = {
                        Index Member = "C N8"
                        Percent Weight = 6.680000
                    }
                    INDX_MWEIGHT_HIST = {
                        Index Member = "CLN8"
                        Percent Weight = 7.600000
                    }
                [omitted]
                }
            }
        }
    }
}
df.head()
       ticker                             field value
0  BCOM Index    INDX_MWEIGHT_HIST:Index Member  BON8
1  BCOM Index  INDX_MWEIGHT_HIST:Percent Weight  2.43
2  BCOM Index    INDX_MWEIGHT_HIST:Index Member  C N8
3  BCOM Index  INDX_MWEIGHT_HIST:Percent Weight  6.68
4  BCOM Index    INDX_MWEIGHT_HIST:Index Member  CLN8

These values are implicitly associated by the order in which they are returned, but it would be better to make the grouping explicit.
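As a rough illustration of what "explicit" could look like on the caller's side, the sketch below rebuilds the rows shown above by hand (so no Bloomberg connection is needed) and numbers each repeated sub-field by order of appearance. The `occurrence` column is a hypothetical helper, not something pdblp returns, and the pairing rests on the assumption that the n-th "Index Member" row belongs with the n-th "Percent Weight" row.

```python
import pandas as pd

# Hand-built stand-in for the df.head() output above.
df = pd.DataFrame({
    "ticker": ["BCOM Index"] * 5,
    "field": [
        "INDX_MWEIGHT_HIST:Index Member",
        "INDX_MWEIGHT_HIST:Percent Weight",
        "INDX_MWEIGHT_HIST:Index Member",
        "INDX_MWEIGHT_HIST:Percent Weight",
        "INDX_MWEIGHT_HIST:Index Member",
    ],
    "value": ["BON8", 2.43, "C N8", 6.68, "CLN8"],
})

# Assumption: the n-th occurrence of each sub-field comes from the
# n-th array element, so counting occurrences recovers the grouping.
df["occurrence"] = df.groupby(["ticker", "field"]).cumcount()

# One row per array element, one column per sub-field.
wide = df.pivot(index="occurrence", columns="field", values="value")
print(wide)
```

This only works while every array element reports the same sub-fields in the same order; if an element omitted one, the counts would drift and the pairing would silently be wrong, which is the reason to prefer an explicit index from the parser.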

One complication for implementing a coherent interface is that currently both array data and singleton data can be returned from the same call to ref(), e.g.

con.debug = False
con.ref("BCOM Index", ["INDX_MWEIGHT_HIST", "PERCENT_WEIGHT"]).tail()
        ticker                             field value
40  BCOM Index    INDX_MWEIGHT_HIST:Index Member  W N8
41  BCOM Index  INDX_MWEIGHT_HIST:Percent Weight  3.88
42  BCOM Index    INDX_MWEIGHT_HIST:Index Member  XBN8
43  BCOM Index  INDX_MWEIGHT_HIST:Percent Weight  4.23
44  BCOM Index                    PERCENT_WEIGHT   100
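To make the mixing concrete, here is a small sketch that separates the two kinds of rows in a hand-built copy of the tail shown above. It relies on a heuristic taken only from this output, namely that array rows carry a "FIELD:sub-field" label containing a colon while singleton rows do not; that is an assumption, not a documented guarantee.

```python
import pandas as pd

# Hand-built stand-in for the .tail() output above.
res = pd.DataFrame({
    "ticker": ["BCOM Index"] * 5,
    "field": [
        "INDX_MWEIGHT_HIST:Index Member",
        "INDX_MWEIGHT_HIST:Percent Weight",
        "INDX_MWEIGHT_HIST:Index Member",
        "INDX_MWEIGHT_HIST:Percent Weight",
        "PERCENT_WEIGHT",
    ],
    "value": ["W N8", 3.88, "XBN8", 4.23, 100],
})

# Heuristic (assumption): a colon in the field label marks array data.
is_array = res["field"].str.contains(":", regex=False)

array_rows = res[is_array]
singleton_rows = res[~is_array]
print(singleton_rows)
```

Having to guess the row type from a string label like this is part of why a single return format for both cases is awkward.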

matthewgilbert commented 6 years ago

To handle this coherently, ReferenceDataRequests are now split across two methods: ref() for requests that return singleton data and bulkref() for requests that return array data. Examples of each:

import pdblp
con = pdblp.BCon()
con.start()
df = con.bulkref('BCOM Index', 'INDX_MWEIGHT')
df.head()
       ticker         field                             name value  position
0  BCOM Index  INDX_MWEIGHT  Member Ticker and Exchange Code  BON8         0
1  BCOM Index  INDX_MWEIGHT                Percentage Weight  2.41         0
2  BCOM Index  INDX_MWEIGHT  Member Ticker and Exchange Code  C N8         1
3  BCOM Index  INDX_MWEIGHT                Percentage Weight  6.56         1
4  BCOM Index  INDX_MWEIGHT  Member Ticker and Exchange Code  CLN8         2
df2 = con.ref("CL1 Comdty", ["FUT_GEN_MONTH"])
df2
       ticker          field         value
0  CL1 Comdty  FUT_GEN_MONTH  FGHJKMNQUVXZ

Here the position column records the index of the array element each row was parsed from, so rows that share a position belong to the same element.
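For example, position makes it straightforward to reshape the long output into one row per index member. The sketch below uses a hand-built copy of the df.head() rows above rather than a live connection; the mixed string/float value column is an assumption about the dtypes, which may differ in the real output.

```python
import pandas as pd

# Hand-built stand-in for the bulkref df.head() output above.
bulk = pd.DataFrame({
    "ticker": ["BCOM Index"] * 5,
    "field": ["INDX_MWEIGHT"] * 5,
    "name": [
        "Member Ticker and Exchange Code",
        "Percentage Weight",
        "Member Ticker and Exchange Code",
        "Percentage Weight",
        "Member Ticker and Exchange Code",
    ],
    "value": ["BON8", 2.41, "C N8", 6.56, "CLN8"],
    "position": [0, 0, 1, 1, 2],
})

# Rows sharing a position come from the same array element, so
# pivoting on position yields one row per member.
members = bulk.pivot(index="position", columns="name", values="value")
print(members)
```

With more than one ticker or bulk field in the same result, position alone would not be unique, so the pivot would need to be done per ticker and field (for instance via a groupby) rather than on the whole frame.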