Rblp / Rblpapi

R package interfacing the Bloomberg API from https://www.bloomberglabs.com/api/

bdh function shuffles names for multiple securities #101

Closed pgarnry closed 8 years ago

pgarnry commented 8 years ago

First of all, I would like to say that as a former user of the Rbbg package, the new Rblpapi package is the best thing that has happened in years to the financial community working in R. Great thanks to the hardworking team behind this package and especially eddelbuettel and armstrtw!

Now to the issue at hand...

I found a strange bug which I believe may happen inside the "bdh_Impl" function called within the bdh function. The order of the bdh output list does not always match the order of the securities vector. See the examples below.

tickers <- c("000830 KS Equity", "003550 KS Equity", "005490 KS Equity",
             "005930 KS Equity", "006400 KS Equity", "0111145D US Equity",
             "0132533D CN Equity", "015760 KS Equity")

option.fields <- c("periodicitySelection", "nonTradingDayFillOption",
                   "nonTradingDayFillMethod", "periodicityAdjustment",
                   "adjustmentFollowDPDF", "pricingOption", "currency")

option.values <- c("MONTHLY", "NON_TRADING_WEEKDAYS",
                   "NIL_VALUE", "CALENDAR", "TRUE", "PRICING_OPTION_PRICE", "USD")

bbg.options <- structure(option.values, names = option.fields)

out1 <- bdh(securities = tickers,
            fields = "TOT_RETURN_INDEX_GROSS_DVDS",
            start.date = as.Date("1996-01-01"),
            end.date = "",
            options = bbg.options,
            overrides = NULL)

tickers
names(out1)
# the order of the two character vectors matches

The screenshot below shows that the order of the two vectors is the same

image

Now if we add another ticker the bdh shuffles the output so the order is no longer the same.

new.tickers <- c("000830 KS Equity", "003550 KS Equity", "005490 KS Equity",
                 "005930 KS Equity", "006400 KS Equity", "0111145D US Equity",
                 "0132533D CN Equity", "015760 KS Equity", "0202445Q US Equity")

option.fields <- c("periodicitySelection", "nonTradingDayFillOption",
                   "nonTradingDayFillMethod", "periodicityAdjustment",
                   "adjustmentFollowDPDF", "pricingOption", "currency")

option.values <- c("MONTHLY", "NON_TRADING_WEEKDAYS",
                   "NIL_VALUE", "CALENDAR", "TRUE", "PRICING_OPTION_PRICE", "USD")

bbg.options <- structure(option.values, names = option.fields)

out2 <- bdh(securities = new.tickers,
            fields = "TOT_RETURN_INDEX_GROSS_DVDS",
            start.date = as.Date("1996-01-01"),
            end.date = "",
            options = bbg.options,
            overrides = NULL)

new.tickers
names(out2)
# the order of the two character vectors no longer matches

As the screenshot below shows, the last ticker added is suddenly the first element in the list output from Bloomberg.

image

Is this unintended or a deliberate output choice?

The issue can easily be fixed after the bdh call by matching one's original ticker vector against the names of the output list. This ensures the data ends up where it belongs if one wants, as I do, to reshape the Bloomberg output list into another data structure that better suits one's analysis.
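The matching step described above might look roughly like the sketch below. It does not call Bloomberg: `mock_out` is a hypothetical stand-in for a bdh result, assumed here to be a named list with one data frame per security, and `requested` is an illustrative ticker vector.

```r
# Hypothetical stand-in for a bdh() result: a named list with one data frame
# per security, returned in a different order than requested.
requested <- c("000830 KS Equity", "003550 KS Equity", "0202445Q US Equity")

mock_out <- list(
  "0202445Q US Equity" = data.frame(date = as.Date("1996-01-31"), value = 3),
  "000830 KS Equity"   = data.frame(date = as.Date("1996-01-31"), value = 1),
  "003550 KS Equity"   = data.frame(date = as.Date("1996-01-31"), value = 2)
)

# match() gives, for each requested ticker, its position in the output names.
idx <- match(requested, names(mock_out))

# A ticker absent from the response would show up as NA; fail loudly instead
# of silently misaligning the data.
stopifnot(!anyNA(idx))

# Subsetting the list by those positions restores the requested order.
reordered <- mock_out[idx]
identical(names(reordered), requested)
```

With a real query, the same two lines (`match()` followed by list subsetting) would be applied to the list returned by bdh and the securities vector passed to it.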

If the authors can easily fix this, it would be much appreciated. If not, I can live without the fix, and this thread will simply serve as help for future users of Rblpapi who run into the same issue, which they likely will.

armstrtw commented 8 years ago

The out of order response is simply the order in which bbg sends the response.

It would most likely be a simple fix to re-order the data, but at a speed cost which is unlikely to be important to most people, but potentially important to some...

pgarnry commented 8 years ago

It is just strange because the former Rbbg package did not shuffle the output order. So either Bloomberg changed something in their API or something changed in the Rblpapi.

But I agree with your point that a fix may only help a few users, which is also why it can simply be handled by the user after the Bloomberg query.

However, I still think it is useful to flag this as an issue, as I doubt many users of Rblpapi would expect the order of the output list to be different from the order of the securities vector. But now they know.

You can close this thread if you want...

johnlaing commented 8 years ago

Nothing has changed in the Bloomberg API, the difference is in how the packages process the result. Rblpapi constructs its value "on the fly" as data comes back, where Rbbg builds a container first and then populates data into it. Part of building the container before any data has actually come back is taking a view on how it should be structured, including the notion that securities should be returned in the order they are passed. So to be clear: this is a nicety invented and enforced by Rbbg, completely distinct from the Bloomberg API itself.
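The difference between the two approaches can be illustrated with a toy sketch in plain R. This is only an analogy under stated assumptions, not the actual code of either package: `arrivals` is a made-up list simulating responses that come back in server order, and the tickers are placeholders.

```r
# Simulated responses arriving in an order chosen by the server, not the caller.
requested <- c("A Equity", "B Equity", "C Equity")
arrivals <- list(
  list(security = "C Equity", data = 3),
  list(security = "A Equity", data = 1),
  list(security = "B Equity", data = 2)
)

# Strategy 1: build the result on the fly, adding each response as it arrives.
on_the_fly <- list()
for (msg in arrivals) {
  on_the_fly[[msg$security]] <- msg$data
}

# Strategy 2: allocate a named container in request order first, then fill it.
prebuilt <- setNames(vector("list", length(requested)), requested)
for (msg in arrivals) {
  prebuilt[[msg$security]] <- msg$data
}

names(on_the_fly)  # follows arrival order
names(prebuilt)    # follows request order
```

The second strategy fixes the output order before any data arrives, which is the "nicety" described above; the first simply reflects whatever order the responses come in.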

That said, it's still nice, and it would be just as nice if Rblpapi did the same, assuming it could be accomplished without a substantial speed hit. I think this is a legitimate issue and vote for leaving it open.

pgarnry commented 8 years ago

Thanks for the reply John.

Then it makes sense why there has been a significant speed increase in Rblpapi on bdh function calls for a long vector of tickers.

eddelbuettel commented 8 years ago

I fail to see how this is different from #73, where @pgarnry already complained loud and clear that the output was not to his liking.

It is the same here. It is still open source. If you don't like it, change it, or pay someone to change it.

I suggest to close this. The code works as designed. Putting a 'resorter' over it (even after the fact) should be fairly trivial.

pgarnry commented 8 years ago

eddelbuettel...I feel a negative tone here. Correct me if I am wrong. If so, I have clearly not been precise in my comment on #73

I wrote that Nick's issue can indeed be solved as you say; it is just that calling bdh in each step of a loop and solving the issue at the security level is much slower than getting the whole output from one bdh call on the whole vector of tickers and then manipulating the data to one's liking afterwards.

And it is the same here. I was surprised to see the shuffle of output order, because it did not happen in the former Rbbg package, as confirmed by John, which this package was meant to succeed. It was by accident that I could see the output order was different. Others might be wondering about this shuffling, especially if you do cross-sectional equity research as I do. So this thread is meant as a discussion to reveal the cause, which John just did, and to illuminate the issue for other users.

I actually do not care that much about the output from bds, bdh, bdp etc., because it can easily be manipulated afterwards into the structure one desires.

Just to be clear: I am not here to criticize. In fact, I think you guys have done an excellent job on this package, as I also stated at the beginning of this thread. Keep up the good work!

eddelbuettel commented 8 years ago

> can indeed be solved as you say, just that doing a bdh function call in each step of a loop

What I am really suggesting is to add a loop layer when the request is made. Looping at the R level is a very poor substitute, and one that only you suggested.

> I actually do not care that much about the output from the bds, bdh and bdp etc. because it can easily be manipulated afterwards to the structure one desire.

Good. So no bug here.