Closed avantgardeam closed 1 year ago
Hi Everyone,
Awesome work! This has been light years better than the stock Python API or sandboxed excel add-in.
I'm still very new to both using BBG and Python, and have no one to talk to about this niche within a niche.
If one of you could lightly explain the differences between the blp and xbbg wrappers that would be much appreciated.
Looking forward to trying out bql.
Thanks again.
Hello,
Thanks for the enthusiasm!
@matthewgilbert can better elaborate on this, but in short, I would say that blp's architecture is more robust and its error handling is clearer than xbbg's. However, if you're used to Excel, you might find xbbg's syntax more familiar. By the way, for questions like these, it's better to open an Issue rather than commenting on a PR, especially when the question isn't directly related to the PR.
Regarding BQL support, the PR hasn't been merged into blp's main branch yet. It'd be best to wait for that before diving into this functionality.
Keep in mind, this is an undocumented API, so we can't make any hard promises about its behavior. Therefore, it's a good idea to stick with the usual methods (bdp, bdh, etc.) when you can.
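As a rough illustration of the ergonomic difference, here is a side-by-side sketch. The call signatures are taken from each library's README and may drift; both wrappers need a live Bloomberg Terminal session, so the actual calls are gated off here.

```python
# Illustrative sketch only: both wrappers require a running Bloomberg
# Terminal, so the actual calls are gated behind a flag.
BLOOMBERG_AVAILABLE = False  # flip on a machine with a Terminal session

if BLOOMBERG_AVAILABLE:
    # blp: explicit session lifecycle, pandas DataFrames back
    from blp import blp
    bquery = blp.BlpQuery().start()
    print(bquery.bdp(["AAPL US Equity"], ["PX_LAST"]))

    # xbbg: module-level functions that mirror Excel's BDP/BDH
    from xbbg import blp as xbbg_blp
    print(xbbg_blp.bdp(tickers="AAPL US Equity", flds=["PX_LAST"]))
```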
Thanks @avantgardeam, will try to look at this over the next week and leave feedback.
Hello, @matthewgilbert! Have you had any time to check this?
Thanks!
Apologies for the delay. Mostly looks good, just left a few comments
Thanks a lot for the review!
A somewhat unrelated question: don't you think it would be better to hide the methods that collect data? _collect_to_bdh instead of collect_to_bdh, for example.
Yes, that probably makes sense. I can't recall if I originally had a use case for calling these independently, but if I did I can't recall it now.
Merged, thanks @avantgardeam!
One question looking at the example data returned:
security field secondary_name secondary_value value
0 AAPL US Equity px_last() CURRENCY USD 192.755005
1 IBM US Equity px_last() CURRENCY USD 139.289993
2 AAPL US Equity px_last() DATE 2023-07-24T00:00:00Z 192.755005
3 IBM US Equity px_last() DATE 2023-07-24T00:00:00Z 139.289993
It seems like this causes some duplication of data? Do you have an example of what the raw responses passed into collect_to_bql look like?
This is the response for the query in the docstring:
{
'field': 'px_last()',
'id': ['AAPL US Equity', 'IBM US Equity'],
'value': [193.22, 142.97],
'DATE': ['2023-07-28T00:00:00Z', '2023-07-28T00:00:00Z'],
'CURRENCY': ['USD', 'USD']
}
One could argue that this data would be presented more efficiently by simply turning each list into a column. However, that alternative has significant drawbacks. For instance, consider the query below:
from blp import blp

bquery = blp.BlpQuery().start()
bquery.bql(expression="get(px_to_book_ratio(as_of_date=range(2021-01-01, 2023-06-30, frq=CY)), gics_sector_name()) for('AAPL US Equity')")
The resulting response from the parser (which yields a field at a time) looks like this:
{
'field': 'px_to_book_ratio(as_of_date=range(2021-01-01,2023-06-30,frq=CY))',
'id': ['AAPL US Equity', 'AAPL US Equity', 'AAPL US Equity', 'AAPL US Equity'],
'value': ['NaN', 46.234021081312406, 'NaN', 49.06639630972683],
'REVISION_DATE': ['2020-10-30T00:00:00Z', '2021-10-29T00:00:00Z', '2022-10-28T00:00:00Z', '2023-05-05T00:00:00Z'],
'AS_OF_DATE': ['2021-01-01T00:00:00Z', '2021-12-31T00:00:00Z', '2022-12-31T00:00:00Z', '2023-06-30T00:00:00Z'],
'PERIOD_END_DATE': ['2020-09-26T00:00:00Z', '2021-09-25T00:00:00Z', '2022-09-24T00:00:00Z', '2023-04-01T00:00:00Z']
}
{
'field': 'gics_sector_name()',
'id': ['AAPL US Equity'],
'value': ['Information Technology']
}
Here, two fields return notably different responses. To my mind, there isn't a straightforward way to deliver this data to the user without converting these two tables from wide to long format to standardize their structure.
The crux of this explanation is that the duplicates are a necessary side effect of pivoting the tables. Admittedly, this is a substantial disadvantage. Nonetheless, since this is an undocumented API, I've yet to find a better way to deliver the data that ensures no information is lost.
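To make the pivoting concrete, here is a minimal sketch of the wide-to-long transformation using a hypothetical to_long helper; the actual collect_to_bql implementation may differ in details and column order.

```python
import pandas as pd

# Hypothetical sketch of the wide-to-long pivot described above; the real
# collect_to_bql in blp may differ.
def to_long(parsed: dict) -> pd.DataFrame:
    base = pd.DataFrame(
        {
            "security": parsed["id"],
            "field": parsed["field"],  # scalar, broadcast across rows
            "value": parsed["value"],
        }
    )
    secondary = [k for k in parsed if k not in {"field", "id", "value"}]
    if not secondary:
        return base.assign(secondary_name=None, secondary_value=None)
    frames = []
    for name in secondary:
        # one copy of the primary rows per secondary column -- this is
        # where the duplication of `value` comes from
        frames.append(base.assign(secondary_name=name, secondary_value=parsed[name]))
    return pd.concat(frames, ignore_index=True)

response = {
    "field": "px_last()",
    "id": ["AAPL US Equity", "IBM US Equity"],
    "value": [193.22, 142.97],
    "DATE": ["2023-07-28T00:00:00Z", "2023-07-28T00:00:00Z"],
    "CURRENCY": ["USD", "USD"],
}
long_df = to_long(response)
print(long_df)  # 4 rows: 2 securities x 2 secondary columns
```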
If you have any ideas, I would be happy to implement them!
A somewhat unrelated question: don't you think it would be better to hide the methods that collect data? _collect_to_bdh instead of collect_to_bdh, for example.

Yes, that probably makes sense. I can't recall if I originally had a use case for calling these independently, but if I did I can't recall it now.
What are your thoughts on a simple PR to hide these methods?
If you have any ideas, I would be happy to implement them!
I don't have a strong opinion since I'm not actively using this feature (or the library more generally). It strikes me that this problem stems from the fact that the data is inherently non-tabular, so I think your approach seems fine.
What are your thoughts on a simple PR to hide these methods?
I don't have much bandwidth for maintenance these days, so I don't have much appetite for minor cleanups like this; I don't want people to waste their time submitting PRs that I don't end up reviewing.
I understand! Thanks a lot for reviewing this PR.
Is it possible to release a new version containing the changes on PyPI?
@gncosta this should be released now.
Thanks!
Hello!
In line with our discussions on issue #18, I've opened this PR that brings in BQL query support.
I've had to tweak BlpParser due to the unique nature of this endpoint, adding a few new methods to handle its different behavior.
Since there's no formal documentation for this endpoint, I decided to pass through all the data we get from Bloomberg, ensuring we don't miss anything important.
Because the response might vary per query (for example, optional secondary columns), I melted some dataframes. This ensures that we can safely aggregate different fields while catering for varied response formats.
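The melting idea can be sketched like this. This is a minimal illustration with pandas, not the PR's actual code; melt_field is a hypothetical helper, and the two input frames are made up to mimic the varied BQL responses.

```python
import pandas as pd

# Two hypothetical per-field frames with different secondary columns,
# mimicking heterogeneous BQL responses.
px = pd.DataFrame({
    "security": ["AAPL US Equity", "IBM US Equity"],
    "field": ["px_last()"] * 2,
    "value": [193.22, 142.97],
    "DATE": ["2023-07-28T00:00:00Z"] * 2,
})
sector = pd.DataFrame({
    "security": ["AAPL US Equity"],
    "field": ["gics_sector_name()"],
    "value": ["Information Technology"],
})

def melt_field(df: pd.DataFrame) -> pd.DataFrame:
    """Melt optional secondary columns into name/value pairs so frames
    for different fields share one schema and can be concatenated."""
    id_vars = ["security", "field", "value"]
    extra = [c for c in df.columns if c not in id_vars]
    if not extra:
        return df.assign(secondary_name=None, secondary_value=None)
    return df.melt(id_vars=id_vars, value_vars=extra,
                   var_name="secondary_name", value_name="secondary_value")

combined = pd.concat([melt_field(px), melt_field(sector)], ignore_index=True)
print(combined)  # one long frame despite heterogeneous inputs
```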
It would be great to hear any feedback on this!
Thanks, @DS-London and @matthewgilbert!