Closed: spyamine closed this issue 4 years ago
Thanks for using the package.

Not sure if this is within the scope of this package, but can you share your code please? The Bloomberg response data is hard to manage at the API level. One way to overcome this kind of issue is 1) to avoid requesting too much data at a time, and 2) to save the data locally as you go and combine it before doing analysis.
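The first suggestion, keeping each request small, can be sketched as a plain sequential loop. `fetch_chunk` below is a hypothetical stand-in for the real Bloomberg call (e.g. a thin wrapper around `blp.bdh`), and the chunk size of 50 is an arbitrary example, not a documented Bloomberg limit:

```python
import pandas as pd

def fetch_in_chunks(tickers, fetch_chunk, size=50):
    """Request `size` tickers at a time and combine the results at the end.

    `fetch_chunk` is a placeholder: any callable that takes a list of
    tickers and returns a DataFrame.
    """
    frames = []
    for i in range(0, len(tickers), size):
        frames.append(fetch_chunk(tickers[i:i + size]))  # one small request
    return pd.concat(frames)
```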
Hi, thank you for your response.

I've managed to solve the problem in the meantime by using joblib instead of a plain loop. Here is my initial version:
```python
def get_bloomberg_EOD(symbols=None, adjusted=False, assets=None,
                      start=datetime.datetime.strptime("1970-01-01", "%Y-%m-%d").date(),
                      end=datetime.date.today() + datetime.timedelta(-1)):
    startString = start.strftime("%Y-%m-%d")
    endString = end.strftime("%Y-%m-%d")
    print("getting the EOD data from {} to {}".format(startString, endString))
    classes_symbols = [(s.split(" ")[-1].capitalize(), s) for s in symbols]
    dict_1 = {}
    for asset_class, symbol in classes_symbols:
        dict_1.setdefault(asset_class, []).append(symbol)
    classes = list(dict_1.keys())
    EOD_list = []
    for clasz in classes:
        print("==== " * 4)
        print("working on '{}' assets, {} symbols".format(clasz, len(dict_1[clasz])))
        # getting and storing the data to the EOD library
        fields = EOD_FIELDS[clasz]
        symbols_class = dict_1[clasz]
        print(symbols_class)
        for tickers in chunks(symbols_class, BLOOMBERG_TICKERS_LIMITATION - 1):
            if not adjusted:
                data = blp.bdh(tickers, flds=fields, start_date=startString,
                               end_date=endString, log='debug')
            else:
                data = blp.bdh(tickers, flds=fields, start_date=startString,
                               end_date=endString, adjust='-')
            EOD_list.append(data)
    EOD = pd.concat(EOD_list)
    print(EOD.empty)
    EODs = dataFrameSpliter(EOD)
    print("len(EODs): {}".format(len(EODs)))
    return EODs
```
The new code is the following:
```python
def get_bloomberg_EOD_paralleled(symbols=None, adjusted=False, assets=None,
                                 start=datetime.datetime.strptime("1970-01-01", "%Y-%m-%d").date(),
                                 end=datetime.date.today() + datetime.timedelta(-1)):
    startString = start.strftime("%Y-%m-%d")
    endString = end.strftime("%Y-%m-%d")
    print("getting the EOD data from {} to {}".format(startString, endString))
    classes_symbols = [(s.split(" ")[-1].capitalize(), s) for s in symbols]
    dict_1 = {}
    for asset_class, symbol in classes_symbols:
        dict_1.setdefault(asset_class, []).append(symbol)
    classes = list(dict_1.keys())
    EOD_list = []
    for clasz in classes:
        print("==== " * 4)
        print("working on '{}' assets, {} symbols".format(clasz, len(dict_1[clasz])))
        # getting and storing the data to the EOD library
        fields = EOD_FIELDS[clasz]
        symbols_class = dict_1[clasz]
        print(symbols_class)
        results = Parallel(n_jobs=2)(
            delayed(get_historical)(tickers=tickers, startString=startString,
                                    endString=endString, fields=fields, adjusted=False)
            for tickers in chunks(symbols_class, BLOOMBERG_TICKERS_LIMITATION - 1)
        )
        EOD_list = EOD_list + results
    print(EOD_list)
    print(len(EOD_list))


def get_historical(tickers, startString, endString, fields, adjusted):
    if not adjusted:
        return blp.bdh(tickers, flds=fields, start_date=startString,
                       end_date=endString, log='debug')
    return blp.bdh(tickers, flds=fields, start_date=startString,
                   end_date=endString, adjust='-')
```
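Both snippets rely on a `chunks` helper and a `BLOOMBERG_TICKERS_LIMITATION` constant that are not shown. Assuming `chunks` simply splits a list into fixed-size pieces, a minimal version would be:

```python
def chunks(lst, n):
    """Split `lst` into consecutive pieces of at most `n` items each."""
    return [lst[i:i + n] for i in range(0, len(lst), n)]
```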
Right, but sending requests to bbg simultaneously is less preferred, because we're getting response data through one portal. I'm not sure the DataFrame you get back will be stable.
So what do you suggest as a solution? I have around 1500 tickers to query for EOD data, among other things.
Actually, joblib did not solve the problem.
Actually, we don't need it!! In xbbg\core\process.py, the function rec_events has a timeout condition set at 10 * 500 ms, and depending on the size of the data requested, it exits before fetching all of it.

We have two possible solutions: either remove the timeout, or set it to a longer time span.
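The effect of such a timeout can be illustrated without Bloomberg at all. In the toy simulation below, `None` models a poll interval in which no event arrived; the first loop gives up after a run of empty polls (like a fixed 10 * 500 ms cap), while the second keeps waiting for the terminal RESPONSE event. All names here are illustrative, not the actual xbbg internals:

```python
def recv_with_timeout(events, max_empty_polls=3):
    """Exit after a run of empty polls; may drop trailing data."""
    out, empty = [], 0
    for ev in events:
        if ev is None:
            empty += 1
            if empty >= max_empty_polls:
                break            # gives up early, like a fixed timeout cap
        else:
            empty = 0
            out.append(ev)
            if ev == "RESPONSE":
                break            # terminal event: request complete
    return out

def recv_until_response(events):
    """Loop until the terminal RESPONSE event, tolerating slow polls."""
    out = []
    for ev in events:
        if ev is None:
            continue             # empty poll: just keep waiting
        out.append(ev)
        if ev == "RESPONSE":
            break
    return out
```

With a slow event stream, the timeout variant drops everything after the first long pause, while the patient variant recovers the full response.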
I don't know if I was clear. I'm not a programmer, I'm a fund manager, so I'm doing my best to share these ideas in a proper way.
https://github.com/pyxll/blpapi-python/blob/master/examples/SimpleHistoryExample.py

This is the link to the vanilla history request example; there is no timeout in it.
I was dreaming about the problem!! I did not want to go back to my old program; yours is far more intuitive!!

Cheers for the good work!!

I just realized that I had the timeout problem because I was sitting in a room far from the router!! lol
If you send multiple requests simultaneously, it may cause problems when bbg returns data, even after you increase the timeout. Please avoid using threads or parallel looping of any sort when possible.

For 1500 names, you can query 100 names at a time and save them locally, then operate on the local files thereafter. bbg did lots of optimizations on their end; if you make 15 queries one by one, it would probably be faster than using joblib.
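That workflow, 15 sequential batches of 100 each saved to disk before the next request, might look like the sketch below. `fetch_batch` is a hypothetical placeholder for the real `blp.bdh` call, and the pickle-file cache is one arbitrary choice of local format; the point is that a rerun skips batches already on disk:

```python
import os
import pandas as pd

def fetch_eod_cached(tickers, fetch_batch, cache_dir, batch_size=100):
    """Fetch in sequential batches, persisting each one locally.

    A second run reuses the saved files instead of hitting Bloomberg again.
    `fetch_batch` is a placeholder for the real request function.
    """
    os.makedirs(cache_dir, exist_ok=True)
    frames = []
    for n in range(0, len(tickers), batch_size):
        path = os.path.join(cache_dir, "eod_{}.pkl".format(n))
        if os.path.exists(path):
            df = pd.read_pickle(path)              # already fetched earlier
        else:
            df = fetch_batch(tickers[n:n + batch_size])  # one request at a time
            df.to_pickle(path)                     # save before the next request
        frames.append(df)
    return pd.concat(frames)
```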
Thank you for your advice!! This is what I did!!
Hi,
Thank you for your package, excellent work!!

I use your blp.bdh to fetch data from Bloomberg. I call it in the middle of a loop because I want to get data for more tickers than the Bloomberg per-request limit allows, so I split the tickers into chunks and run bdh on each chunk in a loop.

The loop ends before the program finishes getting all the data.

Can you please help?