Hi @wac81,
Can you explain a bit more? I don't think multiprocessing is possible here (except for very specific aspects like Monte Carlo simulations).
Best, Robert
But it's too slow for a matrix of 300 columns and 3000 rows, whether with the BL model or a plain risk model.
@wac81
300 columns and 3000 rows should still be fine. Is there any particular function call that is slow?
Could you also share the specs/OS of your machine?
# assuming past_df is a prices DataFrame and portfolio_val is the target portfolio value
from pypfopt import expected_returns, risk_models
from pypfopt.efficient_frontier import EfficientFrontier
from pypfopt.discrete_allocation import DiscreteAllocation, get_latest_prices

past_rets = expected_returns.mean_historical_return(past_df)
# past_rets = expected_returns.ema_historical_return(past_df)
past_cov = risk_models.exp_cov(past_df)

ef = EfficientFrontier(past_rets, past_cov)
weights = ef.max_sharpe()  # <-- this call takes too long
cleaned_weights = ef.clean_weights(cutoff=1e-5, rounding=6)
weights = cleaned_weights
print(cleaned_weights)
ef.portfolio_performance(verbose=True)

latest_prices = get_latest_prices(past_df)
print(latest_prices)
da = DiscreteAllocation(weights, latest_prices, total_portfolio_value=portfolio_val)
allocation, leftover = da.lp_portfolio()
print('Discrete allocation:', allocation)
print('Funds Remaining: $', leftover)
OS: Ubuntu 18.04 · Memory: 32 GB · CPU: Intel 8700
# assuming S (covariance), mcaps (market caps), delta (risk aversion) and
# viewdict (absolute views) have been prepared beforehand
from pypfopt import objective_functions
from pypfopt.black_litterman import BlackLittermanModel

bl = BlackLittermanModel(S, pi="market", market_caps=mcaps, risk_aversion=delta,
                         absolute_views=viewdict)

# Posterior estimate of returns
ret_bl = bl.bl_returns()
S_bl = bl.bl_cov()

ef = EfficientFrontier(ret_bl, S_bl)
ef.add_objective(objective_functions.L2_reg)
ef.max_sharpe()  # <-- same problem with the BL model; BL is even slower
weights = ef.clean_weights()
print(weights)
ef.portfolio_performance(verbose=True)

da = DiscreteAllocation(weights, past_df.iloc[-1], total_portfolio_value=portfolio_val)
One workaround could be to find the max_sharpe portfolio by computing n (say, 100) portfolios along the efficient frontier (equidistant in risk or return space) and selecting the portfolio with the maximum Sharpe ratio; see the sketch below. This could easily be done in a parallelised manner.
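A minimal sketch of that grid-scan workaround, assuming an already-constructed EfficientFrontier object ef; the helper name max_sharpe_by_grid and the parameters n_points and risk_free_rate are illustrative, not part of the library's API:

import copy
import numpy as np
from pypfopt.plotting import _ef_default_returns_range

def max_sharpe_by_grid(ef, n_points=100, risk_free_rate=0.02):
    """Scan equidistant return targets along the frontier; keep the best Sharpe."""
    best_sharpe, best_weights = -np.inf, None
    for target in _ef_default_returns_range(ef, n_points):
        ef_i = copy.deepcopy(ef)  # each solve mutates the problem, so work on a copy
        ef_i.efficient_return(target)
        # portfolio_performance returns (expected return, volatility, Sharpe)
        _, _, sharpe = ef_i.portfolio_performance(risk_free_rate=risk_free_rate)
        if sharpe > best_sharpe:
            best_sharpe, best_weights = sharpe, ef_i.clean_weights()
    return best_weights, best_sharpe

Each iteration is independent, which is what would make this easy to parallelise in principle.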
Thanks. Can you add this feature (multiprocessing)?
@wac81 Reflecting on my previous comment, one actually does not need to explicitly set up multiprocessing, as the solver already uses it in the background. Consider this comparison:
import copy
import time
import multiprocessing as mp
from functools import partial

from pypfopt.plotting import _ef_default_returns_range
from tests.utilities_for_tests import setup_efficient_frontier

ef = setup_efficient_frontier()
return_range = _ef_default_returns_range(ef, 1000)

def optimize_single_target(target_return, ef):
    ef_i = copy.deepcopy(ef)  # solving mutates the problem, so use a fresh copy
    ef_i.efficient_return(target_return)
    _, sigma, _ = ef_i.portfolio_performance()
    return sigma

# no explicit multiprocessing
start1 = time.time()
[optimize_single_target(t, ef) for t in return_range]
end1 = time.time()

# explicit multiprocessing (on platforms that spawn worker processes,
# this needs to run under an `if __name__ == "__main__":` guard)
start2 = time.time()
with mp.Pool() as pool:
    pool.map(partial(optimize_single_target, ef=ef), return_range)
end2 = time.time()

print(end1 - start1, end2 - start2)
>> 30.6782 30.7645
As you can see, the timings are almost identical.
One way to further increase performance would be to use a parametrized cvxpy problem: cvxpy can then reuse the compiled problem instead of re-canonicalizing it for every solve. Perhaps one could even change the API to automatically use a parametrized version when efficient_return is called repeatedly.
cc @robertmartin8
# setting up the parametrized problem (continuing with `ef` from above)
import cvxpy
from pypfopt import objective_functions

ef._objective = objective_functions.portfolio_variance(
    ef._w, ef.cov_matrix
)
ret = objective_functions.portfolio_return(
    ef._w, ef.expected_returns, negative=False
)
for obj in ef._additional_objectives:
    ef._objective += obj

# the return target is a cvxpy Parameter, so the compiled problem can be reused
target_return = cvxpy.Parameter()
ef._constraints.append(ret >= target_return)
ef._make_weight_sum_constraint(False)

# solving the parametrized problem for different return targets
start3 = time.time()
for return_value in return_range:
    target_return.value = return_value
    ef._solve_cvxpy_opt_problem()
end3 = time.time()

print(end3 - start3)
>> 6.3820
Thank you, I will try your suggestion.
How do I set this up with multiprocessing? It's too slow on a single CPU core.
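For reference, a minimal standalone version of the explicit-multiprocessing setup from the comparison above; the `if __name__ == "__main__":` guard is required on platforms that spawn worker processes (e.g. Windows/macOS). Note that, per the timings above, this is unlikely to beat the plain loop:

import copy
import multiprocessing as mp
from functools import partial

from pypfopt.plotting import _ef_default_returns_range
from tests.utilities_for_tests import setup_efficient_frontier

def optimize_single_target(target_return, ef):
    ef_i = copy.deepcopy(ef)
    ef_i.efficient_return(target_return)
    _, sigma, _ = ef_i.portfolio_performance()
    return sigma

if __name__ == "__main__":
    ef = setup_efficient_frontier()
    return_range = _ef_default_returns_range(ef, 1000)
    with mp.Pool() as pool:  # defaults to one worker per CPU core
        sigmas = pool.map(partial(optimize_single_target, ef=ef), return_range)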