robertmartin8 / PyPortfolioOpt

Financial portfolio optimisation in python, including classical efficient frontier, Black-Litterman, Hierarchical Risk Parity
https://pyportfolioopt.readthedocs.io/
MIT License

How to set up multiprocessing? It's too slow on a single CPU core #291

Closed wac81 closed 3 years ago

wac81 commented 3 years ago

How can I set this up with multiprocessing? It's too slow on a single CPU core.

robertmartin8 commented 3 years ago

Hi @wac81,

Can you explain a bit more? I don't think multiprocessing is possible here (except for very specific aspects like Monte Carlo simulations).

Best, Robert

wac81 commented 3 years ago

But it's too slow for a matrix with 300 columns and 3000 rows, whether I use the BL model or the risk model.

robertmartin8 commented 3 years ago

@wac81

300 columns and 3000 rows should still be fine. Is there any particular function call that is slow?

Could you also share the specs/OS of your machine?
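
One way to answer the question of which call is slow is to time each step separately. The following is a minimal sketch using only the standard library; `timed` and `slow_step` are hypothetical names introduced for illustration, and `slow_step` merely stands in for a real call such as `ef.max_sharpe()`.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock seconds spent inside the with-block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def slow_step():
    # Hypothetical stand-in for an expensive call such as ef.max_sharpe()
    time.sleep(0.05)


timings = {}
with timed("fast_step", timings):
    sum(range(1000))
with timed("slow_step", timings):
    slow_step()

# Name of the step that took the longest
slowest = max(timings, key=timings.get)
print(slowest, timings)
```

Wrapping each line of the real script in its own `with timed(...)` block would show whether the time goes into the return and covariance estimates or into the optimisation itself.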

wac81 commented 3 years ago

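from pypfopt import expected_returns, risk_models
from pypfopt.efficient_frontier import EfficientFrontier
from pypfopt.discrete_allocation import DiscreteAllocation, get_latest_prices

# past_df: DataFrame of historical prices; portfolio_val: total portfolio value
past_rets = expected_returns.mean_historical_return(past_df)

# past_rets = expected_returns.ema_historical_return(past_df)
past_cov = risk_models.exp_cov(past_df)

ef = EfficientFrontier(past_rets, past_cov)
weights = ef.max_sharpe()  # <-- this call takes too long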

cleaned_weights = ef.clean_weights(cutoff=1e-5, rounding=6)
weights = cleaned_weights

print(cleaned_weights)
ef.portfolio_performance(verbose=True)

latest_prices = get_latest_prices(past_df)
print(latest_prices)

da = DiscreteAllocation(weights, latest_prices, total_portfolio_value=portfolio_val)

allocation, leftover = da.lp_portfolio()
print('Discrete allocation:', allocation)
print('Funds Remaining: $', leftover)


OS: Ubuntu 18.04, memory: 32 GB, CPU: Intel 8700

wac81 commented 3 years ago

bl = BlackLittermanModel(S, pi="market", market_caps=mcaps, risk_aversion=delta,
                         absolute_views=viewdict)

# Posterior estimate of returns
ret_bl = bl.bl_returns()

S_bl = bl.bl_cov()

ef = EfficientFrontier(ret_bl, S_bl)
ef.add_objective(objective_functions.L2_reg)
ef.max_sharpe()  # <-- same problem with the BL model, and it is even slower
weights = ef.clean_weights()

print(weights)
ef.portfolio_performance(verbose=True)

da = DiscreteAllocation(weights, past_df.iloc[-1], total_portfolio_value=portfolio_val)

phschiele commented 3 years ago

One workaround could be to find the max_sharpe portfolio by computing n (say, 100) portfolios along the efficient frontier (equidistant in risk or return space) and selecting the one with the maximum Sharpe ratio. This could easily be parallelised.
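
The grid idea can be sketched as follows. This is a minimal illustration and not PyPortfolioOpt's implementation: it assumes a fully invested portfolio with shorting allowed, so each frontier point has a closed form and no solver is needed. The inputs are made up, and `frontier_weights` and `sharpe_at` are hypothetical helper names.

```python
import numpy as np


def frontier_weights(mu, cov, target_return):
    """Minimum-variance weights for one return target (sum to 1, shorting allowed)."""
    ones = np.ones(len(mu))
    inv_ones = np.linalg.solve(cov, ones)  # cov^-1 @ 1
    inv_mu = np.linalg.solve(cov, mu)      # cov^-1 @ mu
    a = ones @ inv_ones
    b = ones @ inv_mu
    c = mu @ inv_mu
    d = a * c - b ** 2
    return ((c - b * target_return) * inv_ones
            + (a * target_return - b) * inv_mu) / d


def sharpe_at(target_return, mu, cov, risk_free):
    """Sharpe ratio and weights of the frontier portfolio for one return target."""
    w = frontier_weights(mu, cov, target_return)
    sigma = np.sqrt(w @ cov @ w)
    return (w @ mu - risk_free) / sigma, w


# Made-up inputs for illustration only
mu = np.array([0.08, 0.12, 0.15])
cov = np.diag([0.04, 0.09, 0.16])
risk_free = 0.02

# Grid of return targets from the minimum-variance return up to the largest asset return
ones = np.ones(len(mu))
min_var_return = (ones @ np.linalg.solve(cov, mu)) / (ones @ np.linalg.solve(cov, ones))
targets = np.linspace(min_var_return, mu.max(), 100)

# Each target is solved independently of the others
results = [sharpe_at(t, mu, cov, risk_free) for t in targets]
best_sharpe, best_weights = max(results, key=lambda pair: pair[0])
print(best_sharpe, best_weights)
```

Because each target is independent, the list comprehension could be swapped for a pool's `map`. The result is only as accurate as the grid spacing, so it approximates the true maximum Sharpe ratio from below.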

wac81 commented 3 years ago

Thanks.

Can you add multiprocessing support as a feature?

phschiele commented 3 years ago

@wac81 Reflecting on my previous comment, one actually does not need to explicitly set up multiprocessing, as the solver already uses it in the background. Consider this comparison:

import copy
import time
import multiprocessing as mp
from functools import partial

from pypfopt.plotting import _ef_default_returns_range
from tests.utilities_for_tests import setup_efficient_frontier

ef = setup_efficient_frontier()
return_range = _ef_default_returns_range(ef, 1000)

def optimize_single_target(target_return, ef):
    ef_i = copy.deepcopy(ef)
    ef_i.efficient_return(target_return)
    _, sigma, _ = ef_i.portfolio_performance()
    return sigma

# no explicit multiprocessing
start1 = time.time()
[optimize_single_target(t, ef) for t in return_range]
end1 = time.time()

# explicit multiprocessing
start2 = time.time()
with mp.Pool() as pool:
    pool.map(partial(optimize_single_target, ef=ef), return_range)
end2 = time.time()

print(end1-start1, end2-start2)

>> 30.6782 30.7645

As you can see, the timings are almost identical.

One way to further improve performance is to use a parametrized cvxpy problem. Perhaps the API could even switch to a parametrized version automatically when efficient_return is called repeatedly. cc @robertmartin8

import cvxpy
from pypfopt import objective_functions

# setting up parametrized problem
ef._objective = objective_functions.portfolio_variance(
    ef._w, ef.cov_matrix
)
ret = objective_functions.portfolio_return(
    ef._w, ef.expected_returns, negative=False
)
for obj in ef._additional_objectives:
    ef._objective += obj
# the parameter holds the return target and is updated before each solve
target_return = cvxpy.Parameter()
ef._constraints.append(ret >= target_return)
ef._make_weight_sum_constraint(False)

# solving the parametrized problem for different return targets
start3 = time.time()
for return_value in return_range:
    target_return.value = return_value
    ef._solve_cvxpy_opt_problem()
end3 = time.time()

print(end3-start3)

>> 6.3820

wac81 commented 3 years ago

Thank you, I will try your suggestion.