jeffgortmaker / pyblp

BLP Demand Estimation with Python
https://pyblp.readthedocs.io
MIT License

Question about failure to converge to a fixed point and speed in two-nest RCNL #127

Closed QWangPKU closed 1 year ago

QWangPKU commented 1 year ago

Hi Jeff,

First, I want to thank you for the great package!

I have a question about my two-nest RCNL estimation. I encountered the following two warnings:

> At least one error was encountered. As long as the optimization routine does not get stuck at values of theta that give rise to errors, this is not necessarily a problem. If the errors persist or seem to be impacting the optimization results, consider setting an error punishment or following any of the other suggestions below: The fixed point computation of delta failed to converge. This problem can sometimes be mitigated by increasing the maximum number of fixed point iterations, increasing the fixed point tolerance, choosing more reasonable initial parameter values, setting more conservative parameter or share bounds, or using different iteration or optimization configurations.

> Encountered a numerical error when computing delta. This problem is often due to prior problems, overflow, or nonpositive shares, and can sometimes be mitigated by choosing smaller initial parameter values, setting more conservative bounds on parameters or shares, rescaling data, removing outliers, changing the floating point precision, or using different optimization, iteration, or integration configurations. Errors encountered: invalid value, overflow.

I have read through previous issues and tried the solutions you suggested there: "trust-constr" and "bfgs" for optimization, "squarem" and "lm" for iteration, different starting values, different tolerances, and different maximum numbers of iterations/evaluations. But the program still displays the warnings above.
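For reference, the configurations I tried looked roughly like this (the exact values here are illustrative, not my exact runs):

```python
import pyblp

# Illustrative only: I rotated through optimizers ('trust-constr', 'bfgs'),
# fixed point routines ('squarem', 'lm'), tolerances, and evaluation caps.
iteration = pyblp.Iteration('squarem', {'atol': 1e-13, 'max_evaluations': 5000})
optimization = pyblp.Optimization('bfgs', {'gtol': 1e-5})
```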

The number of observations is a little over 400,000, so even with the parallel option the estimation can take three weeks or more. The estimated results look weird, probably because of the convergence failure: sometimes the estimated price coefficient is as large as 1e12. Below is my code, which works fine for a one-nest RCNL and a simple nested logit.

I wonder if I messed something up and whether there is any way to handle this situation. Thank you in advance for any suggestions!

```python
import numpy as np
import pyblp

df1 = product_data_two_nest.copy()

df1['market_ids'] = [str(x) + '-' + str(y) for x, y in zip(df1['region'], df1['tt'])]
df1['shares'] = df1['share']
df1['prices'] = df1['price']
df1['clustering_ids'] = df1['region']
for i in range(15):
    df1[f'demand_instruments{i}'] = df1[f'iv{i + 1}']
df1['nesting_ids'] = df1['luxury']

# 500 equally weighted agents per market; placeholder nodes since sigma starts at zero.
agent_data = demean_demo.copy()
agent_data['weights'] = 1 / 500
for i in range(3):
    agent_data[f'nodes{i}'] = 0

def solve_rcnl(df, agent_data):
    X1_formulation = pyblp.Formulation('0 + prices', absorb='C(prodID) + C(tt)')
    X2_formulation = pyblp.Formulation('1 + prices + calories')
    product_formulations = (X1_formulation, X2_formulation)
    agent_formulation = pyblp.Formulation('0 + demeaned_realincome')
    mw_problem = pyblp.Problem(product_formulations, df, agent_formulation, agent_data, add_exogenous=True)
    initial_sigma = np.zeros((3, 3))
    initial_pi = np.array([[0.016], [0.001], [0.005]])
    accelerated_fix_point_iteration = pyblp.Iteration('lm', {'max_evaluations': 3000, 'atol': 1e-14})
    tighter_bfgs = pyblp.Optimization('trust-constr', {'gtol': 1e-6, 'maxiter': 5000})
    return mw_problem.solve(
        sigma=initial_sigma, pi=initial_pi, method='2s',
        optimization=tighter_bfgs, iteration=accelerated_fix_point_iteration,
        scale_objective=False, finite_differences=True, delta_behavior='last',
        W_type='clustered', se_type='clustered', rho=[0.8, 0.7], center_moments=False,
    )

df1 = df1.sort_values(by=['market_ids', 'prodID'])
pyblp.options.verbose = True

with pyblp.parallel(12):
    rcnl_results2 = solve_rcnl(df1, agent_data)
```

jeffgortmaker commented 1 year ago

I recommend starting with a smaller dataset until you get all the numerical errors sorted out, e.g. just a couple of markets or a smaller number of products per market. A three-week turnaround isn't a great amount of time for debugging something.
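For example, a quick pandas sketch (hypothetical; just subsetting your product and agent data by market):

```python
# Keep a random handful of markets for debugging; drop the rest.
some_markets = df1['market_ids'].drop_duplicates().sample(5, random_state=0)
small_df1 = df1[df1['market_ids'].isin(some_markets)]
small_agent = agent_data[agent_data['market_ids'].isin(some_markets)]
```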

Sometimes, 1e-14 is too tight a tolerance. For example, delta might be on a very extreme scale for some products, so you might need something like 1e-12 instead. To get a sense of this, you can enable universal_display=True in your Iteration configuration to see what's happening iteration by iteration.
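Concretely, something like this (values illustrative):

```python
# Loosen atol a couple of orders of magnitude and print a row for
# every contraction iteration.
iteration = pyblp.Iteration(
    'lm',
    {'atol': 1e-12, 'max_evaluations': 3000},
    universal_display=True,
)
```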

Another tip is to avoid parallel processing when debugging. It can garble the displayed information, since different processes print at the same time. In general, we expect parallel processing to help a lot when market-specific work (e.g. solving the contraction) takes a very long time compared to everything else; otherwise there are severe overhead costs from passing data between processes.

QWangPKU commented 1 year ago

Thank you for the suggestion! I will test the code on a smaller scale and let you know if I find something.

QWangPKU commented 1 year ago

I tried the estimation on 10 markets, as suggested. I still cannot get estimates after several hours; it gets stuck in the iteration step. My code is at the end and the 10-market data are attached. Would you mind taking a quick look? Thank you!

According to the per-iteration display, I found it confusing that even after the iteration converged below the tolerance for one market, iteration for the same market is run once again, as shown in the display below for market id "35.0-240.0". How should I interpret "Clipped Shares"? And another silly question: why is iteration conducted market by market rather than across all markets at once?

```
Market      Contraction  Contraction  Clipped  Delta       Max Norm
            Iterations   Evaluations  Shares   Max Norm    Improvement
----------  -----------  -----------  -------  ----------  -----------
35.0-240.0         3050         3051        0  +1.816E-06
35.0-240.0         3051         3052        0  +1.232E-06
35.0-240.0         3052         3053        0  +5.684E-13  +5.122E-09

Market      Contraction  Contraction  Clipped  Delta       Max Norm
            Iterations   Evaluations  Shares   Max Norm    Improvement
----------  -----------  -----------  -------  ----------  -----------
35.0-240.0            0            1       15  +1.385E+02
35.0-240.0            1            2       15  +1.385E+02
35.0-240.0            2            3       15  +1.385E+02
35.0-240.0            3            4       15  +1.385E+02
```

My code:

```python
import numpy as np
import pandas as pd
import pyblp

small_df1 = pd.read_csv(workingdata + "10_mkt_two_nest_weekly_productdata.csv")
small_agent = pd.read_csv(workingdata + "10_mkt_agentdata.csv")

def solve_rcnl(df, agent):
    X1_formulation = pyblp.Formulation('0 + prices', absorb='C(prodfe) + C(timetrend)')
    X2_formulation = pyblp.Formulation('1 + prices + calories_144')
    product_formulations = (X1_formulation, X2_formulation)
    agent_formulation = pyblp.Formulation('0 + demeaned_realinc')  # use demeaned real income
    mw_problem = pyblp.Problem(product_formulations, df, agent_formulation, agent, add_exogenous=True)
    initial_sigma = np.zeros((3, 3))
    initial_pi = np.array([[0.016], [0.001], [0.005]])
    # initial_delta_from_twonest_NL = nl_results2.compute_delta()
    accelerated_fix_point_iteration = pyblp.Iteration('lm', {'max_evaluations': 3000, 'atol': 1e-12}, universal_display=True)
    tighter_bfgs = pyblp.Optimization('trust-constr', {'gtol': 1e-6, 'maxiter': 5000})
    return mw_problem.solve(sigma=initial_sigma,
                            pi=initial_pi,  # delta=initial_delta_from_twonest_NL,
                            method='2s',
                            optimization=tighter_bfgs,
                            iteration=accelerated_fix_point_iteration,
                            scale_objective=False,
                            finite_differences=True,
                            delta_behavior='last',
                            W_type='clustered',
                            se_type='clustered',
                            rho=[0.8, 0.7],
                            center_moments=False)

small_df1 = small_df1.sort_values(by=['market_ids', 'prodfe'])
pyblp.options.verbose = True

rcnl_results2 = solve_rcnl(small_df1, small_agent)
```

Attachments: 10_mkt_agentdata.csv, 10_mkt_two_nest_weekly_productdata.csv

jeffgortmaker commented 1 year ago

I don't have the time to go over this fully, but I can answer some of your questions and hopefully point you in the right direction!

Iteration for a market happens every time the objective function is evaluated. For example, since you're using finite differences (why?), every time the optimizer needs the objective and gradient, it computes the objective 1 + 2P times, where P is the number of nonlinear parameters; with P = 5, say, that's 11 full objective evaluations per optimizer step. Iteration happens in every market for every one of these objective evaluations.
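If you don't specifically need finite differences, here's a minimal sketch (reusing the names from your snippet) of letting pyblp use its default analytic gradients instead:

```python
# Sketch, not a drop-in fix: pyblp computes analytic gradients by default,
# so omitting finite_differences=True avoids the extra 2P objective
# evaluations every time the optimizer asks for a gradient.
results = mw_problem.solve(
    sigma=initial_sigma,
    pi=initial_pi,
    rho=[0.8, 0.7],
    method='2s',
    optimization=tighter_bfgs,
    iteration=accelerated_fix_point_iteration,
    W_type='clustered',
    se_type='clustered',
)
```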

See the discussion under shares_bounds here for how you should interpret clipped shares. Usually, these indicate an issue with your data leading to really extreme delta values (e.g. a price or two that's way larger than it should be).
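For reference, a sketch of where that knob lives (again reusing your names, and assuming the shares_bounds argument to Problem.solve from that discussion; values illustrative):

```python
# shares_bounds=(lb, ub) clips simulated shares inside the contraction;
# I believe (1e-300, None) is the default. A nonzero clipped-shares count
# means some simulated shares are hitting these bounds.
results = mw_problem.solve(
    sigma=initial_sigma,
    pi=initial_pi,
    rho=[0.8, 0.7],
    shares_bounds=(1e-300, None),
)
```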

Iteration is conducted market-by-market for the following reason. Say you have two markets, t = 1, 2. In t = 1, it takes 3 iterations to converge; in t = 2, it takes 3,000. If iteration is done market-by-market, that's a total of 3,003 iterations, each costing something that scales with J, the number of products per market, so roughly 3,003 × J units of work. If iteration is done with markets stacked, that's a total of 3,000 iterations, but each costs something that scales with 2J, so roughly 6,000 × J units of work, about twice as expensive in this example.
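The same toy arithmetic in code (J is a hypothetical products-per-market count):

```python
# Toy cost comparison from the two-market example above.
J = 50                               # hypothetical number of products per market
market_by_market = (3 + 3000) * J    # 3 iterations in t=1 plus 3000 in t=2
stacked = 3000 * (2 * J)             # 3000 iterations, each over both markets
print(market_by_market, stacked)     # 150150 vs 300000: stacking ~doubles the work
```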

My main recommendation is to keep dropping bells and whistles (markets, fixed effects, nesting, accelerated fixed point, etc.) until you figure out what's creating the non-convergence issue. It's hard to tell what's going on when there's a lot of stuff happening and a lot of things being printed to the screen.

QWangPKU commented 1 year ago

Will do. Thank you for the detailed suggestion!

jeffgortmaker commented 1 year ago

Hope you got this sorted! I'm going to close this for now, but please feel free to re-open/keep commenting.