lballabio / QuantLib

The QuantLib C++ library
http://quantlib.org

Impossible greek values returned by FD American #779

Closed feribg closed 4 years ago

feribg commented 4 years ago

Here are some screenshots of parameters and the corresponding impossible gamma and vega values (very negative), which make me question the rest of the greeks as well. I'm attaching the full code used to price with those parameters.

import pandas as pd
import QuantLib as ql


class OptionPricer(object):
    def __init__(self):
        self.calc_date = pd.to_datetime('2019-09-10')
        self.calc_date_ql = ql.Date(self.calc_date.day, self.calc_date.month, self.calc_date.year)
        self.settlement = self.calc_date_ql
        self.day_count = ql.Actual365Fixed()
        self.calendar = ql.UnitedStates()
        ql.Settings.instance().evaluationDate = self.calc_date_ql
        period = ql.Period(int(10), ql.Days)
        self.maturity_date = self.calendar.advance(ql.Date(10, 9, 2019), period)
        self.eu_exercise = ql.EuropeanExercise(self.maturity_date)
        self.am_exercise = ql.AmericanExercise(self.settlement, self.maturity_date)
        self.payoff = ql.PlainVanillaPayoff(ql.Option.Put, 110)
        self.am_option = ql.VanillaOption(self.payoff, self.am_exercise)
        self.eu_option = ql.VanillaOption(self.payoff,self.eu_exercise)
        self.spot_quote = ql.SimpleQuote(100)
        self.spot_handle = ql.QuoteHandle(self.spot_quote)
        self.vol_quote = ql.SimpleQuote(0.1)
        self.vol_handle = ql.QuoteHandle(self.vol_quote)
        self.rf_quote = ql.SimpleQuote(0.05)
        self.rf_handle = ql.QuoteHandle(self.rf_quote)
        self.div_quote = ql.SimpleQuote(0.05)
        self.div_handle = ql.QuoteHandle(self.div_quote)
        self.rf_flat_curve = ql.YieldTermStructureHandle(
            ql.FlatForward(self.calc_date_ql, self.rf_handle, self.day_count)
        )
        self.div_flat_curve = ql.YieldTermStructureHandle(
            ql.FlatForward(self.calc_date_ql, self.div_handle, self.day_count)
        )
        self.vol_flat_curve = ql.BlackVolTermStructureHandle(
            ql.BlackConstantVol(self.calc_date_ql, self.calendar, self.vol_handle, self.day_count)
        )
        self.bsm_process = ql.BlackScholesMertonProcess(self.spot_handle, self.div_flat_curve, self.rf_flat_curve, self.vol_flat_curve)
        self.engine = ql.FdBlackScholesVanillaEngine(self.bsm_process)
        self.eu_engine = ql.AnalyticEuropeanEngine(self.bsm_process)

    def numeric_first_order(self, option, quote, h=0.0001):
        # symmetric (central) difference of the NPV w.r.t. the given quote
        sigma0 = quote.value()
        quote.setValue(sigma0 - h)
        P_minus = option.NPV()
        quote.setValue(sigma0 + h)
        P_plus = option.NPV()
        quote.setValue(sigma0)
        return (P_plus - P_minus) / (2*h)

    def price_and_greeks(self,div_rate, rf_rate, vol, spot_price, strike_price, put_call, dte, market_price=None):
        period = ql.Period(int(dte), ql.Days)
        self.maturity_date = self.calendar.advance(self.calc_date_ql, period)
        self.am_exercise = ql.AmericanExercise(self.settlement, self.maturity_date)

        if put_call:
            self.payoff = ql.PlainVanillaPayoff(ql.Option.Put, strike_price)
        else:
            self.payoff = ql.PlainVanillaPayoff(ql.Option.Call, strike_price)

        T = self.day_count.yearFraction(self.calc_date_ql, self.maturity_date)
        N = max(200, int(1000 * T))  # spatial grid points
        M = max(100, int(200 * T))   # time steps

        #self.engine = ql.FdBlackScholesVanillaEngine(self.bsm_process, M, N, int(0.2*M))
        self.engine = ql.FdBlackScholesVanillaEngine(self.bsm_process, M, N, 0, ql.FdmSchemeDesc.ImplicitEuler())
        #self.engine = ql.FdBlackScholesVanillaEngine(self.bsm_process, M, N, 0, ql.FdmSchemeDesc.TrBDF2())

        self.am_option = ql.VanillaOption(self.payoff, self.am_exercise)
        self.am_option.setPricingEngine(self.engine)    
        self.eu_exercise = ql.EuropeanExercise(self.maturity_date)
        self.eu_option = ql.VanillaOption(self.payoff,self.eu_exercise)
        self.eu_option.setPricingEngine(self.eu_engine)
        self.spot_quote.setValue(spot_price)
        self.vol_quote.setValue(vol)
        self.rf_quote.setValue(rf_rate)
        self.div_quote.setValue(div_rate)
        price = self.am_option.NPV()
        delta = self.am_option.delta()
        gamma = self.am_option.gamma()
        theta = self.am_option.theta()
        #if market_price is not None:
        #    print(self.am_option.impliedVolatility(market_price, self.bsm_process))
        #rho = self.numeric_first_order(self.am_option, self.rf_quote)
        vega = self.numeric_first_order(self.am_option, self.vol_quote)/100
        return (price, delta, gamma, theta, vega,
                self.eu_option.NPV(), self.eu_option.delta(), self.eu_option.gamma(),
                self.eu_option.theta(), self.eu_option.vega() / 100)  # , rho
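
For reference, here's how the class is driven (a minimal sketch; the parameter values are illustrative, not one of the failing cases):

    pricer = OptionPricer()
    # returns (am_price, delta, gamma, theta, vega, eu_price, eu_delta, eu_gamma, eu_theta, eu_vega)
    print(pricer.price_and_greeks(div_rate=0.05, rf_rate=0.05, vol=0.1,
                                  spot_price=100, strike_price=110,
                                  put_call=True, dte=10))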
[Screenshots: input parameter sets and the resulting impossible (negative) gamma and vega values]
klausspanderen commented 4 years ago

Hi

it seems that some sort of explicit finite difference scheme is assumed here to calculate the lattice sizes, which defeats the purpose (see the definitions of N and M above) and leads to rather ill-conditioned grids. You are better off letting QL decide on the lattice details and setting

self.engine = ql.FdBlackScholesVanillaEngine(self.bsm_process, 100, 500, 1)

Increase the t and x dimensions further if you need high-precision results. I'd use a symmetric scheme for the vega calculation:

def numeric_first_order(self, option, quote, h=0.0001):
    sigma0 = quote.value()

    quote.setValue(sigma0 - h)
    P_minus = option.NPV()
    quote.setValue(sigma0 + h)
    P_plus = option.NPV()
    quote.setValue(sigma0)
    return (P_plus - P_minus) / (2*h)
feribg commented 4 years ago

Thanks, that helped a bit, but I still get negative values for about half the samples. I'm using a 1200x600 grid with the LHS sample in the code above.

klausspanderen commented 4 years ago

Can you please post a parameter example which gives negative values? 1200 is a large number of time steps, which is normally not necessary, as the schemes are of order (dt)^2. Thanks.

feribg commented 4 years ago

Here are some examples attached from the parameter space. I've also updated the pricing code in the issue above to include your suggestions for the symmetric numeric derivatives, with a fixed grid size of 200x1000, which is roughly 2x what you suggested. Is the smaller size a result of the interpolation you're doing? I'm struggling to understand how a smaller step would help it converge better.

Here's a concrete example as well:

div_rate      0.012945
rf_rate       0.132976
vol           0.214675
strike        7.672765
put_call      1.000000 #  1= put, 0 = call
dte         601.000000
spot          7.050932
price         0.732451
delta        -0.607794
gamma        -0.774791
theta        -0.036603
vega          2.624307

Archive.zip

Also, I'm wondering: is there a way to get the matrices from the solver rather than the final answer? I'd like to try different approximations of the solution matrix, so it would be nice to get both sides of the system rather than just the point answers.

As an aside, are day fractions supported? I couldn't find a way to pass a fraction of a year not rounded to a whole day. My sampling space includes negative dividend rates, but that shouldn't really be an issue: both negative and positive rates result in wrong gamma/vega.

klausspanderen commented 4 years ago

Hi

first, thanks for the comprehensive example, it makes the discussion much easier. You are pricing American options; I have to admit that I overlooked this in my first response.

QuantLib uses the Crank-Nicolson scheme by default. This scheme is prone to spurious oscillations, especially for American ATM options (the problem in your example). In addition, even though the CN scheme is of order (dt)^2, the embedded dynamic programming step for the American exercise condition is only of order (dt); hence more time steps are needed for American options than for European ones (but I still think that, as a rule of thumb, 200 steps per year should be enough for non-pathological cases).

Back to the spurious oscillations, I see three solutions here (a comparison sketch follows the list):

  1. If you want to stick with the CN scheme, use up to 20% damping steps (so-called Rannacher smoothing steps) for American options:
     self.engine = ql.FdBlackScholesVanillaEngine(self.bsm_process, M, N, int(M * 0.2))
  2. Use the implicit Euler scheme, which is only of order (dt), hence you might need more time steps, but it does not exhibit spurious oscillations:
     self.engine = ql.FdBlackScholesVanillaEngine(self.bsm_process, M, N, 0, ql.FdmSchemeDesc.ImplicitEuler())
  3. Use the TR-BDF2 scheme (needs QuantLib 1.17). I've seen statements that this scheme is immune to spurious oscillations without Rannacher smoothing steps while still being of order (dt)^2:
     self.engine = ql.FdBlackScholesVanillaEngine(self.bsm_process, M, N, 0, ql.FdmSchemeDesc.TrBDF2())
     I'd be very interested in seeing your benchmark suite run with the TR-BDF2 scheme.
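
For what it's worth, here is a minimal sketch that prices the same American option under all three schemes side by side (assuming pricer is an OptionPricer instance as in the first post; the grid sizes M and N are illustrative):

    M, N = 200, 400  # illustrative time/space grid sizes
    engines = {
        "CN + Rannacher": ql.FdBlackScholesVanillaEngine(pricer.bsm_process, M, N, int(M * 0.2)),
        "implicit Euler": ql.FdBlackScholesVanillaEngine(pricer.bsm_process, M, N, 0, ql.FdmSchemeDesc.ImplicitEuler()),
        "TR-BDF2": ql.FdBlackScholesVanillaEngine(pricer.bsm_process, M, N, 0, ql.FdmSchemeDesc.TrBDF2()),
    }
    option = ql.VanillaOption(pricer.payoff, pricer.am_exercise)
    for name, engine in engines.items():
        option.setPricingEngine(engine)
        print(name, option.NPV(), option.gamma())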
feribg commented 4 years ago

@klausspanderen I've updated the code here to match your suggestions. Here's a more detailed notebook on Colab for the analysis; feel free to update and pivot the data:

https://colab.research.google.com/drive/1ozqesYlsf_VWD8aMO9a3xOnIsGPupF4T

Did you have any feedback on getting the FD matrices out of QL rather than the solution point estimates (ideally in Python), or on using fractional dates?

klausspanderen commented 4 years ago

@feribg At the moment only C++ users can access the internals. The internals will be exposed to Python users with the next release, 1.18 (you can already build it from Luigi's branch). W.r.t. fractional dates, you'll have to compile QuantLib with --enable-intraday, and build the SWIG layer yourself, to get fractional dates.

Back to your data: thanks for providing your results. I've looked into trbdf_neg_gamma.csv. IMO the majority of the negative values come from scenarios with very small dte (< 5). Here the new scaling method

        N = min(1000, int(1000 * T))
        M = min(200, int(200 * T))

leads to grids with only a handful of points in the x and t directions, which is too few. I'd suggest having at least 200 points in the x direction and 20 in the t direction, e.g.

        N = max(int(2000 * T), 200)
        M = max(int(400 * T), 20)

The remaining ~350 scenarios with gamma < -1e-4 are interesting. I've looked into a few examples, and the spot values were already very close to the exercise boundary for small times; hence the value on the grid has a kink in the neighbourhood of the spot for small t, which is a problem for the spline interpolation. You can either go to larger grids to overcome this problem or use bump-and-reval in this case; a typical bump size is 1% of the spot value:

    def numeric_gamma(self, option, price, quote, h=0.01):
        # central second difference w.r.t. spot; price is the unbumped NPV
        spot = quote.value()
        quote.setValue(spot - h*spot)
        P_minus = option.NPV()
        quote.setValue(spot + h*spot)
        P_plus = option.NPV()
        quote.setValue(spot)
        return (P_plus + P_minus - 2*price) / (h*h*spot*spot)
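
Assuming numeric_gamma is added as a method of the OptionPricer class from the first post, a quick cross-check against the grid gamma could look like this (sketch):

    price = pricer.am_option.NPV()
    print("grid gamma:          ", pricer.am_option.gamma())
    print("bump-and-reval gamma:", pricer.numeric_gamma(pricer.am_option, price, pricer.spot_quote))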
feribg commented 4 years ago

Got it, thanks @klausspanderen. I've updated the Colab with the new, finer grid:

        N = max(600, int(1000 * T))
        M = max(200, int(200 * T))

but the results are inconclusive; I can't see one scheme being better than another. I've also included the non-negative samples. Given that CN seems the fastest and the results seem pretty decent, I assume the default is actually the best.

lballabio commented 4 years ago

It would be great if one of you (or both) wrote this up as documentation for the FdBlackScholesVanillaEngine class.

feribg commented 4 years ago

@lballabio This one? https://github.com/lballabio/QuantLib/blob/f918e1a34a542d528a55844610950413fb645983/ql/pricingengines/vanilla/fdblackscholesvanillaengine.hpp

I believe this one mostly just sets parameters; the documentation probably belongs with the different schemes and solvers? I can document the implicit one, since that's what I'm trying to replicate and speed up for American options.

lballabio commented 4 years ago

Yes, that would be the one. But you're right, it can be in some generic documentation for the framework. We can move it around later.

feribg commented 4 years ago

@lballabio Sure, I will write something and submit a PR once we iron out some details. I've updated the code in the first post to include the BS price for the equivalent European option. Is the theta output by the engine (when available) an absolute value? I don't think that's a good idea in general; for example, if we use an engine to price a single option vs. a basket it would be very confusing. I think it's better to just return the correct true value without any transformations?

div = 0.089432
r = 0.096223
sigma = 1.151749
K = 5904.483856
put_call = 1.0
T = 711.0
S = 2200.588359
pricer.price_and_greeks(div, r, sigma, S, K, put_call, T)
FD NPV:   4252.408934951205
FD delta: -0.40313953851438644
FD gamma: 0.00016221961096990784
FD theta: -105.83075529002323
FD vega:  11.214407480338195
BS NPV:   3630.6282605333213
BS delta: -0.24721236611460154
BS gamma: 6.485864122680303e-05
BS theta: 144.72424014909117
BS vega:  10.267629837079248

@klausspanderen Why would an equivalent American option ever cost less than its European counterpart? Here's a CSV run through the pricer above that outputs substantially cheaper American options than the same-parameter European ones (the american premium column being negative). Is this an artefact of the FD scheme, or is there another explanation for it?

test.csv.zip

klausspanderen commented 4 years ago

More often than one might think, the value of an American option is equal to that of the European option. In your comparison you have measured the difference between the analytic Black-Scholes formula and a numerical finite difference scheme. The "negative American premium" is the small inaccuracy (avg ~3e-4; have you used the implicit scheme?) of the numerical method due to finite grids. The true American premium for these options is zero. Use larger grids to reduce the figures even more, or remove from your example set all options which won't take advantage of early exercise.
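
To illustrate with a standalone sketch (parameters and grid sizes are illustrative): a call on a non-dividend-paying stock is never optimally exercised early, so the FD American price should match the analytic European price up to grid error.

    import QuantLib as ql

    today = ql.Date(10, 9, 2019)
    ql.Settings.instance().evaluationDate = today
    dc = ql.Actual365Fixed()
    spot = ql.QuoteHandle(ql.SimpleQuote(100.0))
    r = ql.YieldTermStructureHandle(ql.FlatForward(today, 0.05, dc))
    q = ql.YieldTermStructureHandle(ql.FlatForward(today, 0.0, dc))  # no dividends
    vol = ql.BlackVolTermStructureHandle(ql.BlackConstantVol(today, ql.NullCalendar(), 0.20, dc))
    process = ql.BlackScholesMertonProcess(spot, q, r, vol)

    maturity = today + ql.Period(1, ql.Years)
    payoff = ql.PlainVanillaPayoff(ql.Option.Call, 100.0)
    american = ql.VanillaOption(payoff, ql.AmericanExercise(today, maturity))
    european = ql.VanillaOption(payoff, ql.EuropeanExercise(maturity))

    american.setPricingEngine(ql.FdBlackScholesVanillaEngine(process, 400, 400, 40))
    european.setPricingEngine(ql.AnalyticEuropeanEngine(process))
    print(american.NPV() - european.NPV())  # ~0, up to discretization error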

feribg commented 4 years ago

Yep, using the implicit scheme with grid sizes > 1000. I was digging deeper into how the FD code works and I had a few questions; hoping you could shed some light, as there are zero comments. @klausspanderen This is all for American vanilla options.

  1. Seems like for FdBlackScholesVanillaEngine there's something like Kim's method being done, with the boundary condition being estimated by an integral solver. I see a class called FDAvgCellValue that uses Simpson's method. It seems to borrow this idea https://demonstrations.wolfram.com/KimsMethodWithNonuniformTimeGridForPricingAmericanOptions/ and then runs the BiCGSTAB solver on top. Why do we do the boundary estimate to begin with, when even the author says the cost of solving the integral likely outweighs all the benefits? Do you know any good reason for doing this vs. something more traditional like a Gauss-Seidel solver?

  2. Is there any reason interpolation is used for American options when the grid sizes need to be very fine anyway? Does it not just slow you down for no real gain?

  3. In the Heston case the same condition matrix is calculated with this avg-cell-value method, but it's basically empty: it only contains one value. That kind of makes sense, as I'm not sure how you would estimate the boundary for Heston (so in this case is this condition estimator just sitting there for consistency purposes?).

Just trying to wrap my head around it before writing some docs, as it's not using the typical tridiagonal approach for the BS PDE as far as I can tell.

Thanks!

klausspanderen commented 4 years ago

Hi

  1. I assume you mean FdmCellAveragingInnerValue. This class is used for the so-called "cell averaging" technique to ensure smooth second-order convergence of the spatial error with increasing grid sizes; see e.g. Karel in 't Hout's very good book on PDEs.
  2. Interpolation is only used at the end, to get the NPV and greeks. It doesn't take long compared to the rest, and it is used especially for delta and gamma.
  3. Cell averaging is a technique to smooth the final payoff and works the same for BS and Heston.
feribg commented 4 years ago

Thanks for the info:

  1. Can you touch briefly on why this averaging is run on every single step and seems to solve an integral? This sounds like a big overkill, unless I'm missing something.

Can you maybe point me in the right direction, or correct my understanding below as to what happens in which files:

  1. We generate the grid as usual.
  2. Convert to log space.
  3. A condition_ variable is populated by this cell-averaging function, which solves some integral (not sure what goes into it and why).
  4. The BiCGSTAB solver is called on the system (we don't seem to use sparse matrices here), but more importantly I can't see where in this iterative solver the boundary condition is factored in; it seems like in step 3 a portion of the matrix is cut off and the solver only works on the rest (which might explain why it's much faster than my naive implementation).

I went through the code with the debugger but it's still very confusing because of all the abstractions.

klausspanderen commented 4 years ago

w.r.t. 1: Cell averaging means calculating the average of the final payoff over every grid cell at the first backward step. The code is generic and should work for all payoffs, hence the average is calculated via numerical integration. But this is done only once, for the very first backward step, and does not take long. Cell averaging of the initial payoff is very effective in reducing numerical glitches: definitely worth the effort! Sure, you could make the implementation payoff-dependent (e.g. call/put) and get rid of the numerical integration, replacing it with the analytical solution.
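
To make the idea concrete, here is a standalone sketch of cell averaging for a put payoff on a log-spot grid (this is not the QuantLib implementation; the grid and integration choices are illustrative):

    import numpy as np
    from scipy.integrate import quad

    def cell_averaged_payoff(x, K):
        # x: log-spot grid nodes; returns the payoff averaged over each cell,
        # with cell boundaries halfway between neighbouring nodes
        payoff = lambda xi: max(K - np.exp(xi), 0.0)
        mid = 0.5 * (x[1:] + x[:-1])
        lo = np.concatenate(([x[0]], mid))
        hi = np.concatenate((mid, [x[-1]]))
        return np.array([quad(payoff, a, b)[0] / (b - a) for a, b in zip(lo, hi)])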

w.r.t. 4: The BiCGSTAB algorithm is only used for higher-dimensional problems, e.g. the implicit scheme for the Heston model, because there the matrix is no longer tridiagonal. In the one-dimensional BS case the matrix is tridiagonal and the usual Thomas algorithm is used to solve the linear system. The cell averaging does not come into play here any more (it is only used to map the option payoff at maturity onto the grid). Boundary conditions are implied in the definition of the matrix, here the so-called free boundary condition.

feribg commented 4 years ago

w.r.t. 1: Thanks a lot, that clarifies it perfectly.

w.r.t. 4: I'm not clear on how you can apply Thomas to a path-dependent option like an American; I only see it applicable for Europeans. But when I step through the code for the BS American pricer it hits the BiCGSTAB solver, which is why I'm wondering how the penalty is defined: the only reference I see for applying a shortcut solver like BiCGSTAB to American options is by introducing a penalty term (ref. p. 13, https://cs.uwaterloo.ca/research/tr/2006/CS-2006-09.pdf). So I was wondering where that term is introduced in the QL code; I couldn't find it, and I can't see another way of solving this problem: it either has to have the penalty, or it has to have the analytical boundary from Kim's method. Mind you, my understanding of PDEs is limited.

Thanks for all the feedback.

klausspanderen commented 4 years ago

The BS solver does not use the BiCGSTAB algorithm. The BiCGSTAB algorithm is only used for higher-dimensional problems within the implicit scheme, e.g. the Heston model. Independent of the solver (Thomas or BiCGSTAB), American options are priced using "dynamic programming" together with the PDE to solve the early exercise problem. The dynamic programming step is encoded in FdmAmericanStepCondition.
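
In other words, one backward roll of the American pricer looks roughly like this (a sketch; solve_implicit_step is a stand-in for the tridiagonal Thomas solve of one time step):

    import numpy as np

    def roll_back_american(V_maturity, intrinsic, solve_implicit_step, n_steps):
        V = V_maturity.copy()               # (cell-averaged) payoff at maturity
        for _ in range(n_steps):
            V = solve_implicit_step(V)      # continuation value from the PDE step
            V = np.maximum(V, intrinsic)    # FdmAmericanStepCondition: allow early exercise
        return V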

feribg commented 4 years ago

@klausspanderen Thanks a lot, it's all clear now. This explains why you suggested a relatively large number of time steps for American options: under this scheme the errors only converge to the iterative solution for fine grids.

nickk commented 3 years ago

My head exploded reading this thread (too much good knowledge)!