tukl-msd / finance.benchmark

executable benchmark for evaluating option pricing systems

Underlying and Option Parameter Data Sources #1

Gordonei opened this issue 9 years ago

Gordonei commented 9 years ago

Text I've removed from the specification document:

Underlyings

Ranges of the walk parameters should reflect typical underlyings observed in the market. A good approach would be to pick a large set of actual time-series at different points in time (say 100 underlyings over 10 year-long periods to give 1,000 underlyings, or 3,000 for a full fit with all three models), then do a maximum entropy fit. If this doesn't result in any "hard" parameters, then we'd want to revisit that.
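
As a minimal sketch of what that fit could look like, assuming geometric Brownian motion walk parameters and illustrative helper names (with only mean and covariance constrained, the maximum-entropy distribution is the Gaussian with those moments):

import numpy as np

def estimate_walk_parameters(prices, dt=1.0 / 252):
    # Estimate annualised (mu, sigma) from one daily price series.
    log_returns = np.diff(np.log(prices))
    sigma = log_returns.std(ddof=1) / np.sqrt(dt)
    mu = log_returns.mean() / dt + 0.5 * sigma ** 2
    return mu, sigma

def max_entropy_fit(historical_series):
    # 'historical_series' would be the ~1,000 year-long price series.
    params = np.array([estimate_walk_parameters(p) for p in historical_series])
    # With only mean and covariance constrained, the maximum-entropy
    # distribution is the multivariate Gaussian with those moments.
    return params.mean(axis=0), np.cov(params, rowvar=False)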

Options

We plan to extend the benchmark to more products later, e.g. American / Bermudan options, and to products and models with relationships: products that depend on other products, and models that are correlated (e.g. basket options, credit and forex swaps).

The parameters are again inspired by real-world option parameters, though in this case it is less clear how to choose them. There is a reasonable argument for choosing them equally spaced in some sense, rather than driven by traded volume or similar, since we don't know what is important to different people.

It is crucial that the provided model parameters reflect realistic scenarios: both day-to-day situations in different markets (e.g. a liquid stock exchange, FOREX, …), and critical corner cases that have proven hard to handle in the past. The data should therefore be vetted carefully by one or more of the business partners (see below).

We need to ensure that prices end up in a "meaningful" range, so that we don't get prices that could be rounded to zero or that cause problems with relative error. I don't know how to define that right now though :)
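
One way to pin that down, purely as a sketch with placeholder thresholds and an invented helper name, might be an acceptance test like:

def price_is_meaningful(price, abs_floor=1e-4, spot=100.0, rel_floor=1e-6):
    # abs_floor guards against prices that a solver could round to zero;
    # rel_floor guards against prices that are negligible relative to the
    # spot level, where relative-error comparisons become unstable.
    # Both thresholds here are placeholders, not agreed values.
    return price > abs_floor and price / spot > rel_floor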

My two cents: we could provide ranges of values for parameters, so that researchers can generate their own problems that are still "sensible". To simulate market pricing conditions, researchers could then also generate products on the fly from the ranges and characterise how their systems cope with such problems. We could at least use this approach to generate the parameters we provide as part of the benchmark.
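
A minimal sketch of that idea, with an invented range dictionary and field names standing in for the published ranges:

import random

# Illustrative parameter ranges; a real benchmark would publish these.
parameter_ranges = {
    "strike": (50.0, 150.0),
    "maturity": (0.25, 5.0),   # years
    "volatility": (0.05, 0.8),
    "interest_rate": (0.0, 0.1),
}

def generate_problem(rng):
    # Draw each parameter uniformly from its range.
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in parameter_ranges.items()}

rng = random.Random(1234)  # fixed seed so the problem set is reproducible
problems = [generate_problem(rng) for _ in range(1000)]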

Gordonei commented 9 years ago

An approach that I've been using for some other work is to create options and underlyings with uniform random parameters that are bounded by the values used in the 1st benchmark.

I then use something like the procedure below to ensure that the values of the options are non-zero:

# Number of options to generate per (underlying model, option type) pair.
option_underlying_dictionary = {"black_scholes": {"barrier": 833, "double barrier": 833}}  # etc.

# generate_option and solver are the externally defined generator and
# pricing routines; the seed is advanced on every attempt so rejected
# draws are never repeated.
seed = 1234
valid_options = []
for underlying_type in option_underlying_dictionary:
    for option_type in option_underlying_dictionary[underlying_type]:
        for i in range(option_underlying_dictionary[underlying_type][option_type]):
            value = 0
            while value == 0:
                option = generate_option(seed, option_type, underlying_type)
                value = solver(option)
                seed += 1
            valid_options.append(option)

A more rigorous approach would maybe be to ensure a distribution of values with respect to the defined tolerance bands?
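
A rough sketch of that stratified idea, with illustrative bands and quotas; 'generate_and_price' stands in for a closure over the seeded generator and solver above:

value_bands = [(0.01, 1.0), (1.0, 10.0), (10.0, 100.0)]  # placeholder bands
quota_per_band = 100

def band_of(value):
    for i, (lo, hi) in enumerate(value_bands):
        if lo <= value < hi:
            return i
    return None  # outside every band: reject

def stratified_options(generate_and_price):
    # Keep drawing random options, binning accepted ones by value band,
    # until every band has met its quota.
    counts = [0] * len(value_bands)
    accepted = []
    while min(counts) < quota_per_band:
        option, value = generate_and_price()
        i = band_of(value)
        if i is not None and counts[i] < quota_per_band:
            counts[i] += 1
            accepted.append(option)
    return accepted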

Something that I'm not addressing here is reusing options and underlyings - we want 10k pairings, but only 3k options and 1k underlyings.
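
One possible (entirely illustrative) way to handle that reuse would be to draw the 10k pairings uniformly from the smaller pools, with a fixed seed for reproducibility; a real version would also restrict each option to compatible underlying models:

import random

def make_pairings(options, underlyings, n_pairings=10_000, seed=1234):
    # Sample with replacement so 3k options and 1k underlyings can
    # cover 10k pairings; the seed makes the pairing reproducible.
    rng = random.Random(seed)
    return [(rng.choice(options), rng.choice(underlyings)) for _ in range(n_pairings)]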

cdeschryver commented 9 years ago

Hi Gordon, thanks for the example. We have just set up a distribution-based approach for work we are carrying out in the calibration domain. It generates Heston model parameters from distributions that we have tuned to markets. I will put an example here soon.
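
As a placeholder sketch of the shape such a distribution-based generator might take (all distributions and bounds below are invented, not the tuned market values mentioned above):

import random

def draw_heston_parameters(rng):
    # Rejection-sample Heston parameters from placeholder uniform ranges.
    while True:
        kappa = rng.uniform(0.5, 5.0)    # mean-reversion speed
        theta = rng.uniform(0.01, 0.2)   # long-run variance
        xi = rng.uniform(0.1, 1.0)       # vol-of-vol
        rho = rng.uniform(-0.9, 0.0)     # spot/vol correlation
        v0 = rng.uniform(0.01, 0.2)      # initial variance
        # Keep only parameter sets satisfying the Feller condition
        # (2 * kappa * theta > xi^2), so the variance process stays
        # strictly positive.
        if 2.0 * kappa * theta > xi ** 2:
            return {"kappa": kappa, "theta": theta, "xi": xi, "rho": rho, "v0": v0}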