Need to smooth the 1st derivative in order to calculate 2nd derivative

tyrneh commented 1 year ago

The numerical first derivative of the implied call prices is noisy. Numerically differentiating that raw result again would lead to a noisy 2nd derivative.

Option 1: Research methods for numerical differentiation of noisy data: https://www.researchgate.net/publication/255907171_Numerical_Differentiation_of_Noisy_Nonsmooth_Data

Python implementation here: https://oliver-k-ernst.medium.com/how-to-differentiate-noisy-signals-2baf71b8bb65

~~Option 2:~~ ~~Smooth the 1st derivative using a machine learning model~~ ~~https://stackoverflow.com/questions/15862066/gradient-in-noisy-data-python~~ ~~- this seems to be difficult. I think we'll need a natural cubic spline, because a normal cubic spline produces poor edge behavior that results in negative implied PDF after differentiating again~~

Update on Option 2:

Smoothing the data before differentiating is not a good solution: "Denoising the data before or after differentiating does not generally give satisfactory results" (p. 1 of the paper above).

tyrneh commented 1 year ago

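We'll use Total Variation Regularization (TVR) to differentiate noisy data.

In TVR, instead of differencing the data directly, we frame the derivative as an optimization problem: we look for a derivative that reproduces the data when integrated, and we add a penalty term that penalizes irregularity in the resulting derivative.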
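To make the optimization framing concrete, here is a minimal sketch of TVR differentiation. It is not the implementation from the linked post. It assumes a uniform grid and uses a simple lagged-diffusivity style fixed-point iteration; the names `tvr_derivative`, `dx`, `n_iter` and `eps` are made up for illustration, and the scale of `alpha` here need not match the scale in any other implementation.

```python
import numpy as np

def tvr_derivative(f, dx, alpha, n_iter=50, eps=1e-6):
    """Estimate u ~ f' on the n-1 interval midpoints by minimising
    0.5 * ||A u - (f - f[0])||^2 + alpha * sum(sqrt((D u)^2 + eps)).
    Assumes uniformly spaced samples with spacing dx (an assumption of this sketch).
    """
    f = np.asarray(f, dtype=float)
    n = f.size
    # A integrates u with the rectangle rule: (A u)[i] = dx * sum(u[:i]).
    A = dx * np.tril(np.ones((n, n - 1)), k=-1)
    # D takes forward differences of u: (D u)[k] = u[k+1] - u[k].
    D = np.diff(np.eye(n - 1), axis=0)
    g = f - f[0]
    u = np.diff(f) / dx  # start from the naive finite-difference derivative
    AtA, Atg = A.T @ A, A.T @ g
    for _ in range(n_iter):
        # Freeze the nonlinear TV weights at the current iterate, then solve
        # the resulting linear system for the next iterate.
        w = 1.0 / np.sqrt((D @ u) ** 2 + eps)
        u = np.linalg.solve(AtA + alpha * (D.T @ (w[:, None] * D)), Atg)
    return u
```

With `alpha=0` this reduces to the plain finite-difference derivative; increasing `alpha` trades fidelity to the data for a less jagged result.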

tyrneh commented 1 year ago

So the strategy is:

1. Perform numerical differentiation on the raw data to get the 1st derivative, as usual. This will result in a noisy 1st derivative.
2. Use TVR to differentiate the noisy 1st derivative, to hopefully get a well-behaved 2nd derivative.
3. We will probably want to fit another model onto the 2nd derivative, for example a natural cubic spline under some constraints. This will ensure that the pdf behaves 'correctly' (e.g. total area = 1, pdf >= 0 over the whole domain).
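A rough sketch of the first and last steps, to show what the constraints mean in practice. Note that the last function is a much cruder stand-in for a constrained spline fit: it just clips negative values and rescales. The function names are hypothetical.

```python
import numpy as np

def first_derivative(strikes, call_prices):
    """Step 1: plain finite differences (np.gradient accepts uneven spacing)."""
    x = np.asarray(strikes, dtype=float)
    y = np.asarray(call_prices, dtype=float)
    return np.gradient(y, x)

def enforce_pdf_constraints(strikes, density):
    """Crude stand-in for step 3: clip negatives to zero, then rescale so
    the trapezoid-rule area under the curve equals 1."""
    x = np.asarray(strikes, dtype=float)
    p = np.clip(np.asarray(density, dtype=float), 0.0, None)
    area = np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(x))
    if area <= 0:
        raise ValueError("density has no positive mass on this grid")
    return p / area
```

Clipping and rescaling guarantees pdf >= 0 and total area = 1 on the grid, but unlike a constrained spline it does nothing for smoothness.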

Note that the result of TVR differentiation is a function of the alpha parameter, which controls the smoothness. The data for different assets will surely have different amounts of noise in their 1st derivative, so the alpha that works for one asset may not be applicable to another. However, in practice, it visually seems like increasing alpha doesn't have any downsides (see pics). So perhaps we can just bump alpha up to a ridiculously high number.

Actually, alpha=100 is definitely too much, as it noticeably shifted the pdf downwards. But alpha between 0 and 10 doesn't seem to affect the pdf; it only smooths out the jaggedness.

tyrneh commented 1 year ago

Last point on alpha:

We can let alpha be a user input in our final function, so the user can observe the (lack of) smoothness in their pdf and choose to increase alpha.

tyrneh commented 6 months ago

Ignore everything above. We use a spline to smooth the derivatives, as options data contains noise. We use the method outlined in this paper: https://edoc.hu-berlin.de/bitstream/handle/18452/14708/zeng.pdf?sequence=1&isAllowed=y
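For illustration only, here is a generic sketch of the spline idea using SciPy's `UnivariateSpline`. This is not necessarily the procedure from the paper or what the repo does; the function name and the `smoothing` and `degree` parameters are assumptions of this example.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

def spline_second_derivative(strikes, call_prices, smoothing=None, degree=3):
    """Fit a smoothing spline to the prices and differentiate it twice.
    `strikes` must be increasing. `smoothing` is passed to SciPy as `s`:
    0 interpolates the data exactly, larger values smooth more, and None
    uses SciPy's default, which may need tuning for real data.
    """
    x = np.asarray(strikes, dtype=float)
    y = np.asarray(call_prices, dtype=float)
    spline = UnivariateSpline(x, y, k=degree, s=smoothing)
    # derivative(n=2) returns a new spline representing the 2nd derivative.
    return spline.derivative(n=2)(x)
```

Because the derivative is taken analytically from the fitted spline, the noise is handled once, in the fit, rather than being amplified by repeated finite differencing.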

jmholzer / probabilistic

Need to smooth the 1st derivative in order to calculate 2nd derivative #6