Netflix-Skunkworks / riskquant

Apache License 2.0

Add subcategories to SimpleLoss #4

Open blackfist opened 4 years ago

blackfist commented 4 years ago

I would be interested in adding some functionality to the SimpleLoss class, and I'm opening this issue to discuss the proposal and how best to approach it. What I would like is the ability to add different categories of loss to an instance of SimpleLoss. It might look something like this:

from riskquant import simpleloss
s = simpleloss.SimpleLoss("ALICE", "Alice steals the data", 0.10) # notice that no loss range provided
s.add_loss("Response", 75000, 150000)
s.add_loss("Replacement", 1000, 150000)
s.add_loss("Reputation", 100, 200)
s.add_loss("Competitive Advantage", 100, 200)
s.add_loss("Fines & Judgement", 50000, 2000000)
s.add_loss("Productivity", 100, 20000)

s.annualized_loss() # returns the sum of the mean for each loss multiplied by avg frequency
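The annualized figure described in that last comment can be sketched as follows. This is a minimal illustration, not riskquant's implementation: it assumes each (low, high) pair is the 5th and 95th percentile of a lognormal distribution, and the helper names `lognormal_mean` and `annualized_loss` are hypothetical.

```python
import math
from statistics import NormalDist

# z-score of the 95th percentile of a standard normal (about 1.645).
Z95 = NormalDist().inv_cdf(0.95)


def lognormal_mean(low, high):
    """Mean of a lognormal whose 5th/95th percentiles are low/high (assumption)."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z95)
    return math.exp(mu + sigma ** 2 / 2)


def annualized_loss(frequency, loss_ranges):
    """Average frequency times the sum of the mean of each loss category."""
    return frequency * sum(lognormal_mean(lo, hi) for lo, hi in loss_ranges.values())


# Two of the categories from the example above.
losses = {"Response": (75000, 150000), "Replacement": (1000, 150000)}
total = annualized_loss(0.10, losses)
```

Because the lognormal is right-skewed, each category mean sits above the geometric midpoint of its range, so wide ranges such as "Replacement" contribute more than their midpoint suggests.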

I think a decent approach to this would be to abstract out the loss distribution into its own class. This would give people the ability to specify that some kinds of loss should be modeled with a lognormal distribution and others with PERT.

from riskquant import simpleloss

s = simpleloss.SimpleLoss("ALICE", "Alice steals the data", 0.10)
s.add_loss("hard costs", 100000, 1000000) # defaults to lognormal loss
s.add_loss(PertLoss("soft costs", 1000, 5000, 150000, kurtosis=4))

snkilmartin commented 4 years ago

I think moving the loss distribution to a separate class is a good idea. If you get time, feel free to submit a PR. Otherwise, I'll add this to my backlog.

mdeshon commented 4 years ago

I like it, but could this maybe be a new class CompoundLoss that leverages the SimpleLoss class? I would like SimpleLoss to remain, well, simple, so that newcomers to the field will be able to have an entry point to quantifying their risks.

mdeshon commented 4 years ago

It seems like there could also be a better decomposition as you suggest where SimpleLoss contains a single loss distribution (by default a fixed frequency and lognormal magnitude) and CompoundLoss can contain multiple (potentially different) loss distributions.
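One way this decomposition could look is sketched below. Everything here is hypothetical and only illustrates the shape of the idea: the `Magnitude` protocol, `LognormalMagnitude`, `PertMagnitude`, and the constructor signatures are assumptions, not the existing riskquant API. The PERT mean uses the standard modified-PERT formula with shape parameter `lam` (conventionally 4).

```python
import math
from dataclasses import dataclass, field
from typing import List, Protocol


class Magnitude(Protocol):
    """Anything that can report its expected loss (hypothetical interface)."""
    def mean(self) -> float: ...


@dataclass
class LognormalMagnitude:
    mu: float     # mean of log(loss)
    sigma: float  # standard deviation of log(loss)

    def mean(self) -> float:
        return math.exp(self.mu + self.sigma ** 2 / 2)


@dataclass
class PertMagnitude:
    low: float
    mode: float
    high: float
    lam: float = 4.0  # shape parameter of the modified PERT

    def mean(self) -> float:
        return (self.low + self.lam * self.mode + self.high) / (self.lam + 2)


@dataclass
class SimpleLoss:
    """One fixed frequency and exactly one magnitude distribution."""
    label: str
    name: str
    frequency: float
    magnitude: Magnitude

    def annualized_loss(self) -> float:
        return self.frequency * self.magnitude.mean()


@dataclass
class CompoundLoss:
    """One fixed frequency and any number of magnitude distributions."""
    label: str
    name: str
    frequency: float
    magnitudes: List[Magnitude] = field(default_factory=list)

    def add_loss(self, magnitude: Magnitude) -> None:
        self.magnitudes.append(magnitude)

    def annualized_loss(self) -> float:
        return self.frequency * sum(m.mean() for m in self.magnitudes)
```

With this split, SimpleLoss keeps a single-distribution constructor for newcomers, while CompoundLoss reuses the same magnitude objects and only adds the summation.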

philipdeboer commented 4 years ago

I tend to agree; we could have a SumLoss class. Have you made any progress on this?

What about generalizing the frequency distribution: is there also interest in modelling that side differently?

mdeshon commented 4 years ago

I did actually implement an abstract Loss() class that can contain any desired frequency distribution and magnitude distribution. Thinking about your CompoundLoss() suggestion, it seems like what you're asking for is the ability to add additional magnitudes given that the loss has occurred in a particular simulated year.

However (at least in the FAIR framework) isn't there also a frequency attached to each of the secondary risks? Or would it more appropriately be modeled as a probability that is conditional on the main loss frequency? In other words, in a particular simulated year, if the main loss occurs once, then for each secondary risk we would draw a random value, and if it is <= the attached probability, the secondary loss occurs and we draw from the magnitude.

The two other alternatives are: use a frequency value (in which case a secondary risk could occur more than once for each main loss event), or use neither a probability nor a frequency and just draw from the magnitude for each secondary risk.
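The conditional-probability option can be sketched as a one-year simulation. This is a rough illustration under stated assumptions: the function name `simulate_year` is hypothetical, the primary event is treated as occurring at most once per year with probability equal to its frequency, and the lognormal parameters in the usage lines are made up.

```python
import random


def simulate_year(rng, primary_freq, draw_primary, secondaries):
    """Total loss for one simulated year.

    secondaries is a list of (probability, draw_magnitude) pairs, where the
    probability is conditional on the primary loss having occurred.
    """
    total = 0.0
    # Assumption: at most one primary event per year (Bernoulli draw).
    if rng.random() < primary_freq:
        total += draw_primary(rng)
        for prob, draw_magnitude in secondaries:
            # Draw a random value; the secondary loss occurs if it is <= prob.
            if rng.random() <= prob:
                total += draw_magnitude(rng)
    return total


rng = random.Random(42)
secondaries = [
    (0.3, lambda r: r.lognormvariate(10.0, 1.0)),  # occurs in ~30% of incidents
    (1.0, lambda r: r.lognormvariate(8.0, 0.5)),   # occurs in every incident
]
years = [
    simulate_year(rng, 0.10, lambda r: r.lognormvariate(11.0, 1.0), secondaries)
    for _ in range(10_000)
]
mean_annual_loss = sum(years) / len(years)
```

Setting a secondary probability to 1.0 reproduces the "just draw from the magnitude" alternative, so that case needs no separate code path. The frequency-based alternative would replace the inner probability check with a count of secondary events per primary event.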

Thoughts?

philipdeboer commented 4 years ago

Sometimes the Primary losses can be decomposed into several categories, such as technical investigation costs, customer notification costs (if we're talking data breach), and even litigation costs. If in your model all of these costs are guaranteed to occur in some amount for each incident, we can call them Primary costs. You may still want different distributions for each, if only for communication purposes. You'll estimate separate upper and lower bounds for each category.

A CorrelatedSumLoss class could combine these various Primary loss types with 100% correlation, and an UncorrelatedSumLoss class could combine them with no correlation.
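The difference between the two combinations can be sketched with lognormal categories. This is an illustration only: the function names are hypothetical, and "100% correlation" is interpreted here as every category sharing the same underlying standard-normal draw, which is one possible reading.

```python
import math
import random


def correlated_sum(rng, params):
    """Sum of lognormal losses that all share one standard-normal draw."""
    z = rng.gauss(0.0, 1.0)
    return sum(math.exp(mu + sigma * z) for mu, sigma in params)


def uncorrelated_sum(rng, params):
    """Sum of lognormal losses, each with its own independent draw."""
    return sum(math.exp(mu + sigma * rng.gauss(0.0, 1.0)) for mu, sigma in params)


# (mu, sigma) of log-loss for each Primary category; illustrative values.
categories = [(11.0, 0.5), (9.0, 0.8), (10.0, 1.0)]
rng = random.Random(7)
one_correlated_incident = correlated_sum(rng, categories)
one_uncorrelated_incident = uncorrelated_sum(rng, categories)
```

Both versions have the same expected total per incident; the correlated sum has a wider spread, because a bad incident is bad in every category at once. That matters for tail estimates more than for the annualized mean.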