OpenSourceRisk / Engine


Question on covariance input #246

Closed rkapl123 closed 2 months ago

rkapl123 commented 2 months ago

When calculating a variance-covariance matrix, I usually computed the log returns of daily asset prices, ln(Xn/Xn-1), where Xn is the price of the asset on day n and Xn-1 the price on day n-1. With interest rates, however, I just took the rates directly.

After reading the user guide on the covariance.csv input that is used for VaR, I'm no longer sure how to compute this: "Covariances need to be consistent with the sensitivity data provided. For example, if sensitivity to factor1 is computed by absolute shifts and expressed in basis points, then the covariances with factor1 need to be based on absolute basis point shifts of factor1; if sensitivity is due to a relative factor1 shift of 1%, then covariances with factor1 need to be based on relative shifts expressed in percentages too, etc."

I'd like to follow the example and calculate the interest rate sensitivity based on absolute shifts in basis points, like this:

    <ShiftType>Absolute</ShiftType>
    <ShiftSize>0.0001</ShiftSize>
...

Do I have to multiply the interest rates (given in units) by 10000 and take their daily differences?

and the FX rates and volatilities using relative shifts in percent, like this:

    <FxSpot ccypair="USDEUR">
        <ShiftType>Relative</ShiftType>
        <ShiftSize>0.01</ShiftSize>
    </FxSpot>
...

Do I have to compute the relative return on these, (Xn/Xn-1) - 1?

-regards, Roland

pcaspers commented 2 months ago

Yes for the interest rates. And for FX Spots you have to multiply the return by 100 because the shift size is 0.01.

rkapl123 commented 2 months ago

Great, many thanks for the quick help! So, to summarize:

1) For absolute shifts: take the absolute difference of the observations and multiply by the inverse of the shift size (distributive property of multiplication over subtraction).
2) For relative shifts: take the relative return of the observations and multiply by the inverse of the shift size.
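
As a minimal sketch of this recipe (made-up numbers; scaled_returns is a hypothetical helper, not part of ORE):

    import numpy as np

    def scaled_returns(prices, shift_type, shift_size):
        # return series consistent with the sensitivity shift type,
        # scaled by the inverse of the shift size
        x = np.asarray(prices, dtype=float)
        if shift_type == "Absolute":
            return (x[1:] - x[:-1]) / shift_size        # e.g. differences in basis points
        if shift_type == "Relative":
            return (x[1:] / x[:-1] - 1.0) / shift_size  # e.g. returns in percent
        raise ValueError("unknown shift type: " + shift_type)

    # interest rates with absolute 1bp shifts, FX spot with relative 1% shifts
    ir = scaled_returns([0.0310, 0.0315, 0.0308], "Absolute", 0.0001)
    fx = scaled_returns([1.080, 1.090, 1.075], "Relative", 0.01)
    cov = np.cov(np.vstack([ir, fx]))  # covariance of mixed absolute/relative returns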

I'll post a Python script that calculates this to https://github.com/OpenSourceRisk/Tools when I'm done (and if that's OK with you).

-regards, Roland

pcaspers commented 2 months ago

Sounds good, thanks Roland!

rkapl123 commented 2 months ago

Hi Peter! I've completed the script now and cross-checked the results against a separate variance calculation. However, I'm still unsure about combining the absolute and relative methods (and differing shift sizes) when calculating covariances (between interest rates and FX, for example). Is this mix of measures valid in the covariance?

Furthermore, I was unsure about the level of the variances/covariances: my (daily) variances range from ~30-50 for interest rates down to ~0-0.35 for FX, whereas in Example 17 there are levels of 2500 for interest rates and 100 for FX. Maybe this is because "the holding period is incorporated into the input covariances", as the user guide states...

pcaspers commented 2 months ago

Hi Roland,

I think mixing relative and absolute returns for the purpose of covariance calculations is fine; the only thing that matters is that the returns are consistent with the sensitivities, as discussed above.

A daily variance of 30-50 would correspond to a daily standard deviation of roughly 5-7 basis points, which does not seem unreasonable? 2500 corresponds to 50 basis points. I think this has to do with a longer holding period, although I am not sure where the data for this example comes from; it might be made up.

And 0.35 for FX corresponds to a daily standard deviation of roughly 0.6%. Again not unreasonable I think?
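
For reference, a quick sanity check of this variance-to-standard-deviation arithmetic (assuming the return scalings discussed above):

    import math

    # interest rates: absolute returns in basis points (shift size 0.0001)
    print(math.sqrt(30), math.sqrt(50))  # ~5.5 and ~7.1 bp daily
    print(math.sqrt(2500))               # 50 bp, the Example 17 level
    # FX: relative returns in percent (shift size 0.01)
    print(math.sqrt(0.35))               # ~0.59, i.e. roughly 0.6% daily
    # under i.i.d. returns, an H-day holding period scales a daily variance by ~H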

rkapl123 commented 2 months ago

Dear Peter!

Thank you very much, I'll give it a try. Concerning the Python script: is a pull request to the Tools repository the right thing to do, or should I rather put it on my own site?

-regards, Roland

pcaspers commented 2 months ago

Hi Roland, I would actually prefer the main repo; you could create a folder under Tools/ for this? The new Tools repo was meant for more remote stuff, and we haven't really started to populate it (as you can see).

rkapl123 commented 2 months ago

OK, after the first error message and a closer look at the covariance file, I discovered that I was too optimistic about my endeavour. I have to calculate the covariance between the curve points (discount tenors) and NOT between the raw market data:

DiscountCurve/EUR/0/6M  DiscountCurve/EUR/0/6M  2500
DiscountCurve/EUR/1/1Y  DiscountCurve/EUR/1/1Y  2500
DiscountCurve/EUR/2/2Y  DiscountCurve/EUR/2/2Y  2500
DiscountCurve/EUR/3/3Y  DiscountCurve/EUR/3/3Y  2500
DiscountCurve/EUR/4/5Y  DiscountCurve/EUR/4/5Y  2500
DiscountCurve/EUR/5/7Y  DiscountCurve/EUR/5/7Y  2500
DiscountCurve/EUR/6/10Y DiscountCurve/EUR/6/10Y 2500
DiscountCurve/EUR/7/15Y DiscountCurve/EUR/7/15Y 2500
DiscountCurve/EUR/8/20Y DiscountCurve/EUR/8/20Y 2500
...

instead of

IR_SWAP/RATE/EUR/2D/6M/1Y   IR_SWAP/RATE/EUR/2D/6M/1Y   63.23
IR_SWAP/RATE/EUR/2D/6M/2Y   IR_SWAP/RATE/EUR/2D/6M/2Y   18.55
IR_SWAP/RATE/EUR/2D/6M/3Y   IR_SWAP/RATE/EUR/2D/6M/3Y   17.43
IR_SWAP/RATE/EUR/2D/6M/5Y   IR_SWAP/RATE/EUR/2D/6M/5Y   15.38
...

That is quite a bit harder, and I have no clue how to do it. I think I have to calculate at least an intermediate zero rate for each tenor, right? This also touches on my next problem: how to get input for the historical VaR simulation, which likewise requires curves (discount factors) as input. My only idea here is to run ORE iteratively and use the output of the curves report.

My tool (a quick covariance calculation in Python) is unfortunately unusable in this context...

-regards, Roland

pcaspers commented 2 months ago

There is the scenario analytic that might help. Do you remember Damien's talk in London? He demonstrated it.


rkapl123 commented 2 months ago

I already had this in mind. There is an output file called scenario.config; the user guide mentions three files, but stops before explaining the third one.

The file has a column BaseValue that looks useful. I could run a scenario analysis for each of the historic dates and store this information separately, using it 1) as input for a covariance calculation and 2) as the historic scenarios themselves for the historic VaR... The only problem is that the full sensitivity analytic takes really long, and I only need the base value. Is there a switch to get just that?

You mean Damien's talk in London last year? I haven't attended the event this year...

-regards, Roland

pcaspers commented 2 months ago

Ah sorry, you haven't. There is a new scenario analytic that allows generating a base scenario based on simulation.xml (i.e. zero rates). That's exactly what you need to calculate the covariance matrix. I'll ask Damien to forward some info on this.

rkapl123 commented 2 months ago

That's very helpful; I think it's in Example 57:

    <Analytic type="scenario">
      <Parameter name="active">Y</Parameter>
      <Parameter name="simulationConfigFile">simulation.xml</Parameter>
      <Parameter name="scenarioOutputFile">scenario.csv</Parameter>
    </Analytic>

If that's quicker, then it is the solution (I hope)! I'm curious whether Damien has already built some script around that...

-regards, Roland

pcaspers commented 2 months ago

that's the one!


rkapl123 commented 2 months ago

Hi Peter!

I guess I've found Damien's presentation for the talk: https://github.com/OpenSourceRisk/ORE-SWIG/blob/master/OREAnalytics-SWIG/Python/Examples/Notebooks/Example_8/analytics.ipynb

However, it doesn't mention that it builds the historic simulation input; I guess that's still left to implement? And concerning using this to build the covariance matrix, I'd still need to convert the discount factors to zero rates, right?

-regards, Roland

pcaspers commented 2 months ago

Yes regarding the discount -> zero rate conversion.

I don't quite understand your other question?

rkapl123 commented 2 months ago

Sorry, I had misunderstood the purpose of Damien's presentation; I thought it already showed the creation of the input for the historic simulation. I've managed to build this myself now, however (after installing the latest ORE-SWIG version directly from the GitHub actions...). The script assumes a collection of market data by date in Input/histmarketdata.txt:

import os
import pandas as pd
import ORE as ore

# assumed helper: read an XML configuration file into a string
def decodeXML(filename):
    with open(filename) as xmlfile:
        return xmlfile.read()

# assumed helper: print any errors of the ORE run and signal success
def check(app):
    errors = app.getErrors()
    for e in errors:
        print(e)
    return len(errors) == 0

df = pd.read_csv("Input/histmarketdata.txt", sep='\t')
df["wholeLine"] = df["Date"].astype(str) + "\t" + df["Name"] + "\t" + df["Value"].astype(str)
df["Date"] = pd.to_datetime(df["Date"], format="%Y%m%d")

# set up ORE
inputs = ore.InputParameters()
inputs.setResultsPath(".")
inputs.setAllFixings(True)
inputs.setEntireMarket(True)
inputs.setCurveConfigs(decodeXML("Input/curveconfig.xml"))
inputs.setConventions(decodeXML("Input/conventions.xml"))
inputs.setPricingEngine(decodeXML("Input/pricingengine.xml"))
inputs.setTodaysMarketParams(decodeXML("Input/todaysmarket.xml"))
inputs.insertAnalytic("SCENARIO")
inputs.setScenarioSimMarketParams(decodeXML("Input/simulation.xml"))
with open("Input/Fixingdata.txt") as f:
    fixingsdata = ore.StrVector(f.read().splitlines())

file = open('scenarios.csv', 'w')
headerRow = ""
for scenDate in df["Date"].unique():
    # get the market data block for this historic date
    marketdata = df[df["Date"] == scenDate]["wholeLine"].tolist()
    inputs.setAsOfDate(scenDate.strftime("%Y-%m-%d"))
    oreapp = ore.OREApp(inputs, "log.txt", 63, True)
    oreapp.run(marketdata, fixingsdata)

    if not check(oreapp):
        os._exit(1)
    report = oreapp.getReport("scenario")

    # create the header row only once
    if headerRow == "":
        for i in range(report.columns()):
            headerRow += report.header(i) + ("\t" if i < report.columns()-1 else "")
        file.write(headerRow + "\n")

    # column 0 is the date (string), column 1 the scenario number (size),
    # the remaining columns hold the factor values (real)
    dataRow = report.dataAsString(0)[0] + "\t" + str(report.dataAsSize(1)[0]) + "\t"
    for i in range(2, report.columns()):
        dataRow += str(report.dataAsReal(i)[0]) + ("\t" if i < report.columns()-1 else "")
    file.write(dataRow + "\n")
file.close()

Next I'm going to convert the discount factors to zero rates and continue from where I left off...

-regards, Roland

rkapl123 commented 2 months ago

Hi Peter!

I've now also done the conversion of the discount factors; however, for easier computation, I've resorted to the market object of the scenario analytic (instead of converting the output discount factors):

    # ... continuing the scenDate loop from above;
    # crcies, indices, tenorsYrs and basecrcy are taken from parsing simulation.xml
    # get zero and FX spot rates directly from the market and fill them into dfzero
    todaysmarket = oreapp.getAnalytic("SCENARIO").getMarket()
    for crcy in crcies:
        crv = None
        try:
            crv = todaysmarket.discountCurve(crcy)
        except Exception:
            pass
        if crv is not None:
            for idx, tenor in enumerate(tenorsYrs):
                colname = "DiscountCurve/" + crcy + "/" + str(idx+1)
                dfzero.at[scenDate, colname] = crv.zeroRate(tenor, ore.Compounded).rate()
                print(crv.discount(tenor))  # for cross-checking against the scenario output
        if crcy != basecrcy:
            colname = "FXSpot/" + crcy + basecrcy + "/0"
            dfzero.at[scenDate, colname] = todaysmarket.fxSpot(crcy + basecrcy).value()
    for index in indices:
        fwd = None
        try:
            ind = todaysmarket.iborIndex(index)
            fwd = ind.forwardingTermStructure()
        except Exception:
            pass
        if fwd is not None:
            for idx, tenor in enumerate(tenorsYrs):
                colname = "IndexCurve/" + index + "/" + str(idx+1)
                dfzero.at[scenDate, colname] = fwd.zeroRate(tenor, ore.Compounded).rate()
                print(fwd.discount(tenor))  # for cross-checking against the scenario output

However, I'm not convinced this is very good, as the discount factors that I printed for comparison do not exactly match the output of the scenario analytic (they agree only up to the 4th decimal). Also, I think the sub-year zeros should have a different compounding (Simple instead of Compounded), or should I use Continuous instead?

-regards, Roland

pcaspers commented 2 months ago

Hi Roland,

yes, the zero rates should preferably be taken from the scenario output to match the maturity time, and the conversion should be to continuously compounded zero rates.
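
As a minimal sketch of that conversion (hypothetical numbers; t is the maturity time in years matching the scenario output):

    import math

    def zero_rate_cc(df, t):
        # continuously compounded zero rate implied by discount factor df at time t
        return -math.log(df) / t

    print(zero_rate_cc(0.97, 1.0))  # ~0.0305, i.e. about 3.05%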

Best Peter


rkapl123 commented 2 months ago

Dear Peter!

I've now finished the script VarInput.py and created a pull request in Tools. I hope it's also useful for others.

-regards, Roland

pcaspers commented 2 months ago

Thank you, Roland, that's great!

rkapl123 commented 2 months ago

Dear Peter, I'd also like to thank you for your patient help in this.

For running the script, the (really) latest version of ORE-SWIG is necessary; I only managed to install it by downloading the wheels-windows-AMD64 artifact directly from the ORE-SWIG actions. Maybe after the maintenance release of ORE there could be a new deployment of ORE-SWIG to PyPI?

pcaspers commented 2 months ago

Yes, we are in the process of building and testing the updated wheels. Once this is done, they will be available on PyPI. This might take a couple of days. Thanks again!

rkapl123 commented 2 months ago

OK, then I think this is finished!

-regards, Roland