OasisLMF / ReinsuranceTestTool

Test tool for new reinsurance functionality.
BSD 3-Clause "New" or "Revised" License
3 stars 1 forks source link

Bring required fields in code inline with OED spec #73

Closed johcarter closed 5 years ago

johcarter commented 5 years ago

I am going to upgrade the 40+ example oed files since the latest code changes ref #69 to a) pass the tests since all are currently failing and b) to be valid OED files as per the spec. In order that i only have to do this once, could we please enhance the routine in the fm file generation code to handle mismatches in required fields between our code and the OED spec.

For example, AccDed6All is a required field in the code but not in the spec, so my valid OED files fail.

The OED spec v1.0.2 required fields are ;

account

PortNumber AccNumber AccCurrency PolNumber PolPerilsCovered

location

PortNumber AccNumber LocNumber CountryCode LocPerilsCovered BuildingTIV OtherTIV ContentsTIV BITIV LocCurrency

reins info

ReinsNumber ReinsPeril PlacedPercent ReinsCurrency InuringPriority ReinsType

reins scope

ReinsNumber RiskLevel

I'm not saying do a full validation piece of work at this stage but can we please set sensible defaults for code-required spec-optional fields so that i can get my valid oed files passing the tests in time for the next release.

sambles commented 5 years ago

Tests update in the dev branch to match whats in oasislmf 1.2.3

This does solve the issue of have reinsurance examples in a single location, but it should allow you to continue with this for the moment.

sr-murthy commented 5 years ago

@johcarter This issue is caused by hard coded references to OED column names in the OED module (oasislmf/exposures/oed.py) and possibly also the reinsurance module (oasislmf/exposures/reinsurance_layer.py), which were implemented by @mpinkerton-oasis et. al. Here is the hard coded reference to AccDed6All

https://github.com/OasisLMF/OasisLMF/blob/develop/oasislmf/exposures/oed.py#L466

There are other lists of OED column names being used in the OED module, which may explain your failing test cases. So this makes the modules break very easily if the columns change in any way (change in column names, adding or removing columns), which is one of the points I was trying to make yesterday in my talk - the core parts of the MDK, e.g. the exposure manager and FM utils, which I've implemented, use profiles to process source data and don't contain any hard coded references.

All of the FM acceptance tests I've implemented (FM3-7 and FM40) pass, but these are non-RI test cases and don't make any calls to the unstable code in the reinsurance and OED modules.

So I'll unassign myself - it would be best to ask @mpinkerton-oasis et. al. about this.

sr-murthy commented 5 years ago

Note: I am sure the non-RI part of the MDK is fully compatible with the OED spec., because the OED data is processed via profiles, not directly, and the profiles are consistent with the spec.

johcarter commented 5 years ago

You are right. But best to address this after all the code is integrated.

@sr-murthy you have previously sent a list of some required fields in the account and loc file that are also optional in OED spec:

Specifically, the source OED loc. file must contain the following columns:

LocNumber,AccNumber,LocNumber,LocName,AreaCode,CountryCode,LocPeril,BuildingTIV,LocDed1Building,LocLimit1Building,OtherTIV,LocDed2Other,LocLimit2Other,ContentsTIV,LocDed3Contents,LocLimit3Contents,BITIV,LocDed4BI,LocLimit4BI,LocDed5PD,LocLimit5PD,LocDed6All,LocLimit6All

and the source OED acc. file must contain the following columns:

LocNumber,AccNumber,PortNumber,PolNumber,PolPeril,CondDed6All,CondLimit6All,PolDed6All,PolMinDed6All,PolMaxDed6All,LayerAttachment,LayerLimit,LayerParticipation


Is it actually the case that it is the generated canonical acc and loc files that need these fields (not the source) because we are still using the xslt transform from source to canonical? Therefore i could omit them from the source files and use the xslts to handle missing fields?

sr-murthy commented 5 years ago

@johcarter Yes, that is correct, those sets of OED fields I listed are required to be present in the corresponding source file because the OED profiles, which are static metadata used by the MDK, define them. You cannot have a situation where the MDK uses static canonical profiles to process the source data and the source data sometimes having the fields defined in those profiles and at other times missing those fields.

Do you see my point? It is really important to understand that to the MDK the profiles are static resources, and they are loaded from JSON into dicts, and used to process the source data. So all the fields defined in the profiles MUST be present in the source data. This is perfectly logical.

sr-murthy commented 5 years ago

When you say the fields are optional, this is incorrect - what you mean to say is that the values of those columns are 0 or null. If you are not using a set of columns for a particular run or a test case, and the columns are in the set of columns referred to in the canonical profiles, then those columns must still be present in the source files, but their values can be set to 0 or null.

johcarter commented 5 years ago

i mean OED spec optional, not MDK optional. there are fields in the MDK profiles which are required but they are not required in the input files if no financial terms exist. this is the issue

johcarter commented 5 years ago

does the MDK invoke the xslt to produce a canonical version of the source file (as in flamingo) or does it bypass this stage and read the source file directly?

sr-murthy commented 5 years ago

OK, so @johcarter at the moment, by design, the MDK expects all columns/fields referred to in the canonical profiles and the FM aggregation profiles to be present in the source files. That is how it works because the profiles drive the Oasis files generation.

I guess if no financial terms are present in the source files you can add some steps in the MDK to check whether the expected profile terms are present in the source files, and if not then proceed in a different way. All the development and testing of the MDK was done on the basis of source files which contained at least some financial terms, and these terms were a subset of the terms defined in the profiles. If this needs to change, then we can certainly amend the MDK appropriately. But I would need to have a discussion with you about how this is going to work.

johcarter commented 5 years ago

@smurthy thanks for that info but i still havent got the answer to my question. The xlst is an input to the MDK right? so I assume that it is used to generate the canonical version of source loc and source acc? Are these files used at all?

sr-murthy commented 5 years ago

Yes, at the moment we are still doing the source -> canonical transformation, and then the canonical -> model.

sr-murthy commented 5 years ago

We can get rid of that if it is no longer required.

sr-murthy commented 5 years ago

@johcarter I don't understand how you can be missing the coverage columns in the source files, e.g. BuildingTIV is the buildings TIV column, and it is described in the loc. profile by this dict

    "BuildingTIV": {
    "ProfileElementName": "BuildingTIV",
    "FieldName": "TIV",
    "PerilID": 1,
    "CoverageTypeID": 1,
    "FMLevelName": "SiteCoverage",
    "FMLevel": 1,
    "FMTermType": "TIV",
    "FMTermGroupID": 1,
    "ProfileType": "Loc"
}

This column, like the other coverage terms, is categorised as an FM term for technical reasons.

I don't understand why the coverage terms would be missing from the source loc. file. You cannot generate GUL input files without the coverage terms.

sr-murthy commented 5 years ago

Also, I'm not sure I follow your point about those columns you listed being optional in the OED spec. All of them are defined in the spec. How can they be optional, and how is the MDK supposed to know which terms are "optional" and which are required, in different runs/test cases, especially when you're using profiles which define those terms?

johcarter commented 5 years ago

See the top of this issue for the list of required fields. All others are optional in OED.

While we still use xslts these should act as a validation on the fields in source loc and acc. If any fields are missing (an optional field such as LocDed1Building for example) then the xslt should include the field with an appropriate default value. If the MDK used the canonical loc and acc format as its data input (not source loc and acc), then it will always find the fields it requires.

I looked at the xslt for PiWind with Ben on wednesday, and it appears to be doing this. So i think what needs to happen for the time being is that MDK needs to accept source loc and acc, invoke the xslt and then take the financial fields from the canonical loc and acc, not source.

When the xslts are later phased out, the validation currently provided by the xlsts will need to be replaced.

sr-murthy commented 5 years ago

@johcarter Yes, but the MDK does process financial terms from the canonical files, not the source files. But the assumption is that the file transformations, which happen in the MDK and also in Flamingo, will preserve the financial terms in canonical files originally present in the source files.

The financial terms defined in the static canonical profiles that the MDK uses, including the coverage level terms, define a superset of terms/columns assumed to be present in the canonical files, not the source files.

So, I'm sorry this caused any confusion.

sr-murthy commented 5 years ago

That is why the canonical profiles are called canonical profiles - they represent a high level description of the financial terms present in the canonical data files (loc. and acc.), including the coverage level terms. As the MDK currently transforms the source files to canonical form the format of the canonical files can depend on that transformation.

johcarter commented 5 years ago

The xslt both preserves the financial fields present in the source, and sets appropriate defaults for fields not present, so that all fields required by MDK are present and valid in the canonical file if the xslt is correct.

sr-murthy commented 5 years ago

OK, in that case I don't see a problem, as long as the profiles the MDK uses are correct.

mpinkerton-oasis commented 5 years ago

@sr-murthy we have started implementing some OED validation in this module: https://github.com/OasisLMF/OasisLMF/blob/develop/oasislmf/exposures/oed.py

The validation, as defined in the spec, is somewhat complex as there are required fields, defaults and also some interaction between the fields in terms of dependencies.

Note that Simplitium are working on a JSON-SCHEMA that will be maintained as part of the core OED. This will capture all of the complex validation rules, so I think we should consider how we can use this directly.

sr-murthy commented 5 years ago

OK, but I think a detailed in-person discussion is required. If we are going to work directly with OED source files we can certainly remove the file transformations that happen in the MDK, so that we don't need to have canonical files or model loc. files. Obviously this would represent a new requirement. But it would still be best for the MDK to use a high level description of a fixed set of financial terms in the source files, including the coverage level terms, in the form of JSON profiles.

Validation is separate from the processing of the coverage level and financial terms in the MDK, which is my concern.

I think a meeting is best.

mpinkerton-oasis commented 5 years ago

Agree that validation is a completely seperate step. We should validate the OED files and then do any Oasis file generation subsequently on valid files.

sr-murthy commented 5 years ago

OK, I have no problem with that. So you are proposing to replace the file transformations with these validity checks?

The only problem I see with this OED validation module [?] is that it contains hardcoded references to column names in the OED source files (loc., acc. RI), e.g. AccDed6All, which according to @johcarter this has caused some of her tests to fail. Obviously, this is completely unrelated to the Oasis files generation process in the MDK, where canonical files are the starting point. I was just trying to establish whether this was the case.

johcarter commented 5 years ago

There is a long term goal of replacing xslts and introducing a proper file validation layer, and the short term goal to get valid OED files (with these optional fields) through the test framework for the next release. Lets focus on the short term goal this month.