DITAS-Project / VDC-Request-Monitor

Reverse Proxy for VDC Traffic. Used to transparently gather logging information.
Apache License 2.0

Add Method Filtering #10

Closed tawalaya closed 5 years ago

tawalaya commented 5 years ago

Add method filtering based on the present blueprint. For example, in the following blueprint only GetHistoricalData is present in the DATA_MANAGEMENT section. Therefore, only requests like GET /GetHistoricalData will work. All other requests will return a 403 response.

Example Blueprint (some fields omitted for space reasons):

{
    "INTERNAL_STRUCTURE": {
        "Overview": {
            "name": "DITAS Machine 3 - Soraluce FS",
            "description": "VDC blueprint for the 'DITAS Machines 3 - Soraluce FS' at Ideko",
            "tags": []
        },
        "Data_Sources": [
        ],
        "resourcesAvailable":[],
        "Testing_Output_Data": [
            {
                "method_id": "GetHistoricalData",
                "zip_data": "TO-DO URL to a zip file with data samples"
            },
            {
                "method_id": "GetStreamingData",
                "zip_data": "TO-DO URL to a zip file with data samples"
            },
            {
                "method_id": "GetIndicatorsData",
                "zip_data": "TO-DO URL to a zip file with data samples"
            }
        ]
    },
    "DATA_MANAGEMENT":
    [
        {
            "method_id": "GetHistoricalData",
            "attributes": {
                "dataUtility": [{
                    "id": "availability_90",
                    "description": "Availability 90",
                    "type": "Availability",
                    "properties": {
                        "availability": {
                            "unit": "percentage",
                            "minimum": 90
                        }
                    }
                }, {
                    "id": "processCompleteness_80",
                    "description": "Process completeness 80",
                    "type": "Process completeness",
                    "properties": {
                        "completeness": {
                            "minimum": 80,
                            "unit": "percentage"
                        }
                    }
                }, {
                    "id": "throughput_18",
                    "description": "Throughput 1.8",
                    "type": "Throughput",
                    "properties": {
                        "throughput": {
                            "minimum": 1.8,
                            "unit": "MB/s"
                        }
                    }
                }, {
                    "id": "precision_08",
                    "description": "Precision 0.8",
                    "type": "Precision",
                    "properties": {
                        "precision": {
                            "minimum": 0.8,
                            "unit": "none"
                        }
                    }
                }, {
                    "id": "accuracy_09",
                    "description": "Accuracy 0.9",
                    "type": "Accuracy",
                    "properties": {
                        "accuracy": {
                            "minimum": 0.9,
                            "unit": "none"
                        }
                    }
                }],
                "security": [],
                "privacy": []
            }
        }
    ],
    "ABSTRACT_PROPERTIES": [],
    "COOKBOOK_APPENDIX": {},
    "EXPOSED_API": {
        "openapi":"3.0.0",
        "info":{
            "title":"IDEKO VDC",
            "description":"VDC methods for the IDEKO use case",
            "version":"0.0.1"
        },
        "paths":{
            "/GetHistoricalData":{
                "get":{
                    "summary":"Returns the corresponding data values for to the given parameters.",
                    "operationId":"GetHistoricalData",
                    "parameters":[
                        {
                            "in":"query",
                            "name":"location",
                            "required":true,
                            "schema":{
                                "type":"string"
                            }
                        },
                        {
                            "in":"query",
                            "name":"machine",
                            "required":true,
                            "schema":{
                                "type":"string"
                            }
                        },
                        {
                            "in":"query",
                            "name":"group",
                            "required":false,
                            "schema":{
                                "type":"string"
                            }
                        },
                        {
                            "in":"query",
                            "name":"indicator",
                            "required":false,
                            "schema":{
                                "type":"string"
                            }
                        },
                        {
                            "in":"query",
                            "name":"from",
                            "required":true,
                            "schema":{
                                "type":"number"
                            }
                        },
                        {
                            "in":"query",
                            "name":"to",
                            "required":true,
                            "schema":{
                                "type":"number"
                            }
                        }
                    ],
                    "responses":{
                        "200":{
                            "description":"OK",
                            "content":{
                                "application/json":{
                                    "schema":{
                                        "$ref":"#/components/schemas/GetHistoricalDataResponse"
                                    }
                                },
                                "application/octet-stream":{
                                    "schema":{
                                        "type":"string",
                                        "format":"binary"
                                    }
                                }
                            }
                        },
                        "default":{
                            "description":"Unexpected error",
                            "content":{
                                "application/json":{
                                    "schema":{
                                        "$ref":"#/components/schemas/ErrorResponse"
                                    }
                                }
                            }
                        }
                    },
                    "x-data-sources":[
                        "SavvyCloudAPI"
                    ]
                }
            },
            "/GetStreamingData":{
                "get":{
                    "summary":"Returns the streaming data for the parameters given.",
                    "operationId":"GetStreamingData",
                    "parameters":[
                        {
                            "in":"query",
                            "name":"machines",
                            "required":true,
                            "schema":{
                                "type":"string"
                            }
                        }
                    ],
                    "responses":{
                        "200":{
                            "description":"OK",
                            "content":{
                                "application/json":{
                                    "schema":{
                                        "$ref":"#/components/schemas/GetStreamingDataResponse"
                                    }
                                },
                                "application/octet-stream":{
                                    "schema":{
                                        "type":"string",
                                        "format":"binary"
                                    }
                                }
                            }
                        },
                        "default":{
                            "description":"Unexpected error",
                            "content":{
                                "application/json":{
                                    "schema":{
                                        "$ref":"#/components/schemas/ErrorResponse"
                                    }
                                }
                            }
                        }
                    },
                    "x-data-sources":[
                        "SavvyCloudAPI"
                    ]
                }
            },
            "/GetIndicatorsData":{
                "get":{
                    "summary":"Returns the data and the human names for the given indicatorIdList of the given date range. The from and to values must be timestamps in milliseconds. Indicators values are gathered from the InfluxDB deployed in the smart-box. The human names are gathered from the in-box API.",
                    "operationId":"GetIndicatorsData",
                    "parameters":[
                        {
                            "in":"query",
                            "name":"indicators",
                            "required":true,
                            "schema":{
                                "type":"string"
                            }
                        },
                        {
                            "in":"query",
                            "name":"from",
                            "required":true,
                            "schema":{
                                "type":"number"
                            }
                        },
                        {
                            "in":"query",
                            "name":"to",
                            "required":true,
                            "schema":{
                                "type":"number"
                            }
                        }
                    ],
                    "responses":{
                        "200":{
                            "description":"OK",
                            "content":{
                                "application/json":{
                                    "schema":{
                                        "$ref":"#/components/schemas/GetIndicatorsDataResponse"
                                    }
                                },
                                "application/octet-stream":{
                                    "schema":{
                                        "type":"string",
                                        "format":"binary"
                                    }
                                }
                            }
                        },
                        "default":{
                            "description":"Unexpected error",
                            "content":{
                                "application/json":{
                                    "schema":{
                                        "$ref":"#/components/schemas/ErrorResponse"
                                    }
                                }
                            }
                        }
                    },
                    "x-data-sources":[
                        "SavvyCloudAPI"
                    ]
                }
            }
        },
        "components":{
            "schemas":{
                "ErrorResponse":{
                    "type":"object",
                    "properties":{
                        "status":{
                            "type":"integer"
                        },
                        "code":{
                            "type":"integer"
                        },
                        "message":{
                            "type":"string"
                        },
                        "link":{
                            "type":"string"
                        },
                        "developerMessage":{
                            "type":"string"
                        }
                    }
                },
                "GetHistoricalDataResponse":{
                    "type":"array",
                    "items":{
                        "type":"object"
                    }
                },
                "GetStreamingDataResponse":{
                    "type":"object",
                    "properties":{
                        "machine":{
                            "type":"string"
                        },
                        "group":{
                            "type":"string"
                        },
                        "data":{
                            "type":"array",
                            "items":{
                                "type":"object",
                                "properties":{
                                    "additionalProperties":{
                                        "type":"integer"
                                    },
                                    "timestamp":{
                                        "type":"object"
                                    }
                                }
                            }
                        }
                    }
                },
                "GetIndicatorsDataResponse":{
                    "type":"object",
                    "properties":{
                        "indicatorId":{
                            "type":"string"
                        },
                        "indicatorName":{
                            "type":"string"
                        },
                        "data":{
                            "type":"array",
                            "items":{
                                "type":"object",
                                "properties":{
                                    "timestamp":{
                                        "type":"string"
                                    },
                                    "value":{
                                        "oneOf":[
                                            {
                                                "type":"number"
                                            },
                                            {
                                                "type":"string"
                                            }
                                        ]
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
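
As a rough illustration of the filtering described above, the following Go sketch reads the method_id values from DATA_MANAGEMENT and answers 403 for any request whose path does not name one of them. It is only a sketch under assumptions: the names blueprint, allowedMethods, filter and statusFor are hypothetical and not taken from the VDC-Request-Monitor code, and it assumes that the request path is simply "/" followed by the method_id, as in GET /GetHistoricalData.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// blueprint models only the part of the blueprint this sketch needs.
type blueprint struct {
	DataManagement []struct {
		MethodID string `json:"method_id"`
	} `json:"DATA_MANAGEMENT"`
}

// allowedMethods returns the set of method_id values listed in DATA_MANAGEMENT.
func allowedMethods(raw []byte) (map[string]bool, error) {
	var bp blueprint
	if err := json.Unmarshal(raw, &bp); err != nil {
		return nil, err
	}
	allowed := make(map[string]bool, len(bp.DataManagement))
	for _, m := range bp.DataManagement {
		allowed[m.MethodID] = true
	}
	return allowed, nil
}

// filter wraps next and rejects requests whose path does not name an allowed method.
// Assumption: the path is "/" + method_id; the query string is ignored.
func filter(allowed map[string]bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		name := strings.TrimPrefix(r.URL.Path, "/")
		if !allowed[name] {
			http.Error(w, "method not available in this blueprint", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one request through the filter (with a dummy upstream that
// always answers 200) and reports the resulting status code.
func statusFor(allowed map[string]bool, method, path string) int {
	upstream := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	rec := httptest.NewRecorder()
	filter(allowed, upstream).ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code
}

func main() {
	raw := []byte(`{"DATA_MANAGEMENT": [{"method_id": "GetHistoricalData"}]}`)
	allowed, err := allowedMethods(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(statusFor(allowed, http.MethodGet, "/GetHistoricalData")) // 200
	fmt.Println(statusFor(allowed, http.MethodGet, "/GetStreamingData"))  // 403
}
```

In a real reverse proxy, the dummy upstream would be replaced by the handler that forwards traffic to the VDC.
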
tawalaya commented 5 years ago

Please @AitorF @jose-sanchezm @achilleasmarinakis review this and let me know if that's correct. Assign me to it if everything is fine.

achilleasmarinakis commented 5 years ago

Hi @tawalaya, all the exposed VDC methods are defined in the "EXPOSED_API" section of the blueprint and are uniquely identified by their "operationId" (e.g. GetHistoricalData). This id is also used as the "method_id" that uniquely identifies the methods in other sections of the blueprint (e.g. INTERNAL_STRUCTURE, DATA_MANAGEMENT). So, if a method is not selected by the application designer, it will be deleted from all the sections in which it appears, and yes, the response to a call to that method should be 403. The blueprint (the intermediate blueprint, to be precise) that contains only the selected method(s) is the final outcome of the whole Resolution process and is therefore forwarded to the Deployment Engine. So it will be available to all the components of a running VDC, such as the VDC-Request-Monitor.
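
To illustrate the link between "operationId" and "method_id" described here, the Go sketch below walks the "paths" of EXPOSED_API and keeps only the operations whose operationId also appears as a method_id in DATA_MANAGEMENT. The names blueprint, httpVerbs and allowedRoutes are hypothetical, and the sketch assumes that DATA_MANAGEMENT is the section that reflects the designer's selection; it is not taken from any DITAS component.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// blueprint models only the fields this sketch reads.
type blueprint struct {
	DataManagement []struct {
		MethodID string `json:"method_id"`
	} `json:"DATA_MANAGEMENT"`
	ExposedAPI struct {
		// path -> key (HTTP verb or other path-item field) -> raw JSON value
		Paths map[string]map[string]json.RawMessage `json:"paths"`
	} `json:"EXPOSED_API"`
}

// httpVerbs lists the path-item keys that hold operations in OpenAPI 3.0.
var httpVerbs = map[string]bool{
	"get": true, "put": true, "post": true, "delete": true,
	"options": true, "head": true, "patch": true, "trace": true,
}

// allowedRoutes returns sorted "VERB /path" strings for every operation whose
// operationId is also listed as a method_id in DATA_MANAGEMENT.
func allowedRoutes(raw []byte) ([]string, error) {
	var bp blueprint
	if err := json.Unmarshal(raw, &bp); err != nil {
		return nil, err
	}
	selected := make(map[string]bool)
	for _, m := range bp.DataManagement {
		selected[m.MethodID] = true
	}
	var routes []string
	for path, item := range bp.ExposedAPI.Paths {
		for verb, rawOp := range item {
			if !httpVerbs[verb] {
				continue // skip non-operation keys such as "parameters"
			}
			var op struct {
				OperationID string `json:"operationId"`
			}
			if err := json.Unmarshal(rawOp, &op); err != nil {
				return nil, err
			}
			if selected[op.OperationID] {
				routes = append(routes, strings.ToUpper(verb)+" "+path)
			}
		}
	}
	sort.Strings(routes)
	return routes, nil
}

func main() {
	raw := []byte(`{
		"DATA_MANAGEMENT": [{"method_id": "GetHistoricalData"}],
		"EXPOSED_API": {"paths": {
			"/GetHistoricalData": {"get": {"operationId": "GetHistoricalData"}},
			"/GetStreamingData":  {"get": {"operationId": "GetStreamingData"}}
		}}
	}`)
	routes, err := allowedRoutes(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(routes) // [GET /GetHistoricalData]
}
```

Because the lookup goes through operationId, this sketch gives the same result whether or not the unselected operations are still present in EXPOSED_API.
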

tawalaya commented 5 years ago

Are the methods in EXPOSED_API also removed if they are not available or will this remain unchanged?

vmoulos commented 5 years ago

The proposed solution raises several issues: a) it affects the method filtering that POLIMI would like to introduce, b) it changes the intermediate blueprint, c) it changes the resolution process a bit, and d) it destroys the idea that the blueprint could have an orchestration tool that manages the services and the containers. A firewall-like approach reduces the functionality of the blueprint and creates new effort to restructure our services. @tawalaya @achilleasmarinakis @jose-sanchezm @AitorF it is preferable to solve this issue in the deployment phase and not alter the services flow in the WP3 phase. I think if we continue with @AitorF and @jose-sanchezm plan to ignore the flow from the blueprint we should accept the fact that the method elimination will not work. It is complex to have an extra interface to select methods and to introduce a new file that is sent to the deployment engine and then to the monitoring component in order to block the methods. The goal was to make the blueprint more lightweight, not to block the methods, which is unnecessary. On top of that, we would add extra overhead to an already complicated architecture.

tawalaya commented 5 years ago

I am fine with or without that feature. Regarding your points @vrettos

> I think if we continue with @AitorF and @jose-sanchezm plan to ignore the flow from the blueprint we should accept the fact that the method elimination will not work.

This can only (if at all) be done for Node-RED; a Spark-based VDC couldn't possibly be modified at deployment unless the developer of that code integrated options for that.

So if we are not using the firewall method, we might need to introduce a guideline/interface that helps data owners create different versions of a VDC.

vmoulos commented 5 years ago

The point of the flow in the Abstract Blueprint is exactly to support multiple orchestration platforms (such as Node-RED, Spring Cloud, Netflix Conductor, or even Apache Camel). IBM also had a plan to explain how that would work with Spark. So it is designed in a way that can support additional tools. It is outside the project's scope to implement example solutions for every tool; an example from one or two use cases is more than enough. If someone wants to use DITAS with an alternative tool, they are free to do so, and we are happy to implement it (we covered our scope, which is to provide the schema that can host all the necessary functionalities).

jose-sanchezm commented 5 years ago

I fail to see how having fewer methods in a blueprint destroys the possibility of orchestrating containers and services, since for the deployment engine the number of methods is irrelevant. A blueprint is identified by its name, and the engine will create VDCs or clusters based only on that. Modifying the container at deployment time, or putting custom files in custom places for containers depending on the technology they use, greatly limits the technologies that can be used to implement the processing of a VDC. Node-RED is a special case in which you can provide a JSON file that defines its behavior, but it's the exception here and not the norm. Even then, the Node-RED image that we have is not generic at all: it's tailored specifically to the Ideko use case, since it contains custom nodes, and it's impossible to provide a generic-enough image that contains all possible node classes that anyone might use. Also, the approach of putting "something" in a section of the blueprint that must go "somewhere" in a container is vague enough to not be useful at all. Implementing a new "something" will mean modifying the source code of the deployment engine to account for that new special case, and then it won't be generic enough anyway. It might not even be feasible for processing engines based on compiled code, like Spark. If we want to go for a demo of a product that satisfies our particular use cases, fair enough, but I'd prefer to present a product that is able to deploy and satisfy as many use cases as possible, and providing container image references is far more generic than taking a lot of special cases into account. Regarding the implementation based on the example by Sebastian, it's quite easy and quick to implement.

Regarding the interface to select methods: if you want to make the blueprint lighter by removing things (be it in the DATA_MANAGEMENT or the FLOW section), you need an interface to do it anyway. It only changes the place that you modify, so I don't see why modifying the flow is less work for you than modifying the DATA_MANAGEMENT section.