opencdms-dev / legacy-opencdms-api

⭐🐍 OpenCDMS server application written in Python (FastAPI) and exposing a web interface for `opencdms-app` and other applications
MIT License
Server and client-side validation #10

Open isedwards opened 2 years ago

isedwards commented 2 years ago

When a user interacts with client-side forms, we need to run validation rules in JavaScript that give immediate feedback and report errors before the data is submitted via the API.

What would be the best way to write this validation code once (in JavaScript) and then use it both in the client and on the server (with Python/FastAPI)?

I've thought through a number of possibilities, but nothing stands out to me as obviously the best approach given that we're using Python on the backend.

faysal-ishtiaq commented 2 years ago

@isedwards

FastAPI with Pydantic generates OpenAPI specifications out of the box. We can also generate a JSON Schema for each Pydantic model. If necessary, we can serve the JSON Schemas for all the Pydantic models so that the front end can use the appropriate schema for each form. Here is a little demo:

Code:

from src.apps.climsoft.schemas.station_schema import Station
Station.schema()

Output:

{
  "title": "Station",
  "type": "object",
  "properties": {
    "station_id": {
      "title": "Station Id",
      "maxLength": 255,
      "type": "string"
    },
    "station_name": {
      "title": "Station Name",
      "maxLength": 255,
      "type": "string"
    },
    "wmoid": {
      "title": "Wmoid",
      "maxLength": 20,
      "type": "string"
    },
    "icaoid": {
      "title": "Icaoid",
      "maxLength": 20,
      "type": "string"
    },
    "latitude": {
      "title": "Latitude",
      "type": "number"
    },
    "qualifier": {
      "title": "Qualifier",
      "maxLength": 20,
      "type": "string"
    },
    "longitude": {
      "title": "Longitude",
      "type": "number"
    },
    "elevation": {
      "title": "Elevation",
      "maxLength": 255,
      "type": "string"
    },
    "geolocation_method": {
      "title": "Geolocation Method",
      "maxLength": 255,
      "type": "string"
    },
    "geolocation_accuracy": {
      "title": "Geolocation Accuracy",
      "type": "number"
    },
    "opening_datetime": {
      "title": "Opening Datetime",
      "type": "string",
      "format": "date-time"
    },
    "closing_datetime": {
      "title": "Closing Datetime",
      "type": "string",
      "format": "date-time"
    },
    "country": {
      "title": "Country",
      "maxLength": 50,
      "type": "string"
    },
    "authority": {
      "title": "Authority",
      "maxLength": 255,
      "type": "string"
    },
    "admin_region": {
      "title": "Admin Region",
      "maxLength": 255,
      "type": "string"
    },
    "drainage_basin": {
      "title": "Drainage Basin",
      "maxLength": 255,
      "type": "string"
    },
    "waca_selection": {
      "title": "Waca Selection",
      "type": "boolean"
    },
    "cpt_selection": {
      "title": "Cpt Selection",
      "type": "boolean"
    },
    "station_operational": {
      "title": "Station Operational",
      "type": "boolean"
    }
  },
  "required": [
    "station_id",
    "station_name",
    "wmoid",
    "icaoid",
    "latitude",
    "qualifier",
    "longitude",
    "elevation",
    "geolocation_method",
    "geolocation_accuracy",
    "opening_datetime",
    "closing_datetime",
    "country",
    "authority",
    "admin_region",
    "drainage_basin",
    "waca_selection",
    "cpt_selection",
    "station_operational"
  ]
}
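
To make concrete what a validator does with a schema like the one above, here is a minimal sketch in plain Python that checks only three of the keywords this schema uses (`required`, `type`, `maxLength`). It is a hand-rolled illustration, not the `jsonschema` package and not pydantic's own validation; the names `check_against_schema` and `TYPE_MAP` are made up for this example, and the schema is cut down to three fields for brevity.

```python
# Hypothetical, hand-rolled subset of JSON Schema validation (illustration only).
TYPE_MAP = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
}


def check_against_schema(instance, schema):
    """Return a list of error messages; an empty list means the instance passed."""
    errors = []
    # "required": every listed property must be present.
    for name in schema.get("required", []):
        if name not in instance:
            errors.append(f"{name}: is required")
    # "properties": check "type" and "maxLength" for each property that is present.
    for name, rules in schema.get("properties", {}).items():
        if name not in instance:
            continue
        value = instance[name]
        expected = TYPE_MAP.get(rules.get("type"))
        # bool is a subclass of int in Python, so reject it explicitly for "number".
        bool_as_number = rules.get("type") == "number" and isinstance(value, bool)
        if expected is not None and (not isinstance(value, expected) or bool_as_number):
            errors.append(f"{name}: expected {rules['type']}")
            continue
        if "maxLength" in rules and isinstance(value, str) and len(value) > rules["maxLength"]:
            errors.append(f"{name}: longer than {rules['maxLength']} characters")
    return errors


# Cut-down version of the Station schema shown above.
station_schema = {
    "type": "object",
    "properties": {
        "station_id": {"type": "string", "maxLength": 255},
        "latitude": {"type": "number"},
        "country": {"type": "string", "maxLength": 50},
    },
    "required": ["station_id", "latitude", "country"],
}

good = {"station_id": "abc", "latitude": 40.72816, "country": "US"}
bad = {"station_id": "abc", "latitude": "north", "country": "U" * 51}

print(check_against_schema(good, station_schema))
print(check_against_schema(bad, station_schema))
```

The first call reports no errors; the second reports the wrong type for `latitude` and the over-long `country`. A real validator covers many more keywords (for example `format`), which this sketch ignores.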

Later, we can use this schema in JavaScript with this module: https://www.npmjs.com/package/jsonschema

Here is an example:

var schema = {
    "title": "Station",
    "type": "object",
    "properties": {
        "station_id": {
            "title": "Station Id",
            "maxLength": 255,
            "type": "string"
        },
        "station_name": {
            "title": "Station Name",
            "maxLength": 255,
            "type": "string"
        },
        "wmoid": {
            "title": "Wmoid",
            "maxLength": 20,
            "type": "string"
        },
        "icaoid": {
            "title": "Icaoid",
            "maxLength": 20,
            "type": "string"
        },
        "latitude": {
            "title": "Latitude",
            "type": "number"
        },
        "qualifier": {
            "title": "Qualifier",
            "maxLength": 20,
            "type": "string"
        },
        "longitude": {
            "title": "Longitude",
            "type": "number"
        },
        "elevation": {
            "title": "Elevation",
            "maxLength": 255,
            "type": "string"
        },
        "geolocation_method": {
            "title": "Geolocation Method",
            "maxLength": 255,
            "type": "string"
        },
        "geolocation_accuracy": {
            "title": "Geolocation Accuracy",
            "type": "number"
        },
        "opening_datetime": {
            "title": "Opening Datetime",
            "type": "string",
            "format": "date-time"
        },
        "closing_datetime": {
            "title": "Closing Datetime",
            "type": "string",
            "format": "date-time"
        },
        "country": {
            "title": "Country",
            "maxLength": 50,
            "type": "string"
        },
        "authority": {
            "title": "Authority",
            "maxLength": 255,
            "type": "string"
        },
        "admin_region": {
            "title": "Admin Region",
            "maxLength": 255,
            "type": "string"
        },
        "drainage_basin": {
            "title": "Drainage Basin",
            "maxLength": 255,
            "type": "string"
        },
        "waca_selection": {
            "title": "Waca Selection",
            "type": "boolean"
        },
        "cpt_selection": {
            "title": "Cpt Selection",
            "type": "boolean"
        },
        "station_operational": {
            "title": "Station Operational",
            "type": "boolean"
        }
    },
    "required": [
        "station_id",
        "station_name",
        "wmoid",
        "icaoid",
        "latitude",
        "qualifier",
        "longitude",
        "elevation",
        "geolocation_method",
        "geolocation_accuracy",
        "opening_datetime",
        "closing_datetime",
        "country",
        "authority",
        "admin_region",
        "drainage_basin",
        "waca_selection",
        "cpt_selection",
        "station_operational"
    ]
}

var station = {
    "station_id": "9f8e1c453ca04b9a8c72fddb2d186c7f", 
    "station_name": "e39d738a88674b5aaa57542251eb0605", 
    "wmoid": "df3aa52e164547229e43", 
    "icaoid": "5bac8d04813245bc9e9b", 
    "latitude": 40.72816, 
    "qualifier": "fdc99c4ecddb4754901a", 
    "longitude": -74.07764, 
    "elevation": "22", 
    "geolocation_method": "740c3713863a4b9da252f398cdb2a113", 
    "geolocation_accuracy": 0.9832581768123831, 
    "opening_datetime": "2021-11-15T20:51:39Z", 
    "closing_datetime": "2024-08-10T20:51:39Z", 
    "country": "US", 
    "authority": "38a0e51448704350a393f5dc0628cf26", 
    "admin_region": "US", 
    "drainage_basin": "32058d0ec6ff4df6885f02399434e526", 
    "waca_selection": true, 
    "cpt_selection": true, 
    "station_operational": true
}

var Validator = require('jsonschema').Validator;
var validator = new Validator();

console.log(validator.validate(station, schema))

Output:


ValidatorResult {
  instance: {
    station_id: '9f8e1c453ca04b9a8c72fddb2d186c7f',
    station_name: 'e39d738a88674b5aaa57542251eb0605',
    wmoid: 'df3aa52e164547229e43',
    icaoid: '5bac8d04813245bc9e9b',
    latitude: 40.72816,
    qualifier: 'fdc99c4ecddb4754901a',
    longitude: -74.07764,
    elevation: '22',
    geolocation_method: '740c3713863a4b9da252f398cdb2a113',
    geolocation_accuracy: 0.9832581768123831,
    opening_datetime: '2021-11-15T20:51:39Z',
    closing_datetime: '2024-08-10T20:51:39Z',
    country: 'US',
    authority: '38a0e51448704350a393f5dc0628cf26',
    admin_region: 'US',
    drainage_basin: '32058d0ec6ff4df6885f02399434e526',
    waca_selection: true,
    cpt_selection: true,
    station_operational: true
  },
  schema: {
    title: 'Station',
    type: 'object',
    properties: {
      station_id: [Object],
      station_name: [Object],
      wmoid: [Object],
      icaoid: [Object],
      latitude: [Object],
      qualifier: [Object],
      longitude: [Object],
      elevation: [Object],
      geolocation_method: [Object],
      geolocation_accuracy: [Object],
      opening_datetime: [Object],
      closing_datetime: [Object],
      country: [Object],
      authority: [Object],
      admin_region: [Object],
      drainage_basin: [Object],
      waca_selection: [Object],
      cpt_selection: [Object],
      station_operational: [Object]
    },
    required: [
      'station_id',          'station_name',
      'wmoid',               'icaoid',
      'latitude',            'qualifier',
      'longitude',           'elevation',
      'geolocation_method',  'geolocation_accuracy',
      'opening_datetime',    'closing_datetime',
      'country',             'authority',
      'admin_region',        'drainage_basin',
      'waca_selection',      'cpt_selection',
      'station_operational'
    ]
  },
  options: {},
  path: [],
  propertyPath: 'instance',
  errors: [],
  throwError: undefined,
  throwFirst: undefined,
  throwAll: undefined,
  disableFormat: false
}

isedwards commented 2 years ago

Would JSON Schema support arbitrary and quite advanced validation rules in addition to type/format/length?

A simple example would be closing_datetime >= opening_datetime.

faysal-ishtiaq commented 2 years ago

I am not sure. I need to check how cross-field rules like this are converted from Pydantic models to JSON Schema, and whether they work in JavaScript.

isedwards commented 2 years ago

My assumption is that the JSON schema would just contain the validator name, e.g. closing_gte_opening and not the actual logic performed (which could be arbitrarily complex)?
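
The rule itself is short to state in code; the difficulty is that the standard JSON Schema validation keywords do not appear to include one that compares two properties of the same object. A sketch of the check in plain Python, assuming both values are ISO 8601 strings like those in the `station` example earlier in the thread (the helper `parse_utc` is made up for illustration; `closing_gte_opening` is the name suggested above):

```python
from datetime import datetime


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 string such as '2021-11-15T20:51:39Z'."""
    # datetime.fromisoformat() only accepts a trailing 'Z' from Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def closing_gte_opening(station: dict) -> bool:
    """Cross-field rule: the station must not close before it opens."""
    opening = parse_utc(station["opening_datetime"])
    closing = parse_utc(station["closing_datetime"])
    return closing >= opening


station = {
    "opening_datetime": "2021-11-15T20:51:39Z",
    "closing_datetime": "2024-08-10T20:51:39Z",
}
print(closing_gte_opening(station))  # True
```

Any client that only receives the rule's name would still need its own implementation of this logic.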

faysal-ishtiaq commented 2 years ago

@isedwards I have just tested this approach for a complex validation scenario.

In [2]: from pydantic import BaseModel, ValidationError, validator

In [3]: class UserModel(BaseModel):
   ...:     name: str
   ...:     username: str
   ...:     password1: str
   ...:     password2: str
   ...: 
   ...:     @validator('name')
   ...:     def name_must_contain_space(cls, v):
   ...:         if ' ' not in v:
   ...:             raise ValueError('must contain a space')
   ...:         return v.title()
   ...: 
   ...:     @validator('password2')
   ...:     def passwords_match(cls, v, values, **kwargs):
   ...:         if 'password1' in values and v != values['password1']:
   ...:             raise ValueError('passwords do not match')
   ...:         return v
   ...: 
   ...:     @validator('username')
   ...:     def username_alphanumeric(cls, v):
   ...:         assert v.isalnum(), 'must be alphanumeric'
   ...:         return v
   ...: 

In [4]: UserModel.schema()
Out[4]: 
{'title': 'UserModel',
 'type': 'object',
 'properties': {'name': {'title': 'Name', 'type': 'string'},
  'username': {'title': 'Username', 'type': 'string'},
  'password1': {'title': 'Password1', 'type': 'string'},
  'password2': {'title': 'Password2', 'type': 'string'}},
 'required': ['name', 'username', 'password1', 'password2']}

Here, some complex validation rules are defined through pydantic's @validator decorator. Pydantic generates a JSON Schema for this model, but none of the rules defined with @validator appear in it.

faysal-ishtiaq commented 2 years ago

json-schema supported validation rules: https://json-schema.org/draft/2020-12/json-schema-validation.html
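
Some single-field rules can be expressed with the keywords listed there. For example, an alphanumeric check maps onto `pattern`. The sketch below is plain Python that applies a `pattern` keyword the way a JSON Schema validator would, as an unanchored regular-expression search. The schema fragment and the helper `matches_pattern` are assumptions for illustration, and the ASCII-only pattern is stricter than Python's `str.isalnum()`, which also accepts non-ASCII letters and digits.

```python
import re

# Hypothetical schema fragment: two of the UserModel rules written as "pattern".
user_schema_fragment = {
    "properties": {
        "username": {"type": "string", "pattern": "^[A-Za-z0-9]+$"},
        "name": {"type": "string", "pattern": " "},  # must contain a space
    }
}


def matches_pattern(value: str, rules: dict) -> bool:
    """Apply the "pattern" keyword; JSON Schema patterns are not implicitly anchored."""
    pattern = rules.get("pattern")
    return pattern is None or re.search(pattern, value) is not None


props = user_schema_fragment["properties"]
print(matches_pattern("jsmith", props["username"]))    # True
print(matches_pattern("j smith", props["username"]))   # False
print(matches_pattern("John Smith", props["name"]))    # True
print(matches_pattern("JohnSmith", props["name"]))     # False
```

A cross-field rule such as `passwords_match` has no equivalent keyword here, so it would still need another mechanism.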

faysal-ishtiaq commented 2 years ago

@isedwards

I have found a way to extract the validator names. Here is some sample code:

import json

from pydantic import BaseModel, validator
from pydantic.class_validators import Validator

def extract_validators(model):
    # model.__validators__ (pydantic v1) maps each field name to a list of
    # Validator objects; only the first validator per field is reported.
    return {name: extract_validator_function_name(funcs[0]) for name, funcs in model.__validators__.items()}

def extract_validator_function_name(validator_func: Validator):
    # A pydantic v1 Validator keeps the decorated function in its `func` attribute.
    return validator_func.func.__name__

def generate_schema_with_validators(model):
    # Add a non-standard "validators" key alongside the generated JSON Schema.
    return {**model.schema(), "validators": extract_validators(model=model)}

if __name__ == "__main__":
    class UserModel(BaseModel):
        name: str
        username: str
        password1: str
        password2: str

        @validator('name')
        def name_must_contain_space(cls, v):
            if ' ' not in v:
                raise ValueError('must contain a space')
            return v.title()

        @validator('password2')
        def passwords_match(cls, v, values, **kwargs):
            if 'password1' in values and v != values['password1']:
                raise ValueError('passwords do not match')
            return v

        @validator('username')
        def username_alphanumeric(cls, v):
            assert v.isalnum(), 'must be alphanumeric'
            return v

    print(json.dumps(generate_schema_with_validators(UserModel), indent=2))

It prints:

{
  "title": "UserModel",
  "type": "object",
  "properties": {
    "name": {
      "title": "Name",
      "type": "string"
    },
    "username": {
      "title": "Username",
      "type": "string"
    },
    "password1": {
      "title": "Password1",
      "type": "string"
    },
    "password2": {
      "title": "Password2",
      "type": "string"
    }
  },
  "required": [
    "name",
    "username",
    "password1",
    "password2"
  ],
  "validators": {
    "name": "name_must_contain_space",
    "password2": "passwords_match",
    "username": "username_alphanumeric"
  }
}
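
One way a consumer could use the extra `validators` key is to keep its own implementation of each named rule and look it up by name. The sketch below does that in Python for brevity; a browser client would need the same registry written in JavaScript, so the rule logic would still exist in two places. `RULES` and `run_named_validators` are hypothetical names, not part of pydantic or FastAPI.

```python
def name_must_contain_space(value, values):
    return None if " " in value else "must contain a space"


def passwords_match(value, values):
    # Mirrors the pydantic rule: only compare when password1 is available.
    if "password1" in values and value != values["password1"]:
        return "passwords do not match"
    return None


def username_alphanumeric(value, values):
    return None if value.isalnum() else "must be alphanumeric"


# Hypothetical registry: rule name -> function(value, all_values) that returns
# an error message, or None when the rule passes.
RULES = {
    "name_must_contain_space": name_must_contain_space,
    "passwords_match": passwords_match,
    "username_alphanumeric": username_alphanumeric,
}


def run_named_validators(schema: dict, data: dict) -> dict:
    """Run each rule named in schema["validators"] against the matching field."""
    errors = {}
    for field, rule_name in schema.get("validators", {}).items():
        rule = RULES.get(rule_name)
        if rule is None:
            errors[field] = f"unknown validator: {rule_name}"
            continue
        if field not in data:
            continue  # missing fields are left to the schema's "required" check
        message = rule(data[field], data)
        if message is not None:
            errors[field] = message
    return errors


schema = {
    "validators": {
        "name": "name_must_contain_space",
        "password2": "passwords_match",
        "username": "username_alphanumeric",
    }
}
data = {"name": "JohnSmith", "username": "jsmith", "password1": "a", "password2": "b"}
print(run_named_validators(schema, data))
```

For this input the result flags `name` and `password2` and leaves `username` out, because `jsmith` is alphanumeric.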