coralproject / shelf

Open source services powering site communities (comments, forms, etc.).

Required/validation for query_set variables with descriptive logging/error response #26

Closed jde closed 8 years ago

jde commented 8 years ago

Apply basic validation to query_set variables, as defined in the query_set document.

Validation options:

When validation fails, pass the variable requirements back to the client in the error message and log the error. This will be essential to front-end debugging flows and to our ability to recognize users who are using our services improperly.
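
A rough sketch of what that error response might look like, assuming a JSON API; the VarError type and the field names here are hypothetical, not part of the current service:

package validate

import (
    "encoding/json"
    "log"
    "net/http"
)

// VarError describes one query_set variable that failed validation.
type VarError struct {
    Name     string `json:"name"`
    Required bool   `json:"required"`
    Regex    string `json:"regex,omitempty"`
    Reason   string `json:"reason"`
}

// respondInvalid logs the failure and passes the variable requirements
// back to the client so front-end debugging has something to work with.
func respondInvalid(w http.ResponseWriter, errs []VarError) {
    log.Printf("query_set validation failed: %+v", errs)

    w.Header().Set("Content-Type", "application/json")
    w.WriteHeader(http.StatusBadRequest)
    json.NewEncoder(w).Encode(map[string]interface{}{
        "error":     "invalid query_set variables",
        "variables": errs,
    })
}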

ardan-bkennedy commented 8 years ago

We do have the beginnings of this. A query set has a field called params that is used to validate that all of the required variables are passed. It would not take much to extend this with extra validation.
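
For illustration, that check might look roughly like this; the Param type and field names are assumptions about the shape of the params field, not the actual shelf types:

package query

import "fmt"

// Param describes one variable a query set expects. RegexName is the
// proposed extension; the rest mirrors the existing required-variable check.
type Param struct {
    Name      string
    RegexName string // name of a stored regex to validate the value against
}

// ValidateVars checks that every declared parameter was supplied.
func ValidateVars(params []Param, vars map[string]string) error {
    for _, p := range params {
        if _, ok := vars[p.Name]; !ok {
            return fmt.Errorf("missing required variable %q", p.Name)
        }
    }
    return nil
}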

ardan-bkennedy commented 8 years ago

@jde I want to create a collection of regular expressions that can be updated by the tooling. Something simple like this:

alpha        = "^[a-zA-Z]+$"
alphaNumeric = "^[a-zA-Z0-9]+$"
numeric      = "^[-+]?[0-9]+(?:\\.[0-9]+)?$"

Then provide that name as part of the parameters. If a parameter is listed, it is always required.
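
A minimal sketch of that idea, with the named patterns held in a plain map; nothing here reflects how they would actually be stored or loaded:

package query

import (
    "fmt"
    "regexp"
)

// Named patterns, compiled once. In practice these would be loaded
// from the tooling/DB rather than hard coded.
var patterns = map[string]*regexp.Regexp{
    "alpha":        regexp.MustCompile("^[a-zA-Z]+$"),
    "alphaNumeric": regexp.MustCompile("^[a-zA-Z0-9]+$"),
    "numeric":      regexp.MustCompile(`^[-+]?[0-9]+(?:\.[0-9]+)?$`),
}

// checkVar validates a single variable against a named pattern.
// A listed parameter is always required, so an empty value fails.
func checkVar(name, value, regexName string) error {
    if value == "" {
        return fmt.Errorf("variable %q is required", name)
    }
    re, ok := patterns[regexName]
    if !ok {
        return fmt.Errorf("unknown regex %q for variable %q", regexName, name)
    }
    if !re.MatchString(value) {
        return fmt.Errorf("variable %q does not match %q", name, regexName)
    }
    return nil
}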

The problem is that I want to cache these, because making an extra call to the DB for every query is silly. This is a problem I need to solve for other things, but here it is at least tied to the tooling.

Caching is not the problem; knowing when to update the cache is. I think the cache could be reloaded every X minutes, and we could also expose an endpoint to force a reload, but in a multi-system environment that idea breaks down as well.
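
One possible shape for that cache, reloading on a timer and exposing a Reload method that an endpoint could call to force a refresh; the loader function and interval are placeholders:

package query

import (
    "regexp"
    "sync"
    "time"
)

// regexCache holds compiled patterns and reloads them periodically.
type regexCache struct {
    mu       sync.RWMutex
    patterns map[string]*regexp.Regexp
    load     func() (map[string]*regexp.Regexp, error) // e.g. read from the DB
}

func newRegexCache(load func() (map[string]*regexp.Regexp, error), every time.Duration) *regexCache {
    c := &regexCache{patterns: map[string]*regexp.Regexp{}, load: load}
    c.Reload()
    go func() {
        for range time.Tick(every) {
            c.Reload()
        }
    }()
    return c
}

// Reload refreshes the cache. On failure it keeps serving the old patterns.
func (c *regexCache) Reload() {
    p, err := c.load()
    if err != nil {
        return
    }
    c.mu.Lock()
    c.patterns = p
    c.mu.Unlock()
}

// Lookup returns a compiled pattern by name.
func (c *regexCache) Lookup(name string) (*regexp.Regexp, bool) {
    c.mu.RLock()
    defer c.mu.RUnlock()
    re, ok := c.patterns[name]
    return re, ok
}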

ardan-bkennedy commented 8 years ago

The other idea is to build the list just for the tooling and ask them to place the regex inside the query document. This eliminates lookups and still gives the tooling the support.
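
Sketched as a document field, that alternative could look like this; the field names are invented for illustration:

package query

import "regexp"

// Param with the pattern embedded directly in the query document,
// so no lookup against a shared regex collection is needed.
type Param struct {
    Name  string `json:"name"`
    Regex string `json:"regex"` // e.g. "^[a-zA-Z]+$", authored by the tooling
}

// compile turns the embedded pattern into a matcher; a bad pattern can be
// rejected when the query set is saved rather than at execution time.
func (p Param) compile() (*regexp.Regexp, error) {
    return regexp.Compile(p.Regex)
}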

ardan-bkennedy commented 8 years ago

Created a benchmark to learn more about regex performance:

package query_test

import (
    "regexp"
    "testing"
)

var s = "BillKennedy"
var bv bool

// Pattern compiled once at package load.
var rg1 = regexp.MustCompile("^[a-zA-Z]+$")

// Pre-compiled patterns keyed by the pattern string.
var mp = map[string]*regexp.Regexp{
    "^[a-zA-Z]+$": rg1,
}

// Benchmark1 compiles the pattern on every iteration.
func Benchmark1(b *testing.B) {
    for i := 0; i < b.N; i++ {
        rg := regexp.MustCompile("^[a-zA-Z]+$")
        bv = rg.Match([]byte(s))
    }
}

// Benchmark2 uses regexp.Match, which also compiles on every call.
func Benchmark2(b *testing.B) {
    for i := 0; i < b.N; i++ {
        bv, _ = regexp.Match("^[a-zA-Z]+$", []byte(s))
    }
}

// Benchmark3 uses the pre-compiled package-level pattern directly.
func Benchmark3(b *testing.B) {
    for i := 0; i < b.N; i++ {
        bv = rg1.Match([]byte(s))
    }
}

// Benchmark4 looks up the pre-compiled pattern in a map, then matches.
func Benchmark4(b *testing.B) {
    for i := 0; i < b.N; i++ {
        rg2 := mp["^[a-zA-Z]+$"]
        bv = rg2.Match([]byte(s))
    }
}

$ go test -run none -bench . -benchmem
PASS
Benchmark1-8      200000          6848 ns/op        3824 B/op         54 allocs/op
Benchmark2-8      200000          6997 ns/op        3824 B/op         54 allocs/op
Benchmark3-8     3000000           509 ns/op          16 B/op          1 allocs/op
Benchmark4-8     3000000           503 ns/op          16 B/op          1 allocs/op
ok      github.com/coralproject/xenia/pkg/query 7.204s

It looks like we can create a map of regex strings, compile each one a single time, and then use a map lookup. That is Benchmark4. This goes back to loading the map from the DB and leveraging it as a cache. The key question is still when and how that cache reloads itself.

ardan-bkennedy commented 8 years ago

Adding a regex package with CRUD support. Then I will work on issue 45 to provide support for using these regexes.
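
A sketch of what that package's surface could look like; the types and method names here are guesses, not the actual shelf API:

package regex

import "context"

// Regex is a stored, named pattern that query sets can reference.
type Regex struct {
    Name    string `bson:"name" json:"name"`
    Pattern string `bson:"pattern" json:"pattern"`
}

// Store describes the CRUD operations the package would expose.
type Store interface {
    Create(ctx context.Context, r Regex) error
    Get(ctx context.Context, name string) (Regex, error)
    Update(ctx context.Context, r Regex) error
    Delete(ctx context.Context, name string) error
}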