sqids / sqids-go

Official Go port of Sqids. Generate short unique IDs from numbers.
https://sqids.org/go
MIT License
536 stars 11 forks source link

blocklist: Short blocked words seems to not work correctly. #11

Closed peterhellberg closed 1 year ago

peterhellberg commented 1 year ago

This logic seems to be wrong, if you for example want to block the word lad and the id is longer than 3 characters: https://github.com/sqids/sqids-go/blob/05bcc62405932a1e046f6bf01c9118d98ae2467d/sqids.go#L338

4kimov commented 1 year ago

This strange-looking logic comes from writing a lot of scripts to try and find the balance between visually not producing inappropriate ids and at the same time not having too many blocked matches (because they're expensive, since encoding would restart every time).

If I remember correctly, the 3 char limit comes from getting too many matches. Basically if you have a 3 char word and you're trying to match it as a substring, it'll match so often (esp the higher you go) that the loop will rarely exit, or eventually throw an overflow error. That's where the logic to block it completely or not at all comes from (if <= 3)

peterhellberg commented 1 year ago

Ok, was just surprised that with the latest tag (v0.2.0) a word such as Ass can not be blocklisted:

package main

import (
    "fmt"
    "strings"

    "github.com/sqids/sqids-go"
)

func main() {
    s, err := sqids.NewCustom(sqids.Options{
        Blocklist: &[]string{"Ass"},
    })
    if err != nil {
        panic(err)
    }

    for i := uint64(0); i < 1_000_000; i++ {
        id, err := s.Encode([]uint64{i})
        if err == nil {
            if strings.Contains(id, "Ass") {
                fmt.Printf("[]uint64{%d} -> %q\n", i, id)
            }
        }
    }
}
$ go run main.go 
[]uint64{187613} -> "UAss"
[]uint64{205424} -> "RAssT"
[]uint64{297385} -> "raAss"
[]uint64{414299} -> "eAsss"
[]uint64{623217} -> "6AssR"
[]uint64{690438} -> "OTAss"
[]uint64{832166} -> "gAssl"
[]uint64{917063} -> "COAss"
4kimov commented 1 year ago

Yep, that's right - these would be the results. I'd enjoy seeing a more acceptable blocklist filter, but so far I struggle to come up with anything better. If anyone has suggestions, please open an issue in the spec repo.