pkoukk / tiktoken-go

go version of tiktoken
MIT License

tiktoken-go getEmbedding isn't thread-safe #23

Closed tbiehn closed 1 year ago

tbiehn commented 1 year ago

Howdy. The getEncoding function reads and writes the package-level ENCODING_MAP without any synchronization, so concurrent calls can race on the map:

    func getEncoding(encodingName string) (*Encoding, error) {
        encoding, ok := ENCODING_MAP[encodingName]
        if !ok {
            initEncoding, err := initEncoding(encodingName)
            if err != nil {
                return nil, err
            }
            encoding = initEncoding
            ENCODING_MAP[encodingName] = encoding
        }
        return encoding, nil
    }

There may be other issues in the package that make it unsafe to use from multiple goroutines, which is unexpected, since callers appear to be picking up unique instances via tiktoken.EncodingForModel(model). It might be worth moving this (and the other) globals into a struct.
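One possible fix, sketched here under stated assumptions rather than taken from the package itself, is to guard the map with a mutex so that the lookup and the insert happen as one atomic step. The Encoding type and the initEncoding loader below are simplified, hypothetical stand-ins for the real ones, and encodingMu and encodingMap are illustrative names:

```go
package main

import (
	"fmt"
	"sync"
)

// Encoding is a simplified, hypothetical stand-in for the package's Encoding type.
type Encoding struct {
	Name string
}

// initEncoding is a hypothetical stand-in for the package's loader.
func initEncoding(name string) (*Encoding, error) {
	if name == "" {
		return nil, fmt.Errorf("empty encoding name")
	}
	return &Encoding{Name: name}, nil
}

var (
	encodingMu  sync.Mutex                // guards encodingMap
	encodingMap = map[string]*Encoding{} // illustrative replacement for ENCODING_MAP
)

// getEncoding returns the cached encoding for name, creating it on first use.
// The lock is held across both the lookup and the insert, so two goroutines
// can never touch the map at the same time or initialize the same entry twice.
func getEncoding(name string) (*Encoding, error) {
	encodingMu.Lock()
	defer encodingMu.Unlock()

	if enc, ok := encodingMap[name]; ok {
		return enc, nil
	}
	enc, err := initEncoding(name)
	if err != nil {
		return nil, err
	}
	encodingMap[name] = enc
	return enc, nil
}

func main() {
	var wg sync.WaitGroup
	results := make([]*Encoding, 8)
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			enc, err := getEncoding("cl100k_base")
			if err != nil {
				panic(err)
			}
			results[i] = enc // each goroutine writes its own slot
		}(i)
	}
	wg.Wait()
	// Every goroutine should have received the same cached instance.
	fmt.Println(results[0] == results[7])
}
```

Running a program like this with go run -race is a quick way to check that the data race is gone. A sync.RWMutex or sync.Map would reduce contention on the read path, and wrapping the mutex and map in a struct would address the suggestion above about moving globals out of package scope.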