daulet / tokenizers

Go bindings for HuggingFace Tokenizer
MIT License

segfault running example main.go #16

Closed jaybinks closed 3 months ago

jaybinks commented 11 months ago
jaybinks@Jays-Mac-mini example % go run main.go
Vocab size: 30522
SIGSEGV: segmentation violation
PC=0x104359c24 m=0 sigcode=2
signal arrived during cgo execution

goroutine 1 [syscall]:
runtime.cgocall(0x104313b74, 0x14000052ca8)
    /usr/local/go/src/runtime/cgocall.go:157 +0x44 fp=0x14000052c70 sp=0x14000052c30 pc=0x10428e354
github.com/daulet/tokenizers._Cfunc_encode(0x129904660, 0x600001794000, 0x0)
    _cgo_gotypes.go:132 +0x38 fp=0x14000052ca0 sp=0x14000052c70 pc=0x104312a08
github.com/daulet/tokenizers.(*Tokenizer).Encode.func2(0x12?, 0x1?, 0x80?)
    /Users/jaybinks/go/pkg/mod/github.com/daulet/tokenizers@v0.6.0/tokenizer.go:60 +0x64 fp=0x14000052d10 sp=0x14000052ca0 pc=0x104313394
github.com/daulet/tokenizers.(*Tokenizer).Encode(0x104892ea8?, {0x10463a583?, 0x14000052f00?}, 0x2?)
    /Users/jaybinks/go/pkg/mod/github.com/daulet/tokenizers@v0.6.0/tokenizer.go:60 +0x8c fp=0x14000052e40 sp=0x14000052d10 pc=0x10431302c
main.main()
    /Users/jaybinks/src/tokenizers/example/main.go:18 +0xd0 fp=0x14000052f30 sp=0x14000052e40 pc=0x1043137e0
runtime.main()
    /usr/local/go/src/runtime/proc.go:267 +0x2bc fp=0x14000052fd0 sp=0x14000052f30 pc=0x1042bde0c
runtime.goexit()
    /usr/local/go/src/runtime/asm_arm64.s:1197 +0x4 fp=0x14000052fd0 sp=0x14000052fd0 pc=0x1042e92e4

I've tried this on my clean Mac mini M2 as well as on an Intel Mac; both crash at line 18 of main.go, on the first call to tk.Encode().

libtokenizers.a was built with "make build". Go version: go1.21.1 darwin/arm64.

RJKeevil commented 10 months ago

This library requires libtokenizers.a to be placed in the module's SRCDIR, which in your case is /Users/jaybinks/go/pkg/mod/github.com/daulet/tokenizers@v0.6.0/. Did you place your libtokenizers.a file in that path? Also check your versions: the current release is 0.7.0 and your logs show 0.6.0. The Go and Rust files must be in sync, otherwise you tend to get segmentation violations.

I have a PR to make this path configurable, but it hasn't been merged.

daulet commented 3 months ago

Please reopen if this is still an issue.