IAPark / tiktoken_ruby

Unofficial ruby binding for tiktoken by way of rust
MIT License
118 stars 26 forks source link

Fix memoization thread safety #30

Closed mkdynamic closed 6 months ago

mkdynamic commented 6 months ago

I haven’t tested this (edited this PR on my phone) but in theory this should fix the thread safety issue.

ScotterC commented 6 months ago

Thanks for the contribution! I haven't had much experience with writing a spec for thread safety issues. Have a sense of the outline you'd approach it with?

mkdynamic commented 6 months ago

Unit testing reliably this would very tricky. IMO it’s not worth it, and you’d mainly end up testing the Ruby Mutex works, which seems pointless.

This pattern of using a mutex to synchronize a memoized class variable in Ruby is common enough — I would say if the existing test suite passes, call it good.

IAPark commented 6 months ago

Could you run rake standard:fix?

I'm guessing we could do this with less locking, maybe eager looking in the cache before locking, but probably safest to just use a mutex for the whole thing

As to testing I agree that there aren't really great options. I think I've mostly heard of people using locks under the hood to make sure the race condition happens, but I think I'm comfortable if all the specs pass with this.

andreibondarev commented 6 months ago

Relevant issue and comment: https://github.com/andreibondarev/langchainrb/issues/446#issue-2098902544