jmhodges / gocld3

Apache License 2.0
20 stars 18 forks source link

Invalid memory access #4

Closed sk- closed 5 years ago

sk- commented 5 years ago

The code in https://github.com/jmhodges/gocld3/blob/master/cld3/cld3.cc#L23-L25 has a bug as the address is not stable.

It says:

These strings are statically allocated, so we can do this c_str() without worrying about them going off the stack.`

However, although the const char* strings are indeed static, cld3 returns a new std::string every single time, so it's needed to copy the result, or otherwise you can access garbage.

jmhodges commented 5 years ago

Argh, I missed the implicit conversion. Thanks for the heads up.

I'll get a patch up to fix the immediate problem.

I'm also considering hacking into the Result struct and FindLanguage to get the language_id integer propagated back up to gocld3 for a look up there. We've already got one copy of the language string happening in the Go side of FindLanguage (the C.GoStringN call), and yet another copy and allocation for these tiny strings is annoying.

sk- commented 5 years ago

Great. Take a look at the changes I made in this gRPC service. https://github.com/sk-/weslango/tree/master/third_party/gocld3

I did not want to make any changes in the cld3 library though.

jmhodges commented 5 years ago

Ah, the commits are very large in that repo. What exactly would you like to me to see?

sk- commented 5 years ago

You can take a look at the changes I made for the gocld3 part. I added a bunch of tests, normalized the output, by always returning a 3 letter ISO code and fixing the one used for Hebrew. Moving the Latin qualifier to a struct field. Also simplified the error checking by using signed types.