Closed pwaller closed 7 years ago
I wrote this issue on the assumption that c-for-go expects you to null terminate the strings you pass to the generated API. I actually can't find any advice on that front, but observe that all sorts of things don't work unless you null-terminate the strings you feed into the API. So maybe it's supposed to work and I'm encountering a bug, instead.
Hi!
Yes that's a common mistake. I usually address that when creating a wrapper for the generated code,
when I want to simplify the API and introduce objects. As an example see String
https://github.com/xlab/pocketsphinx-go/blob/master/sphinx/helpers.go#L42 which is used to make
params safe https://github.com/xlab/pocketsphinx-go/blob/master/sphinx/ngram.go#L316 so user will
working with sphinx.String
and know that it's safe.
I wrote this issue on the assumption that c-for-go expects you to null terminate the strings you pass to the generated API.
Yes, this is so by design. The point there is that c-for-go generates its binding on the API
interface level, not the data level. I other worlds, it does rely on types and method signatures but
doesn't go into data context. It doesn't modify or copy data except when unavoidable.
Passing raw things like string
or []byte
is a 0-allocation operation, memory directly referenced
from Go to C.
This is mainly because I wanted to avoid any overhead and hidden effects. Many C functions out there are waiting for non-null terminated strings, knowing their length. And if they require the null-termination, that's up to user to decide how to properly do that. Otherwise you will introduce an extra allocation in the generated code, what I tried to avoid.
It's about safety VS performance, I chose the latter to be the base layer, and the safety can be enforced later as an additional layer written by humans, for humans.
Yes that's a common mistake.
It might be worth documenting it clearly somewhere. I scoured the docs a few times and didn't find anything.
The problem though is that even knowing what is correct I still make the mistake.
It's about safety VS performance, I chose the latter to be the base layer, and the safety can be enforced later as an additional layer written by humans, for humans.
I understand your motivation. However, for strings specifically, this means that the "additional layer" would have to be almost as big as the first layer. Which seems to me to defeat the idea of doing code generation in the first place. See for instance the python C-API. It's huge, and very many functions take strings, all of which must be 0-terminated. https://docs.python.org/3/c-api/index.html
I'm not asking you to change the default, I'm asking if there is room in the generator to automate this with a new PtrType. Any thoughts on that? It would eliminate a lot of manual and error-prone work.
I agree. I will add this as a global preference of generator.
Great. Though I hope it is still possible to override it in specific cases where performance is a concern, or for specific cases which require null terminated strings.
@pwaller the override can be achieved by taking out the code into a separate file and building the hand-made optimisations upon generated code.
I'm not making this feature into the tip system, because it's very easy to create a mess of strings that are null-terminated automatically and other cases that require manual null-termination. So in order to use API safely, the user will be forced to look into guts of binding code (not the interface, but cgo_helpers.go
) and study whether the strings are terminated in some particular functions. This is wrong.
How about introducing a new type:
type CString string
Then it would be obvious at the point of use that you're doing something strange:
cpkg.Function(cpkg.CString(s))
The type cast should be "free" but annotates what is happening in a fairly obvious manner.
(and is impossible to accidentally miss because the type-checker will complain)
Type-cast won't guarantee safeness, this doesn't contribute to solution but simply makes the resulting API harder to use. I want to use string
, it's a part of the language.
We don't need annotations there, its either safe strings across all package, or not.
@pwaller your wish is granted. I also re-generated my Nuklear package for UI which also benefits a lot from the automatic null termination approach.
GENERATOR:
Options:
SafeStrings: true
happy hacking! 🍻
Looks great, thanks :)
your wish is granted.
Thanks for adding this. Took a long time to find this issue - but I'm pleased to see that I don't have to NUL-terminate my Go strings!
Hi!
I've had some success automatically generating bindings to the CPython API. A common mistake I'm making though in using the generated bindings is forgetting to append a null character to strings that I feed in. Stuff breaks in interesting and non-obvious ways when you do that.
What do you think about having a PtrTips which allows specifying that the generated bindings should append a null to the passed string before passing it onwards as a null pointer?
I have contemplated writing more go-idiomatic wrappers which append the null, but this really is against the spirit of automatically generating bindings... :)