Open isaacsanders opened 5 years ago
Hi @isaacsanders. Blindly copying all binaries is an anti-pattern because it causes unnecessary copying. This is especially true in newer versions of the BEAM VM where binaries can be more efficiently handled. If we want to support this it should via schema annotations so that we only copy specific binaries that are causing issues.
Silently accruing refc binaries also feels like an anti-pattern that my team's use case has to always work around. We prefer to copy because our fields are almost always smaller than the refc limit.
For reference see the guide line comment in the docs (http://erlang.org/doc/man/binary.html#copy-1) by the OTP team. We could definitely support an annotation at any level of a schema to binary copy anything underneath that. This might be per file, per struct, per field etc.
Yeah, we’ve read that. We have GB size/volume of binaries that were getting parsed by JSON libraries and not copying, which were handled by long lived processes. Most of our data is from uninterruptible streaming processes. Most of the Elixir community thinks about their performance differently than we do, in the three years we’ve been using it.
This is an advanced request. I know this is your due diligence.
Sure for clarity we don't want to pay the runtime overhead of always copy by default, checking an option at runtime to handle this, or generating twice the code. Once we have a new AST will try to fill do this feature as proof that AST/annotations works as we hope ;).
No worries, I hope it didn’t seem I was asking for a default. :)
If you want to fork in the meantime, I think enforcing the copy is changing 5-10 lines in https://github.com/pinterest/elixir-thrift/blob/master/lib/thrift/generator/struct_binary_protocol.ex.
Hey all,
I have been working with this library for the past month or so, and it recently it occurred to me that the thrift messages may be using references to sub-binaries.
this is potentially fine for most people, but If one is process larger amounts of data and they want to copy the binaries instead, there isn't really a recourse for that currently.
I don't know if anyone who works on this cares, so if not, let me know and I might work on it myself at some point.