Closed janvanbouwel closed 1 year ago
I am having the same trouble on linux armv7l
infrastructure with java code that was working previously on 1.8.4. For now, neither of 1.8.4 and 1.8.10 is working for me and the moment I send listener.connect
from one of the clients the server crashes with one of the malloc()
errors reported above. I am not sure if the submodules from 1.8.10 were used on 1.8.4 by mistake when I switched the branch. I will try a fresh build using 1.8.4 and report my results. Thanks for reporting this!
I did a git bisect on the ZTO submodule. (Found the free error again, and another one:
double free or corruption (!prev)
corrupted size vs. prev_size while consolidating
)
In some revisions the errors seemed to be less consistent (say 1 in 2 instead of 5 in 6 runs), but the first breaking commit appears to be https://github.com/zerotier/ZeroTierOne/commit/e9375b50b0b7228f77dc8ef3d99d1021f89675ff.
I don't have time right now to look at it more, but it's a start.
I confirmed today that using the v1.8.4 submodule fixed the issue for me which is weird because v1.8.10 was working for me in November which seems to be after the submodule was updated to v1.8.10 last May. Hoping there are no breaking fixes to the java binding since then which could surface later on for me...
Promising lead:
The aforementioned commit https://github.com/zerotier/ZeroTierOne/commit/e9375b50b0b7228f77dc8ef3d99d1021f89675ff sets
#define ZT_MAX_PEER_NETWORK_PATHS 64
while in libzt there is:
This seems to be a discrepancy (although they have a slightly different name). I'll test in about 2 hours whether setting the latter to 64 fixes the issue.
Good news, not tested super extensively but this seems to fix it, at least for Java!
For Rust I assume the same change has to be made, in:
For Java, rather confusingly, this is also included in https://github.com/zerotier/libzt/blob/8d21a265cc23dd6e6e4d2c2ad068e978f110f8e3/src/bindings/java/ZeroTierNative.java#L410 but the only reference to it is https://github.com/zerotier/libzt/blob/8d21a265cc23dd6e6e4d2c2ad068e978f110f8e3/src/bindings/java/ZeroTierPeerDetails.java#L60 and this class doesn't seem to be used.
I have found three other inconsistencies in ZeroTierSockets.h
The first two: https://github.com/zerotier/libzt/blob/8d21a265cc23dd6e6e4d2c2ad068e978f110f8e3/include/ZeroTierSockets.h#L327-L336
In https://github.com/zerotier/ZeroTierOne/commit/440568a5165181c5ae602e4edca2a73202bfd453 these were changed to 128 and 32 respectively.
which is set to 4096 in ZeroTierOne, although this one has been there since it was introduced, and may be intentional.
As far as I'm aware I've not run into issues with these, but they could form a problem.
I'm gonna test a bit more (and I'll try to get to the Rust version as well), and if it works I'll set up a PR.
It does seem strange to me that these constants are redefined and the ones in https://github.com/zerotier/ZeroTierOne are not used, but I'm not that skilled at C to know the reasoning behind it. Maybe a project maintainer could shine some light on this?
which is weird because v1.8.10 was working for me in November
As it is a memory issue, maybe your network topology changed?
Yeah these constants should match. The fact that they currently don't is just that I must have forgotten to update them in this project when we bumped up the number of allowed paths in ZeroTierOne
. I'll see about either updating them or just using the regular ZeroTierOne
constants.
The reasoning was that we wanted to use a smaller subset of the definitions from ZeroTierOne.h
since may of them shouldn't be exposed alongside libzt's API
I'm going to close this issue since your PR addresses this. Feel free to open again if the issue persists.
I've been trying to integrate libzt in a project, but I have issues when trying to use the latest versions. Version 1.8.4 seems to work fine. Version 1.8.10 crashes the moment I try to do anything network related, like pinging the program or creating a socket. Very rarely I am able to ping it, in which case it works for at most a minute before still crashing. The message I get is one of the following two. (
free()
has also been mentioned, but not as often, and I don't remember the exact message.)This happens in exactly the same way with both Rust and Java bindings. My system is
Linux 5.15.0-58-generic #64-Ubuntu SMP x86_64 x86_64 x86_64 GNU/Linux
, runningUbuntu 22.04.1 LTS
.The breaking commit appears to be
0bf302a581f641b1ac5a1915e4cc59cde4ae8cdb
, which updates the ZTO submodule to v1.8.10. Before that it works. Later commits and later versions don't fix the problem.For now I can continue using v1.8.4, but I was considering contributing and that's difficult if I can't try the latest version.