Open bramp opened 2 years ago
I searched the source for that string: it means mtr-packet got an unexpected error and has passed that on to the main program.
If you comment out line 652 in cmdpipe.c, then it shouldn't stop. On the other hand, the code after line 688 will assume the program has quit when there has been an error. Ah. A few lines down it will be ignored.
This should tell us if the error is a really fatal one. Does it occasionally go wrong or has something gone bad and it will never recover?
On line 652, if you put in an fprintf(stderr, ...) of reply->argument_value[0], that might give a hint as to what's going on.
Sorry for the slow response. I compiled the code, adding additional debugging statements... However, since then I've been unable to reproduce the "mtr: Unexpected mtr-packet error" issue. I'd be happy to send a PR to permanently log more useful information in this situation.
No worries.
Outputting the error code would at least give us a hint when things go wrong.
One of my frustrations is that Windows (at least at one point in time) said "could not display page" when something went wrong with displaying a web page. That could be "out of memory while rendering the page" or "your interface is down" or "no route to host" or "connection refused at the destination". Each, if that had been the error message, would elicit a different response in trying to fix the problem.
Linux is usually a lot better at giving a hint as to what's wrong. mtr should try not to deviate from the pattern. :-)
More useful information would be better, so yes, a PR would be appreciated.
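As a rough idea of what "outputting the error code" could look like, here is a minimal sketch. format_packet_error is a hypothetical helper, not something that exists in mtr, and how the error number actually reaches the main program from mtr-packet is left out.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of mtr): build an error report that
 * carries both the numeric code and the strerror() text, so the user
 * sees why mtr-packet failed instead of only "unexpected error".
 * Returns what snprintf returns: the length the full message needs.
 */
int format_packet_error(char *buf, size_t len, int err)
{
    return snprintf(buf, len,
                    "mtr: Unexpected mtr-packet error (errno = %d: %s)",
                    err, strerror(err));
}
```

The caller would print the buffer to stderr before exiting. The numeric value is kept next to the text because the numbers differ between platforms.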
OK, finally, after days of trying to repeat this, I got one error: "errno = 55" (No buffer space available). I don't know if that's my original problem or a new one, but it's at least an unhandled error. A quick Google search shows this seems to be some weird condition with non-blocking UDP requests on Macs.
So, hypothesis: when you request "nonblocking" IO, the kernel takes that very literally and WILL NOT block, say, to allocate some memory. (The "please don't block" is passed down to the memory allocation function, and that one says: "sorry, but can't help you today under that restriction".)
So a fix would be to ignore that error unless it happens too often. Say: if (errno == 55) { errstatus = (errstatus * 9) / 10; errstatus += 100; if (errstatus > 500) ... pass the error on, causing mtr to exit }. The trouble is doing the accounting correctly. Maybe report to mtr: "Sorry, packet didn't get sent", so that the mtr accounting part doesn't count this as "failure on the link to that host".
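The decaying counter above could be sketched roughly like this. The struct and function names are made up for illustration and are not in mtr. It uses ENOBUFS instead of the literal 55, since 55 is the macOS/BSD value and the number differs elsewhere. One assumption beyond the pseudocode: the decay is applied on every send attempt, not only on failures, so that occasional errors fade away and only a burst trips the limit.

```c
#include <errno.h>
#include <stdbool.h>

#define ERR_PENALTY 100 /* added for each ENOBUFS */
#define ERR_LIMIT   500 /* above this, give up    */

/* Hypothetical state; start with score = 0. */
struct send_error_tracker {
    int score;
};

/*
 * Call once per send attempt with 0 on success or the errno value on
 * failure. Returns true when the error should be passed on as fatal.
 */
bool tracker_note_send(struct send_error_tracker *t, int err)
{
    /* Decay on every attempt (assumption, see the text above). */
    t->score = (t->score * 9) / 10;

    if (err == 0)
        return false;   /* packet went out fine */
    if (err != ENOBUFS)
        return true;    /* anything else stays fatal, as before */

    t->score += ERR_PENALTY;
    return t->score > ERR_LIMIT;
}
```

With these numbers, six ENOBUFS errors in a row are tolerated and the seventh is reported as fatal. It does not address the accounting question: the caller would still need some way to tell mtr that the probe was never sent.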
mtr has previously worked great for me, but in the last week it has been failing within minutes with the error
mtr: Unexpected mtr-packet error
Example:
It is not clear to me what triggers it to fail. I took a packet capture and saw no unusual ICMP or other packets that would cause this. I can pretty reliably reproduce this within a minute of running the command.
I'm happy to try and debug this.