Closed jousley closed 7 years ago
I tried some simple things (mostly adding Unicode values to keys and strings) to reproduce your error and wasn't able to do so. If you can reproduce this, please try to figure out what values are in NetworkTables at the time, maybe via a screenshot of OutlineViewer or something similar, and look for weird non-ASCII values. Another thing you can do is edit /usr/local/lib/python2.7/dist-packages/ntcore/wire.py and catch that exception, and when it occurs print out the string that's causing the issue.
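A minimal sketch of that kind of instrumentation, not the actual ntcore code: `read_string_debug` and the `io.BytesIO` stand-in stream are hypothetical names used only for illustration, and the only assumption about the real stream is that it has a `read(n)` method returning bytes. The idea is to hold on to the raw bytes so they can be logged when decoding fails.

```python
import io
import logging

logger = logging.getLogger("nt.debug")


def read_string_debug(rstream, slen):
    """Read slen bytes and decode them, logging the raw bytes on failure.

    Hypothetical helper for illustration; it only assumes that rstream
    has a read(n) method returning bytes.
    """
    raw = rstream.read(slen)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # repr() of a bytes object is pure ASCII, so it is safe to log
        # even on consoles that cannot display Unicode.
        logger.error("bad string at offset %d: %r", exc.start, raw)
        raise


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    print(read_string_debug(io.BytesIO(b"vision"), 6))
    try:
        read_string_debug(io.BytesIO(b"ab\xf0cd"), 5)
    except UnicodeDecodeError:
        print("decode failed; see the logged bytes above")
```

Re-raising after logging keeps the original behaviour intact, so the traceback still appears and the log line simply adds the offending bytes.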
Any progress here?
I have come across this issue as well with no solution yet
I'm not currently able to reproduce this bug. If you're able to reproduce it reliably and provide a way for me to do so, I can fix it. Otherwise, upgrade to Python 3 and I suspect the problem will disappear.
Two more reports of this, both on Python 3:
DEBUG:nt:client connected
DEBUG:nt:NetworkConnection stopping (<ntcore.network_connection.NetworkConnection object at 0x712411b0>)
ERROR:nt:Unhandled exception during handshake
Traceback (most recent call last):
File "/home/pi/.virtualenvs/cv/lib/python3.4/site-packages/ntcore/network_connection.py", line 240, in _readThreadMain
handshake_success = self.m_handshake(self, _getMessage, self._sendMessages)
File "/home/pi/.virtualenvs/cv/lib/python3.4/site-packages/ntcore/dispatcher.py", line 488, in _clientHandshake
msg = get_msg()
File "/home/pi/.virtualenvs/cv/lib/python3.4/site-packages/ntcore/network_connection.py", line 228, in _getMessage
return Message.read(self.m_stream, decoder, self.m_get_entry_type)
File "/home/pi/.virtualenvs/cv/lib/python3.4/site-packages/ntcore/message.py", line 123, in read
value = codec.read_value(value_type, rstream)
File "/home/pi/.virtualenvs/cv/lib/python3.4/site-packages/ntcore/wire.py", line 126, in read_value
return Value.makeStringArray([self.read_string(rstream) for _ in range(alen)])
File "/home/pi/.virtualenvs/cv/lib/python3.4/site-packages/ntcore/wire.py", line 126, in <listcomp>
return Value.makeStringArray([self.read_string(rstream) for _ in range(alen)])
File "/home/pi/.virtualenvs/cv/lib/python3.4/site-packages/ntcore/wire.py", line 198, in read_string_v3
return rstream.read(slen).decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf0 in position 47: invalid continuation byte
INFO:nt:DISCONNECTED 10.0.66.2 port 1735 (Robot)
DEBUG:nt:write thread died (<ntcore.network_connection.NetworkConnection object at 0x70088430>)
16:21:39:014 ERROR : nt : Unhandled exception during handshake
Traceback (most recent call last):
File "/usr/local/var/pyenv/versions/dashboard/lib/python3.6/site-packages/ntcore/network_connection.py", line 240, in _readThreadMain
handshake_success = self.m_handshake(self, _getMessage, self._sendMessages)
File "/usr/local/var/pyenv/versions/dashboard/lib/python3.6/site-packages/ntcore/dispatcher.py", line 488, in _clientHandshake
msg = get_msg()
File "/usr/local/var/pyenv/versions/dashboard/lib/python3.6/site-packages/ntcore/network_connection.py", line 228, in _getMessage
return Message.read(self.m_stream, decoder, self.m_get_entry_type)
File "/usr/local/var/pyenv/versions/dashboard/lib/python3.6/site-packages/ntcore/message.py", line 123, in read
value = codec.read_value(value_type, rstream)
File "/usr/local/var/pyenv/versions/dashboard/lib/python3.6/site-packages/ntcore/wire.py", line 126, in read_value
return Value.makeStringArray([self.read_string(rstream) for _ in range(alen)])
File "/usr/local/var/pyenv/versions/dashboard/lib/python3.6/site-packages/ntcore/wire.py", line 126, in <listcomp>
return Value.makeStringArray([self.read_string(rstream) for _ in range(alen)])
File "/usr/local/var/pyenv/versions/dashboard/lib/python3.6/site-packages/ntcore/wire.py", line 198, in read_string_v3
return rstream.read(slen).decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbd in position 89: invalid start byte
16:21:39:014 INFO : nt : DISCONNECTED 10.24.3.2 port 1735 (Robot)
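For readers unfamiliar with the two messages in these tracebacks, here is a small standalone sketch of how each one arises. It says nothing about why the bytes on the wire were malformed; `decode_failure` is a hypothetical helper and the byte strings are made-up samples.

```python
def decode_failure(raw):
    """Return (offset, reason) for the first UTF-8 error in raw, or None."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, exc.reason
    return None


if __name__ == "__main__":
    # 0xf0 opens a four-byte sequence, but the next byte is plain ASCII.
    print(decode_failure(b"ab\xf0cd"))
    # 0xbd is a continuation byte (10xxxxxx) with no lead byte before it.
    print(decode_failure(b"ab\xbdcd"))
    # Well-formed UTF-8, including non-ASCII text, decodes without error.
    print(decode_failure("caf\u00e9".encode("utf-8")))
```

Both messages mean the bytes handed to the decoder were not valid UTF-8 at the reported offset. That is consistent with genuinely non-UTF-8 string data, but also with the reader being misaligned in the stream and decoding bytes that were never meant to be a string.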
There's definitely an issue, but I need more details and need to be able to reproduce this; otherwise I can't help you fix it.
Also, if you are able to reproduce this reliably, something that can help me diagnose this is adding the following code to the top of your main py file where logging is initialized:
https://gist.github.com/virtuald/65eed85ac579000eec14a40f41f47287
Sorry, but we just reverted the code to an earlier version, because we were in a hurry and lost the copy that produced the bug.
So it was something in your code?
I'm curious if this is random, or if something specific in code is causing it.
We don't know
That second example above from an hour ago was us. We're using a custom dashboard running off pynetworktables2js, so it's not likely an issue with our code. And it's also not an instance issue since we were receiving the same issue on multiple computers (one on macOS and one on Windows 7). It seems to be something related to the FMS or the new router firmware, since everything worked fine before the router was flashed with competition firmware. The firmware flash also caused issues with networking with our pis, but that is likely unrelated.
Hope we can figure this out!!
It's possible you're trying to add a weird character to nt, and it's not having it. Anything like that TPG?
Hm, perhaps the packets are getting corrupted somehow by the 2017 router (though, then why isn't ntcore crashing.. maybe it's not trying to encode/decode the characters?). It would be useful to look at OutlineViewer and see if there is any gibberish in that output.
Do you have a 2016 router? Those are legal to use.
@virtuald we should've had problems then too...Sounds like a really, really far edge case
I've looked through our robot and pi code and don't see anything besides floats and alphabetical strings being sent over network tables. So unless SmartDashboard (which we keep backward compatibility with because our dashboard isn't working right now) is doing something weird, we aren't throwing random Unicode in our network tables.
We're currently using the 2016 router with the competition firmware, although we got the same symptoms on the 2017 router.
I just pushed a package to PyPI -- 2017.0.7a1 ... it tells Python to ignore the bad Unicode characters when it sees them. I haven't tried it much, but that may fix the issue for now. I would like to know why the error is occurring though, so if this addresses it, a screenshot of OutlineViewer (or anything else that shows the weird characters) would be useful.
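For context, Python's decoder accepts an error-handling mode. The thread does not show how the 2017.0.7a1 release is implemented, so the sketch below only illustrates the standard-library options that "ignore the bad characters" most plausibly refers to; `decode_lenient` and `RAW` are hypothetical names.

```python
RAW = b"ab\xf0cd"  # made-up sample containing one invalid byte


def decode_lenient(raw, errors="ignore"):
    """Decode UTF-8 without raising; errors may be 'ignore' or 'replace'."""
    return raw.decode("utf-8", errors)


if __name__ == "__main__":
    # 'ignore' silently drops bytes that cannot be decoded.
    print(decode_lenient(RAW))
    # 'replace' substitutes U+FFFD, which keeps the damage visible.
    print(ascii(decode_lenient(RAW, "replace")))
```

Either mode suppresses the exception without fixing whatever produced the bad bytes, so a string decoded this way can still be wrong.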
Could you make the update print out the bad string, or no?
Wouldn't that cause display errors on Windows? I know the Windows command prompt doesn't like displaying Unicode, at least in my experience...
If this fixes it, then we can talk about creating ways to diagnose it further.
Ok, I'll update and try again in the morning :+1:
Same error on the updated PyNetworkTables.
02:54:39:940 DEBUG : nt : NetworkConnection stopping (<ntcore.network_connection.NetworkConnection object at 0x7125e090>)
02:54:39:950 ERROR : nt : Unhandled exception during handshake
Traceback (most recent call last):
File "/home/pi/.virtualenvs/cv/lib/python3.4/site-packages/ntcore/network_connection.py", line 240, in _readThreadMain
handshake_success = self.m_handshake(self, _getMessage, self._sendMessages)
File "/home/pi/.virtualenvs/cv/lib/python3.4/site-packages/ntcore/dispatcher.py", line 488, in _clientHandshake
msg = get_msg()
File "/home/pi/.virtualenvs/cv/lib/python3.4/site-packages/ntcore/network_connection.py", line 228, in _getMessage
return Message.read(self.m_stream, decoder, self.m_get_entry_type)
File "/home/pi/.virtualenvs/cv/lib/python3.4/site-packages/ntcore/message.py", line 123, in read
value = codec.read_value(value_type, rstream)
File "/home/pi/.virtualenvs/cv/lib/python3.4/site-packages/ntcore/wire.py", line 130, in read_value
return Value.makeStringArray([self.read_string(rstream) for _ in range(alen)])
File "/home/pi/.virtualenvs/cv/lib/python3.4/site-packages/ntcore/wire.py", line 130, in <listcomp>
return Value.makeStringArray([self.read_string(rstream) for _ in range(alen)])
File "/home/pi/.virtualenvs/cv/lib/python3.4/site-packages/ntcore/wire.py", line 206, in read_string_v3
return rstream.read(slen).decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf0 in position 68: invalid continuation byte
Locals at innermost frame:
{ 'rstream': <ntcore.tcpsockets.tcp_stream.TCPStream object at 0x6eed2410>,
'self': <ntcore.wire.WireCodec object at 0x7124ae90>,
'slen': 117}
02:54:39:952 INFO : nt : DISCONNECTED 10.0.66.2 port 1735 (Robot)
02:54:39:953 DEBUG : nt : write thread died (<ntcore.network_connection.NetworkConnection object at 0x6eed24f0>)
Here's some information on our setup. RoboRIO static IP set to 10.0.66.2, Bridge is configured for comp mode. Raspberry Pi is static IP set to 10.0.66.12. IP Camera is set to 10.0.66.11. We do have a second USB camera but that's only streaming to smart dashboard. The error happens regardless of SmartDash being opened or robot being enabled. It also happens to my second Pi with the same setup.
Is it only happening to people using rPIs?
@Daltz333 you aren't using the updated version -- read_string_v3 is at line 196: https://github.com/robotpy/pynetworktables/blob/unicode-fix/ntcore/wire.py#L196
You may want to do pip install -U pynetworktables --pre
or pip install pynetworktables==2017.0.7a1
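A quick way to confirm which copy is actually being imported, sketched with the standard library only. `installed_version` and `module_location` are hypothetical helper names, and `importlib.metadata` needs Python 3.8 or newer, so on older interpreters the version lookup returns None.

```python
import importlib.util

try:
    from importlib import metadata  # Python 3.8+
except ImportError:  # older interpreters
    metadata = None


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    if metadata is None:
        return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def module_location(module_name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    print("pynetworktables version:", installed_version("pynetworktables"))
    print("ntcore imported from:", module_location("ntcore"))
```

Running this with the same interpreter (and the same virtualenv) that runs the robot or dashboard code shows whether the pre-release is the one in use.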
@ThePlasmaGuy did it work for you?
@virtuald Here's a stackexchange post I made (before the update). I will update tomorrow and get back to you. http://robotics.stackexchange.com/questions/11840/first-robotics-competition-pynetworktables-nt-thread-died?noredirect=1#comment21072_11840
@virtuald We forgot to test at comp due to the craziness of the last day of competition. However, I did preserve the entire control system out of bag so I can test with the new pynetworktables before we flash the radio for practice.
Any more progress here?
We have yet to have time to set up the test bed due to the fact that our school is on break this week. I'm planning on setting up the test bed with the problematic radio, Rio, and pis when I can get access to the hardware on Monday and Tuesday, so I'll get back to you after that.
I was somehow able (I don't know how I did it) to reproduce the error on my team's test robot. I updated it to the new pynetworktables and the error disappeared. But since I don't know how I reproduced it, I can't guarantee that the update did indeed solve my problem.
Heh. If someone is able to reproduce it, I have a branch locally with code that should be able to record the network stream.... I'm tired though, so I'll push it later.
If you are able to reproduce this error locally, please install the branch at https://github.com/robotpy/pynetworktables/tree/handshake-debug ... basically, you have to:
git clone https://github.com/robotpy/pynetworktables
cd pynetworktables
git checkout handshake-debug
python setup.py sdist
pip3 install -U pynetworktables*.gz
If you do it correctly, then when it crashes there should be a file called 'file.bin' in the directory you launched the code from. Send me an email with that file.
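The branch's code is not shown in this thread. As a rough idea of what recording the network stream can look like, here is a sketch of a wrapper that copies every byte it reads into a file. `RecordingStream` is a hypothetical class, and the only assumption about the wrapped object is a `read(n)` method returning bytes.

```python
import io
import os
import tempfile


class RecordingStream:
    """Wrap a byte stream and append everything read from it to a file."""

    def __init__(self, stream, path):
        self._stream = stream
        self._log = open(path, "wb")

    def read(self, size):
        data = self._stream.read(size)
        self._log.write(data)
        # Flush on every read so the file is complete even after a crash.
        self._log.flush()
        return data

    def close(self):
        self._log.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "file.bin")
    stream = RecordingStream(io.BytesIO(b"\x00\x05ab\xf0cd"), path)
    stream.read(2)  # e.g. a length prefix
    stream.read(5)  # e.g. the string payload
    stream.close()
    with open(path, "rb") as fh:
        print(repr(fh.read()))
```

A capture like this lets the exact byte sequence that triggered the decode error be replayed offline.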
Update: Things seem to be working right now. I'm running the most recent beta pynetworktables, but not even getting any logs so I don't know. However, I'm currently running off a little testbed setup without our raspberry pis, so idk if that's the thing causing the issue. I'll try setting that up and testing with that variable as well in the coming days, once we get our practice bot up and operational with the competition hardware.
What I found at our competition is that this error only occurs when the Rio is functioning as the server. If the server is the driver station, then it works fine and publishes data. We use a Jetson TX1. Also, the Rio cannot connect to another device as a client. Our robot is still flashed from competition, so I will use the branch this weekend.
Another thing is we use python 2.7 and it was working at home just fine.
Is this reproducible in the pits with a field-configured radio, or only on the field? Maybe I can borrow a radio for a bit to try and reproduce this.
It is reproducible in the pit
@denchief1 are you running the most recent version of pynetworktables? (The betas posted above) Is the TX1 the only device on the network? For us, the most recent beta seemed to fix things, although we didn't have our pis on the network at the time.
Are you getting the unicode error when trying to connect to the rio, or is it just not able to post? We found that our raspberry pis were unable to connect to the rio when they were running off a separate radio port from the rio, and they worked fine when we ran the rio and the pis off a switch off a single radio port.
We are running the betas. I no longer get the Unicode error; however, the Jetson still cannot connect. The patch just seems to suppress the error. Our network is the Rio running to the first radio port and then a switch running from the radio. The Jetson is plugged into the switch.
I would try running the switch off the primary port and plugging the roborio into another port on the switch. That's legal as of last week's game update and it's allowed things to work with our raspberry pis for us. We were still having custom dashboard issues, but the beta has seemed to fix that. The two router ports apparently use different protocols and that has caused issues with certain devices iirc.
If you're running the most recent beta that @virtuald posted a couple days ago and it's actually still causing errors, then it should be generating a log file that you should send him.
Is that the branch? Or is it the pynetworktables pypi file?
I will try the plugging of the Rio into the switch during our unbag time. (We are in Michigan)
The version with the log file is the one you have to build yourself off the branch:
If you are able to reproduce this error locally, please install the branch at https://github.com/robotpy/pynetworktables/tree/handshake-debug ... basically, you have to:
git clone https://github.com/robotpy/pynetworktables
cd pynetworktables
git checkout handshake-debug
python setup.py sdist
copy dist/pynetworktables*.gz to your system
pip3 install -U pynetworktables*.gz
If you do it correctly, then when it crashes there should be a file called 'file.bin' in the directory you launched the code from. Send me an email with that file.
Yeah, we tested the switch during week 2 and it seemed to work fine. Our particular switch was giving us a few comms issues, but from talking to other teams who use that "roborio into switch" setup (The Highrollers (987) in particular), that's just an issue with our switch.
If you're running into any connectivity issues, try to switch everything to fixed IPs first (10.TE.AM.x with netmask 255.0.0.0). The robot should be .2, the DS .5, and everything else arbitrary above .5.
Everything has to work pretty much perfectly for mDNS to work.
Peter
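As a small worked example of the 10.TE.AM.x scheme mentioned above: the team number is split into its hundreds part (TE) and the remaining two digits (AM). `team_ip` is a hypothetical helper; under this scheme the robot addresses in the logs earlier in this thread, 10.0.66.2 and 10.24.3.2, correspond to team numbers 66 and 2403.

```python
def team_ip(team, host):
    """Return the 10.TE.AM.x address for a team number and host octet."""
    if not 1 <= team <= 9999:
        raise ValueError("team number must be between 1 and 9999")
    if not 0 <= host <= 255:
        raise ValueError("host octet must be between 0 and 255")
    return "10.{}.{}.{}".format(team // 100, team % 100, host)


if __name__ == "__main__":
    print(team_ip(66, 2))    # robot address
    print(team_ip(66, 5))    # driver station address
    print(team_ip(2403, 2))  # robot address
```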
We are running static ips. I will try the branch build this week.
Connectivity issues are sometimes happening over the second radio port regardless of static IP addresses, etc simply because of the different protocols used on both ports.
@denchief1 If you can reproduce it and get me that logfile, I would be very happy to see it.
We are having this issue too at competition, running on our TX1. It's been an on-and-off issue for us. I tried installing the beta as described above by @virtuald, but I'm not sure it's working correctly. I get a permission denied error when trying to install the old. I ran with sudo and got other warnings but no permission errors.
We are running everything static and Python 2.7
Update from yesterday:
We hooked the competition radio, rio, and both Raspberry Pis up to our practice bot so it was in the same state as it was at competition 2 weeks ago. While I wasn't getting the issue on my test bed with only the rio and radio hooked up, adding the Raspberry Pis caused the issue to return once more. Because the Raspberry Pis were most definitely triggering the issue, we went into the CV code we were running on the pis and checked all of the network tables code to see if something could be triggering the Unicode nt error. We noticed that we were using .putNumber to send string values over our vision table, and in case this was the issue, we switched those functions to use .putValue instead. Since changing this function over, I haven't been able to get the Unicode nt error.
I'm not sure if it's related, but switching our .putNumber functions to use .putValue instead has seemed to fix things.
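Whether the type mismatch was the real cause is not established in this thread. One cheap way to rule it out is to check the Python type before choosing the put call, as in the sketch below. `FakeTable` is a hypothetical stand-in so the example runs without a robot, `put_checked` is a made-up helper, and the `putNumber`/`putString` method names on the stand-in are assumed to mirror the table API used in the CV code.

```python
import numbers


class FakeTable:
    """Hypothetical stand-in for a NetworkTables table; stores values in a dict."""

    def __init__(self):
        self.values = {}

    def putNumber(self, key, value):
        self.values[key] = float(value)

    def putString(self, key, value):
        self.values[key] = str(value)


def put_checked(table, key, value):
    """Send value with the put method that matches its Python type."""
    # bool is a subclass of int, so exclude it from the numeric branch.
    if isinstance(value, numbers.Real) and not isinstance(value, bool):
        table.putNumber(key, value)
    elif isinstance(value, str):
        table.putString(key, value)
    else:
        raise TypeError("unsupported type for key {!r}: {}".format(
            key, type(value).__name__))


if __name__ == "__main__":
    table = FakeTable()
    put_checked(table, "distance", 1.5)
    put_checked(table, "target", "gear")
    print(table.values)
```

A check like this fails loudly at the call site instead of letting a mistyped value reach the wire.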
When I get back to the shop tomorrow, I'll try switching that back and using the beta version to try to generate some of those log files. (I didn't realize until afterwards that my local pyenv virtual environment was causing me to use the pypi version instead of the beta version when I ran programs in my pynetworktables2js-based Dashboard folder...)
UnicodeDecodeError trying to connect to Network Tables on roboRIO (Java) from Raspberry Pi with Microsoft Lifecam plugged into roboRIO.