robotpy / pyfrc

python3 library designed to make developing RobotPy-based code easier!
MIT License

SFTP hangs when attempting to deploy on OSX 10.11 #35

Closed theopolisme closed 8 years ago

theopolisme commented 8 years ago

Hi, I'm on Team 5045 and we used (and really enjoyed!) RobotPy last year. I upgraded my Mac to 10.11 (possibly related?) a week or so ago, and suddenly I have started running into an error when attempting to deploy code to the robot.

$ python3 robot.py deploy --skip-tests
15:30:26:383 INFO    : wpilib              : WPILib version 2015.0.15
15:30:26:384 INFO    : wpilib              : HAL base version 2015.0.15; sim platform version 2015.0.15
Deploying to robot at 10.50.45.2
NI Linux Real-Time (run mode)

Log in with your NI-Auth credentials.

WPILib version on robot is 2015.0.15
NI Linux Real-Time (run mode)

Log in with your NI-Auth credentials.

sftp> mkdir "/home/lvuser/py"
sftp> put -r "/var/folders/02/b5qrvn193_33457rk8dh0b9r0000gn/T/tmpc6w1yugj/py" "/home/lvuser"
Entering /var/folders/02/b5qrvn193_33457rk8dh0b9r0000gn/T/tmpc6w1yugj/py/
hm 1
ERROR: Command ['/usr/bin/sftp', '-oBatchMode=no', '-oStrictHostKeyChecking=no', '-oUserKnownHostsFile=/dev/null', '-b', '/var/folders/02/b5qrvn193_33457rk8dh0b9r0000gn/T/tmpw6kumwwm', 'lvuser@10.50.45.2'] returned non-zero error status 1

I tried to figure out what was going on but couldn't find anything. I can ssh into the robot just fine. I added some logging to the ssh_exec_pass method (printing after each data = _read(pty_fd)):

$ python3 robot.py deploy --skip-tests
15:39:52:883 INFO    : wpilib              : WPILib version 2015.0.15
15:39:52:883 INFO    : wpilib              : HAL base version 2015.0.15; sim platform version 2015.0.15
Deploying to robot at 10.50.45.2
1 None
2 None
3 b"Warning: Permanently added '10.50.45.2' (ECDSA) to the list of known hosts.\r\n"
1 None
2 None
3 b'NI Linux Real-Time (run mode)\n\nLog in with your NI-Auth credentials.\n\n'
NI Linux Real-Time (run mode)

Log in with your NI-Auth credentials.

1 None
2 b'WPILib version on robot is 2015.0.15\n'
WPILib version on robot is 2015.0.15
3 None
1 b''
2 b''
3 b''
1 None
2 None
3 b"Warning: Permanently added '10.50.45.2' (ECDSA) to the list of known hosts.\r\n"
1 None
2 None
3 b'NI Linux Real-Time (run mode)\n\nLog in with your NI-Auth credentials.\n\n'
NI Linux Real-Time (run mode)

Log in with your NI-Auth credentials.

1 None
2 b'sftp> mkdir "/home/lvuser/py"\n'
sftp> mkdir "/home/lvuser/py"
3 None
1 None
2 b'sftp> put -r "/var/folders/02/b5qrvn193_33457rk8dh0b9r0000gn/T/tmpdquo6x2f/py" "/home/lvuser"\n'
sftp> put -r "/var/folders/02/b5qrvn193_33457rk8dh0b9r0000gn/T/tmpdquo6x2f/py" "/home/lvuser"
3 None
1 None
2 b'Entering /var/folders/02/b5qrvn193_33457rk8dh0b9r0000gn/T/tmpdquo6x2f/py/\n'
Entering /var/folders/02/b5qrvn193_33457rk8dh0b9r0000gn/T/tmpdquo6x2f/py/
3 None
1 None
2 b''
3 b''
1 b''
2 None
3 None
sftp 1 bytearray(b'')
hm 1
ERROR: Command ['/usr/bin/sftp', '-oBatchMode=no', '-oStrictHostKeyChecking=no', '-oUserKnownHostsFile=/dev/null', '-b', '/var/folders/02/b5qrvn193_33457rk8dh0b9r0000gn/T/tmpdgwnjzio', 'lvuser@10.50.45.2'] returned non-zero error status 1

It looks like the sftp operation is somehow hanging after entering the directory on the local machine? I don't know how to resolve this, but I'm wondering if someone else on OSX 10.11 may be able to replicate it. Maybe there's a protocol mismatch or something between what's installed on the RoboRIO and what's on my machine?

Any ideas? We're in a tough spot as we're scheduled for a demo on Friday and right now the robot isn't operational. Any help would be much appreciated.

Cheers, Theo & Team 5045

virtuald commented 8 years ago

The upgrade is almost certainly related. I haven't upgraded to 10.11 yet... I'm sure I will soonish, but I won't have access to a RoboRIO in time to diagnose/solve your problem.

One option you have, of course, is using the manual install/run steps from the robotpy documentation. It's not super convenient, but it works.

That option only works while you're ssh'ed into the robot, and isn't permanent. If you read cli_deploy.py, you can perform the same steps it does to deploy your code permanently. Here's what you need to do if you've deployed to that robot before:

... and reset your robot, and that should do it. Certainly not as convenient, but you won't be dead in the water for your demo.

For debugging purposes, I would be interested in knowing the output of the following if you insert it at line 427 of installer.py:

with open(bfname, 'r') as fp:
    print(fp.read())

I would also be interested in knowing what the output of the help command for 'put' is if you log in to the robot manually using sftp.

I also wonder if perhaps the temporary directory (/var/folders/...) actually has anything in it, or if there's something strange about creating temporary directories on 10.11.
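One way to check whether the temporary directory actually contains anything is to list it just before the sftp call. The snippet below is only a sketch and is not part of pyfrc: `list_tree` is a hypothetical helper name, and the path passed to it would be whatever temporary directory the deploy step created.

```python
import os


def list_tree(root):
    """Return the sorted relative paths of every file found under root.

    Hypothetical debugging helper, not part of pyfrc.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root))
    return sorted(found)


if __name__ == "__main__":
    import sys

    # Pass the temporary directory shown in the deploy log as the argument.
    for path in list_tree(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(path)
```

An empty result would point at temporary directory creation rather than at sftp itself.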

theopolisme commented 8 years ago

Thanks for the quick & detailed response! I really appreciate it. I'll check this stuff out tomorrow when I'm back with the robot and will get back to you tomorrow evening.

Theo

computer-whisperer commented 8 years ago

I also noticed this issue recently when I updated my Arch Linux install.

theopolisme commented 8 years ago

For now I just did an scp-based deploy. I'll check out those debugging suggestions of yours soon, just didn't have time today.

virtuald commented 8 years ago

FYI: I just fired up a RoboRIO tonight, and found that I have the same issue on Fedora 22. Interestingly enough, the recursive put doesn't seem to work at all, even when executing it manually.

theopolisme commented 8 years ago

I've temporarily switched the sftp function in installer.py to use scp (https://github.com/theopolisme/pyfrc/commit/1d5c516b75c92a54dd47f48930765677441c50e8) to make auto-deploys work again until this is figured out...
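For readers who have not opened the linked commit, a rough sketch of what an scp-based copy could look like follows. It is not taken from that commit: `build_scp_command` is a hypothetical name, the host-key options simply mirror the ones visible in the sftp command in the log above, and the sketch only builds the argument list without handling the password prompt.

```python
def build_scp_command(local_dir, remote_dir, host, user="lvuser"):
    """Build an argument list for a recursive scp copy.

    Hypothetical helper, not the code from the linked commit.
    """
    return [
        "scp",
        "-r",  # copy the directory recursively
        "-oStrictHostKeyChecking=no",
        "-oUserKnownHostsFile=/dev/null",
        local_dir,
        "%s@%s:%s" % (user, host, remote_dir),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is copied by accident.
    print(" ".join(build_scp_command("py", "/home/lvuser", "10.50.45.2")))
```

The resulting list can be handed to whatever mechanism already runs the sftp command.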

virtuald commented 8 years ago

I will definitely get this fixed by kickoff, but I don't have time at the moment.

virtuald commented 8 years ago

FYI, working on this now, but found there's a bug filed about this behavior of OpenSSH at https://bugzilla.mindrot.org/show_bug.cgi?id=2150

virtuald commented 8 years ago

I give up. I couldn't get put -r to work at all, even with a lot of weird permutations, so I ended up copying the files individually instead. Let me know if this fixes the problem for you.
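Copying files individually can be sketched as generating an sftp batch file with one mkdir per directory and one put per file, instead of a single put -r. This is an illustration under assumptions, not the actual pyfrc change: `build_sftp_batch` is a hypothetical name, and the quoting is naive (paths containing double quotes would break it). The leading `-` on mkdir is sftp's batch-mode prefix that keeps a failing command, such as a directory that already exists, from aborting the batch.

```python
import os
import posixpath


def build_sftp_batch(local_root, remote_root):
    """Return sftp batch lines that copy local_root to remote_root
    one directory and one file at a time, without using `put -r`.

    Hypothetical helper, not the actual pyfrc implementation.
    """
    lines = ['-mkdir "%s"' % remote_root]
    for dirpath, dirnames, filenames in os.walk(local_root):
        dirnames.sort()  # make the traversal order deterministic
        rel = os.path.relpath(dirpath, local_root)
        if rel == ".":
            remote_dir = remote_root
        else:
            remote_dir = posixpath.join(remote_root, *rel.split(os.sep))
        # Create each subdirectory before os.walk descends into it.
        for d in dirnames:
            lines.append('-mkdir "%s"' % posixpath.join(remote_dir, d))
        for f in sorted(filenames):
            local_path = os.path.join(dirpath, f)
            lines.append('put "%s" "%s"' % (local_path, posixpath.join(remote_dir, f)))
    return lines
```

Writing the returned lines to a file, one per line, gives something that could be passed to sftp with -b in place of the batch file containing put -r.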

theopolisme commented 8 years ago

Thanks Dustin! We're in the midst of exams, but I'll be able to give this a shot probably next week once I can get access to the school.