SeattleTestbed / nodemanager

Remote control server for SeattleTestbed nodes
MIT License

Fix SeattleTestbed/nodemanager#115, daemon.py hangs #116

aaaaalbert closed 9 years ago

aaaaalbert commented 9 years ago

See the issue's page, SeattleTestbed/nodemanager#115, for the detailed discussion.

The TL;DR is that the init process (the process that adopts all processes whose parents have died) might not have process ID 1, which our code previously assumed in order to work correctly. This fix lifts that restriction.
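To illustrate the idea, here is a minimal sketch, not the actual patch to daemon.py. It assumes the daemonizing child knows the PID of the process that forked it, and it waits for the parent PID to change instead of waiting for it to become 1. The function name `wait_until_orphaned` and its parameters are hypothetical and chosen for illustration only.

```python
import os
import time


def wait_until_orphaned(original_parent_pid, getppid=os.getppid,
                        poll_interval=0.01, timeout=5.0):
    """Hypothetical helper: block until we have been re-parented.

    Rather than testing `os.getppid() == 1`, compare against the PID of
    the process that forked us. Whatever process adopts orphans on this
    system (PID 1 or something else) will show up as a *different* PID.

    Returns the adopting process's PID, or None if the timeout expires.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        current_parent = getppid()
        if current_parent != original_parent_pid:
            # Our original parent has exited and we were adopted.
            return current_parent
        time.sleep(poll_interval)
    return None
```

In a fork-based daemonizer, the parent could record `os.getpid()` before calling `os.fork()`, so the child can pass that value as `original_parent_pid` without racing against the parent's exit. The `getppid` parameter exists only so the loop can be exercised without real processes.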

vladimir-v-diaz commented 9 years ago

I tested this pull request on Ubuntu 14.04 LTS with Python 2.7.6. The unit test failures I was previously encountering no longer occur, but one unit test still fails: ut_nm_timeout.r2py. ut_nm_fastclient.r2py also fails, but that is expected at the moment. Other than this, the pull request looks good.

Here is the test failure when I run python utf.py -a:

Running: ut_nm_timeout.r2py                                 [ FAIL ]
--------------------------------------------------------------------------------
Standard error :
..............................Produced..............................
---
Uncaught exception!
---
Following is a full traceback, and a user traceback.
The user traceback excludes non-user modules. The most recent call is displayed last.

Full debugging traceback:
  "repy.py", line 154, in execute_namespace_until_completion
  "/home/vlad/projects/seattletestbed/nodemanager/RUNNABLE/virtual_namespace.py", line 117, in evaluate
  "/home/vlad/projects/seattletestbed/nodemanager/RUNNABLE/safe.py", line 588, in safe_run
  "dylink.r2py", line 546, in <module>
  "dylink.r2py", line 407, in dylink_dispatch
  "dylink.r2py", line 520, in evaluate
  "/home/vlad/projects/seattletestbed/nodemanager/RUNNABLE/virtual_namespace.py", line 117, in evaluate
  "/home/vlad/projects/seattletestbed/nodemanager/RUNNABLE/safe.py", line 588, in safe_run
  "ut_nm_timeout.r2py", line 42, in <module>
  "nmclient.r2py", line 298, in nmclient_createhandle

User traceback:
  "dylink.r2py", line 546, in <module>
  "dylink.r2py", line 407, in dylink_dispatch
  "dylink.r2py", line 520, in evaluate
  "ut_nm_timeout.r2py", line 42, in <module>
  "nmclient.r2py", line 298, in nmclient_createhandle

Exception (with class '.NMClientException'): RepyArgumentError("Provided destip is not valid! IP: '68d8951cfe038eb10ef5ad8d44e7d22820f878a5'",)
---

..............................Expected..............................
None
--------------------------------------------------------------------------------

I get this different failure message if ut_nm_timeout.r2py is tested with python utf.py -f:

Running: ut_nm_timeout.r2py                                 [ FAIL ]
--------------------------------------------------------------------------------
Standard error :
..............................Produced..............................
---
Uncaught exception!
---
Following is a full traceback, and a user traceback.
The user traceback excludes non-user modules. The most recent call is displayed last.

Full debugging traceback:
  "repy.py", line 154, in execute_namespace_until_completion
  "/home/vlad/projects/seattletestbed/nodemanager/RUNNABLE/virtual_namespace.py", line 117, in evaluate
  "/home/vlad/projects/seattletestbed/nodemanager/RUNNABLE/safe.py", line 588, in safe_run
  "dylink.r2py", line 546, in <module>
  "dylink.r2py", line 407, in dylink_dispatch
  "dylink.r2py", line 520, in evaluate
  "/home/vlad/projects/seattletestbed/nodemanager/RUNNABLE/virtual_namespace.py", line 117, in evaluate
  "/home/vlad/projects/seattletestbed/nodemanager/RUNNABLE/safe.py", line 588, in safe_run
  "ut_nm_timeout.r2py", line 55, in <module>
  "nmclient.r2py", line 414, in nmclient_listaccessiblevessels
  "nmclient.r2py", line 435, in nmclient_getvesseldict
  "nmclient.r2py", line 352, in nmclient_rawsay
  "nmclient.r2py", line 154, in nmclient_rawcommunicate

User traceback:
  "dylink.r2py", line 546, in <module>
  "dylink.r2py", line 407, in dylink_dispatch
  "dylink.r2py", line 520, in evaluate
  "ut_nm_timeout.r2py", line 55, in <module>
  "nmclient.r2py", line 414, in nmclient_listaccessiblevessels
  "nmclient.r2py", line 435, in nmclient_getvesseldict
  "nmclient.r2py", line 352, in nmclient_rawsay
  "nmclient.r2py", line 154, in nmclient_rawcommunicate

Exception (with class '.NMClientException'): recv() timed out!!
---

..............................Expected..............................
None
--------------------------------------------------------------------------------

aaaaalbert commented 9 years ago

Tested on Windows 7 SP1; the Windows functionality (a NOP) isn't affected by the patch.

Also tested on Mac OS X 10.6.8; it runs correctly (as it did before the patch).

Merging!