Closed jean-christophe-manciot closed 8 years ago
Hi,
It's a one time bug? Or your reproduce it?
Sporadic
Catching the loop while trying to stop a node with "debug 3"; you should notice the "Operation canceled (error: 5)":
=> MGT-Switch-11 is stopping
MGT-Switch-11 is stopping
Connection to http://127.0.0.1:8000
Connection to http://127.0.0.1:8000
GET http://127.0.0.1:8000/v1/version {}
GET http://127.0.0.1:8000/v1/version {}
Decoding response from http://AnIIj0n3Bq3DOgGy065HevlfGu4DKKxMqGG64wysNHM21pC8AtEgpAGA92VASTez@127.0.0.1:8000/v1/version response 200
Decoding response from http://AnIIj0n3Bq3DOgGy065HevlfGu4DKKxMqGG64wysNHM21pC8AtEgpAGA92VASTez@127.0.0.1:8000/v1/version response 200
{
"local": true,
"version": "1.4.6"
}
{
"local": true,
"version": "1.4.6"
}
POST http://127.0.0.1:8000/v1/projects/d3f1b32d-60af-4481-b8d7-66a875013624/dynamips/vms/fba15df4-36cf-4476-80eb-ab87c966e858/stop {}
POST http://127.0.0.1:8000/v1/projects/d3f1b32d-60af-4481-b8d7-66a875013624/dynamips/vms/fba15df4-36cf-4476-80eb-ab87c966e858/stop {}
Response error: Operation canceled (error: 5)
Response error: Operation canceled (error: 5)
Connection to http://127.0.0.1:8000 closed
Connection to http://127.0.0.1:8000 closed
error while stopping MGT-Switch-11: Operation canceled
error while stopping MGT-Switch-11: Operation canceled
Server error from http://127.0.0.1:8000: MGT-Switch-11: Operation canceled
When taking a look into the corresponding .gns3:
"properties": {
"auto_delete_disks": false,
"clock_divisor": 8,
"console": 2002,
"disk0": 1,
"disk1": 0,
"exec_area": 64,
"idlemax": 500,
"idlepc": "0x6026b804",
"idlesleep": 30,
"image": "c3725-adventerprisek9-mz.124-15.T10.image",
"image_md5sum": "1f95c093dd0da41cc28057d1c46461e0",
"iomem": 5,
"mac_addr": "c201.4f85.0000",
"mmap": true,
"name": "MGT-Switch-11",
"nvram": 256,
"platform": "c3725",
"ram": 128,
"slot0": "GT96100-FE",
"slot1": "NM-16ESW",
"sparsemem": true,
"startup_config": "configs/i1_startup-config.cfg",
"system_id": "FTX0945W0MY",
"wic0": "WIC-2T"
},
"server_id": 1,
"symbol": ":/symbols/multilayer_switch.svg",
"type": "C3725",
"vm_id": "**fba15df4-36cf-4476-80eb-ab87c966e858**",
"x": -610.5,
"y": 227.5
However, there is no such vm_id in the project-files/qemu folder:
Distributed-Open-Source-Clouds/project-files/qemu#ll
total 76
drwxr-xr-x 2 actionmystique actionmystique 4096 Apr 15 16:46 13bcab20-6269-4293-b1d5-8fb73681dd10/
drwxr-xr-x 2 actionmystique actionmystique 4096 May 2 18:41 36d860b9-ae53-451b-a51e-49c0b4c003c6/
drwxr-xr-x 2 actionmystique actionmystique 4096 May 2 17:59 3a4ff153-4f4f-443f-a550-b2696136e006/
drwxr-xr-x 2 actionmystique actionmystique 4096 May 2 18:41 3f4bf1e4-2771-4de3-a4b6-3f33fee78d1e/
drwxr-xr-x 2 actionmystique actionmystique 4096 May 5 16:42 57f5e633-7977-48a6-bfa9-5fe920dab946/
drwxr-xr-x 2 actionmystique actionmystique 4096 May 5 16:42 6a2e081a-7296-476b-89a6-b75da600726e/
drwxr-xr-x 2 actionmystique actionmystique 4096 May 2 17:59 6ce48e25-b577-4160-ab47-c5d9630c5519/
drwxr-xr-x 2 actionmystique actionmystique 4096 Apr 11 18:45 7a7be774-8246-4600-9311-d3619e6d86a7/
drwxr-xr-x 2 actionmystique actionmystique 4096 Apr 15 16:47 d83ebbf8-84dd-42d9-847e-f91a01aa8a14/
drwxr-xr-x 2 actionmystique actionmystique 4096 May 2 17:59 df00cd8a-d1dc-46a0-af3d-7b886161c8ec/
drwxr-xr-x 2 actionmystique actionmystique 4096 May 2 17:59 dfd532ec-64e7-4385-a22c-6bbb6f9d411f/
I have no idea how exactly this happened; I may have renamed the node in the topology at some point in time, but I have definitely not touched in any way its id. Maybe it has been renamed, its id automatically modified in the folder but not automatically updated in the gns3 topology file.
Actually, it is probably expected that there is no such vm id in the qemu folder, since this node is a Dynamips one. In that case, how is this id used?
Response error: Connection refused (error: 1)
Connection to http://127.0.0.1:8000 closed
error while stopping MGT-Switch-11: Connection refused
Server error from http://127.0.0.1:8000: MGT-Switch-11: Connection refused
A device is still running, please stop it before closing your project
I noticed the connection is closed before stopping the devices. Apparently we are not waiting for the server to stop all devices. @noplay can you check?
This could be resolve by implementing https://github.com/GNS3/gns3-gui/issues/1212
@jean-christophe-manciot
I need the server log, you should find it in ~/.config/GNS3/gns3_server.log
Thanks,
@noplay I think we are not waiting for tasks to complete (stop + save configs) before closing the connection to the Dynamips process.
Something is strange all lines are duplicate in the debug log:
=> MGT-Switch-11 is stopping
MGT-Switch-11 is stopping
Connection to http://127.0.0.1:8000
Connection to http://127.0.0.1:8000
GET http://127.0.0.1:8000/v1/version {}
GET http://127.0.0.1:8000/v1/version {}
Decoding response from http://AnIIj0n3Bq3DOgGy065HevlfGu4DKKxMqGG64wysNHM21pC8AtEgpAGA92VASTez@127.0.0.1:8000/v1/version response 200
Decoding response from http://AnIIj0n3Bq3DOgGy065HevlfGu4DKKxMqGG64wysNHM21pC8AtEgpAGA92VASTez@127.0.0.1:8000/v1/version response 200
{
"local": true,
"version": "1.4.6"
}
{
"local": true,
"version": "1.4.6"
}
POST http://127.0.0.1:8000/v1/projects/d3f1b32d-60af-4481-b8d7-66a875013624/dynamips/vms/fba15df4-36cf-4476-80eb-ab87c966e858/stop {}
POST http://127.0.0.1:8000/v1/projects/d3f1b32d-60af-4481-b8d7-66a875013624/dynamips/vms/fba15df4-36cf-4476-80eb-ab87c966e858/stop {}
Response error: Operation canceled (error: 5)
Response error: Operation canceled (error: 5)
Connection to http://127.0.0.1:8000 closed
Connection to http://127.0.0.1:8000 closed
error while stopping MGT-Switch-11: Operation canceled
error while stopping MGT-Switch-11: Operation canceled
Server error from http://127.0.0.1:8000: MGT-Switch-11: Operation canceled
And the first thing GNS3 is doing is reopening a connection to the server.
yes, there are errors in the log, for instance line 632+:
2016-05-06 20:26:57 INFO hypervisor.py:142 Stopping Dynamips process PID=13002
2016-05-06 20:26:57 INFO hypervisor.py:142 Stopping Dynamips process PID=15797
2016-05-06 20:26:57 INFO project.py:379 Project d3f1b32d-60af-4481-b8d7-66a875013624 with path '/media/actionmystique/SAMSUNG-Ext4-1/Labs/GNS3/CCNP-Cloud/CLDAUT/Distributed-Open-Source-Clouds' closed
2016-05-06 20:26:57 INFO run.py:243 Remove PID file /root/.config/GNS3/gns3_server.pid
2016-05-06 20:26:57 ERROR base_events.py:1134 Task was destroyed but it is pending!
source_traceback: Object created at (most recent call last):
File "/usr/local/bin/gns3server", line 9, in <module>
load_entry_point('gns3-server==1.4.6', 'console_scripts', 'gns3server')()
File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.4.6-py3.5.egg/gns3server/main.py", line 72, in main
run()
File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.4.6-py3.5.egg/gns3server/run.py", line 232, in run
server.run()
File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.4.6-py3.5.egg/gns3server/server.py", line 256, in run
self._loop.run_forever()
File "/usr/lib/python3.5/asyncio/base_events.py", line 331, in run_forever
self._run_once()
File "/usr/lib/python3.5/asyncio/base_events.py", line 1290, in _run_once
handle._run()
File "/usr/lib/python3.5/asyncio/events.py", line 125, in _run
self._callback(*self._args)
File "/usr/lib/python3/dist-packages/aiohttp/web.py", line 50, in connection_made
super().connection_made(transport)
File "/usr/lib/python3/dist-packages/aiohttp/server.py", line 147, in connection_made
self._request_handler = ensure_future(self.start(), loop=self._loop)
task: <Task cancelling coro=<ServerHttpProtocol.start() running at /usr/lib/python3/dist-packages/aiohttp/server.py:234> wait_for=<Future cancelled created at /usr/lib/python3/dist-packages/aiohttp/streams.py:431> created at /usr/lib/python3/dist-packages/aiohttp/server.py:147>
Whole log: gns3_server.log.txt
@jean-christophe-manciot can you (privately) share your project please?
Thanks,
I've pushed a fix. This will have to be tested.
This should be fixed in 1.5.0 beta 1. Please let us know if you still have issues.
[Linux/1.5.1] Still some sporadic infinite loops when trying to stop a dynamips node or close a project. An example of the latter is at the end of the log: "Task was destroyed but it is pending!" gns3_server.log.txt .
Some errors in the dynamips node log:
Aug 11 17:40:59.067 ROM: unhandled syscall 0x0000003e at pc=0x60c00c34 (a1=0x80007de4,a2=0x639511d4,a3=0x000000f8)
Aug 11 17:40:59.067 ROM: unhandled syscall 0x00000047 at pc=0x60c00c34 (a1=0x80007dec,a2=0x639511f8,a3=0x0000011c)
Aug 11 17:41:00.621 ROM: trying to read bootvar 'BOOT'
Aug 11 17:41:00.621 ROM: trying to read bootvar 'CONFIG_FILE'
Aug 11 17:41:00.621 ROM: trying to read bootvar 'BOOTLDR'
Aug 11 17:41:00.621 ROM: trying to read bootvar 'RSHELF'
Aug 11 17:41:00.621 ROM: trying to read bootvar 'DSHELF'
Aug 11 17:41:00.621 ROM: trying to read bootvar 'DSHELFINFO'
Aug 11 17:41:00.621 ROM: trying to read bootvar 'RESET_COUNTER'
Aug 11 17:41:00.621 ROM: trying to read bootvar 'CHRG_LOCRECSN'
Aug 11 17:41:00.622 ROM: trying to read bootvar 'CHRG_ID'
Aug 11 17:41:00.622 ROM: trying to read bootvar 'SLOTCACHE'
Aug 11 17:41:00.622 ROM: trying to read bootvar 'OVERTEMP'
Aug 11 17:41:00.622 ROM: trying to read bootvar 'DIAG'
Aug 11 17:41:00.622 ROM: trying to read bootvar 'DIAMETER_ORIGIN_ID'
Aug 11 17:41:00.622 ROM: trying to read bootvar 'WARM_REBOOT'
Aug 11 17:41:00.776 CPU0: IO_FPGA: read from unknown addr 0x16, pc=0x60916cac (size=2)
Aug 11 17:41:00.776 CPU0: MTS: read access to undefined address 0x3c080022 at pc=0x6091689c (size=1)
Aug 11 17:41:00.776 CPU0: MTS: read access to undefined address 0x3c080023 at pc=0x60916924 (size=1)
Aug 11 17:41:00.776 CPU0: MTS: write access to undefined address 0x3c080023 at pc=0x60916930, value=0x00000000 (size=1)
Aug 11 17:41:00.776 CPU0: MTS: read access to undefined address 0x3c080023 at pc=0x6091693c (size=1)
Aug 11 17:41:00.776 CPU0: MTS: write access to undefined address 0x3c080023 at pc=0x6091694c, value=0x00000080 (size=1)
Aug 11 17:41:00.777 CPU0: MTS: read access to undefined address 0x3c000002 at pc=0x60918734 (size=1)
Aug 11 17:41:00.777 CPU0: MTS: write access to undefined address 0x3c000002 at pc=0x60918740, value=0x00000080 (size=1)
...
Alongside the following strange message in the console:
System returned to ROM by unknown reload cause - suspect boot_data[BOOT_COUNT] 0x0, BOOT_COUNT 0, BOOTDATA 19
System image file is "tftp://255.255.255.255/unknown"
Another type of log when things go south for the Dynamips node when trying to stop it leading to the operation being automatically cancelled
Aug 12 14:44:06.793 CPU0: IO_FPGA: read from unknown addr 0x16, pc=0x60916c6c (size=2)
Aug 12 14:44:06.793 CPU0: IO_FPGA: write to unknown addr 0x16, value=0x1, pc=0x60916c74 (size=2)
Aug 12 14:44:06.829 ROM: trying to read bootvar 'ROM_PERSISTENT_UTC'
Aug 12 14:44:06.957 CPU0: JIT: flushing data structures (compiled pages=1095)
Aug 12 14:44:07.504 ROM: trying to set bootvar 'BSI=0'
Aug 12 14:44:07.504 ROM: trying to read bootvar 'RET_2_RCALTS'
Aug 12 14:44:07.504 ROM: trying to set bootvar 'RET_2_RCALTS='
Aug 12 14:44:07.575 CPU0: JIT: partial JIT flush (count=794)
Aug 12 14:44:07.584 ROM: trying to read bootvar 'RANDOM_NUM'
Aug 12 14:44:16.083 VTTY: Console port is now connected (accept_fd=19,conn_fd=25)
Aug 12 14:44:56.935 CPU0: JIT: flushing data structures (compiled pages=1105)
Aug 12 14:46:43.082 ROM: trying to set bootvar 'RANDOM_NUM=830978794'
Aug 12 18:35:56.880 C3725_STOP: stopping simulation.
Aug 12 18:35:56.880 CPU0: CPU_STATE: Halting CPU (old state=0)...
Aug 12 18:35:56.932 DEVICE: Removal of device WIC-2T(0), fd=-1, host_addr=0x0, flags=0
Then GNS3 loops for a very long time when trying to close the project leading to the type of error described in this previous post with slightly different messages:
2016-08-12 18:43:06 INFO hypervisor.py:150 Stopping Dynamips process PID=31822
2016-08-12 18:45:08 WARNING server.py:110 Server has got signal SIGINT, exiting...
2016-08-12 18:45:08 ERROR route.py:194 Request canceled
2016-08-12 18:45:08 INFO run.py:266 Remove PID file /root/.config/GNS3/gns3_server.pid
2016-08-12 18:45:08 ERROR base_events.py:1148 Task was destroyed but it is pending!
source_traceback: Object created at (most recent call last):
File "/usr/local/bin/gns3server", line 9, in <module>
load_entry_point('gns3-server==1.5.1', 'console_scripts', 'gns3server')()
File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/main.py", line 72, in main
run()
File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/run.py", line 255, in run
server.run()
File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/server.py", line 256, in run
self._loop.run_forever()
File "/usr/lib/python3.5/asyncio/base_events.py", line 345, in run_forever
self._run_once()
File "/usr/lib/python3.5/asyncio/base_events.py", line 1304, in _run_once
handle._run()
File "/usr/lib/python3.5/asyncio/events.py", line 125, in _run
self._callback(*self._args)
File "/usr/lib/python3.5/asyncio/tasks.py", line 307, in _wakeup
self._step()
File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
result = coro.send(None)
File "/usr/local/lib/python3.5/dist-packages/aiohttp/server.py", line 266, in start
yield from self.handle_request(message, payload)
File "/usr/local/lib/python3.5/dist-packages/aiohttp/web.py", line 90, in handle_request
resp = yield from handler(request)
File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/web/route.py", line 239, in vm_concurrency
response = yield from control_schema(request)
File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/web/route.py", line 179, in control_schema
yield from func(request, response)
File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/handlers/api/dynamips_vm_handler.py", line 167, in start
yield from vm.start()
File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/modules/dynamips/nodes/router.py", line 249, in start
monitor_process(self._hypervisor.process, self._termination_callback)
File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/utils/asyncio/__init__.py", line 101, in monitor_process
asyncio.async(_check_process(process, termination_callback))
File "/usr/lib/python3.5/asyncio/tasks.py", line 532, in async
return ensure_future(coro_or_future, loop=loop)
task: <Task pending coro=<_check_process() running at /usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/utils/asyncio/__init__.py:91> wait_for=<Future cancelled created at /usr/lib/python3.5/asyncio/base_events.py:252> created at /usr/lib/python3.5/asyncio/tasks.py:532>
... repeated
Hi, in the GNS3 when added router is view Server error from 127.0.0.1:8000: R1: Lost communication with 127.0.0.1:49788 :[WinError 64] The specified network name is no longer available, Dynamips process running: True ,,, please help me I do not understand of this Problem
Trying to stop all nodes with the "stop all devices" red button:
"MGT-Switch-11" is a 3725 dynamips node. From c3725_i1_log.txt:
A few errors also appeared in the log before stopping it; they may be unrelated to the current issue. c3725_i1_log.txt
There's no issue if I stop the same node from the menu floating above it.