GNS3 / gns3-server

GNS3 server
GNU General Public License v3.0
812 stars 263 forks source link

Server error from http://127.0.0.1:8000: Connection refused when stopping all the nodes #488

Closed jean-christophe-manciot closed 8 years ago

jean-christophe-manciot commented 8 years ago

Trying to stop all nodes with the "stop all devices" red button:

Response error: Connection refused (error: 1)
Connection to http://127.0.0.1:8000 closed
error while stopping MGT-Switch-11: Connection refused
Server error from http://127.0.0.1:8000: MGT-Switch-11: Connection refused
A device is still running, please stop it before closing your project

"MGT-Switch-11" is a 3725 dynamips node. From c3725_i1_log.txt:

Apr 12 19:02:23.394 C3725_STOP: stopping simulation.
Apr 12 19:02:23.394 CPU0: CPU_STATE: Halting CPU (old state=0)...
Apr 12 19:02:23.444 DEVICE: Removal of device WIC-2T(0), fd=-1, host_addr=0x0, flags=0

A few errors also appeared in the log before stopping it; they may be unrelated to the current issue. c3725_i1_log.txt

There's no issue if I stop the same node from the menu floating above it.

julien-duponchelle commented 8 years ago

Hi,

It's a one time bug? Or your reproduce it?

jean-christophe-manciot commented 8 years ago

Sporadic

jean-christophe-manciot commented 8 years ago

Catching the loop while trying to stop a node with "debug 3"; you should notice the "Operation canceled (error: 5)":

=> MGT-Switch-11 is stopping
MGT-Switch-11 is stopping
Connection to http://127.0.0.1:8000
Connection to http://127.0.0.1:8000
GET http://127.0.0.1:8000/v1/version {}
GET http://127.0.0.1:8000/v1/version {}
Decoding response from http://AnIIj0n3Bq3DOgGy065HevlfGu4DKKxMqGG64wysNHM21pC8AtEgpAGA92VASTez@127.0.0.1:8000/v1/version response 200
Decoding response from http://AnIIj0n3Bq3DOgGy065HevlfGu4DKKxMqGG64wysNHM21pC8AtEgpAGA92VASTez@127.0.0.1:8000/v1/version response 200
{
    "local": true,
    "version": "1.4.6"
}
{
    "local": true,
    "version": "1.4.6"
}
POST http://127.0.0.1:8000/v1/projects/d3f1b32d-60af-4481-b8d7-66a875013624/dynamips/vms/fba15df4-36cf-4476-80eb-ab87c966e858/stop {}
POST http://127.0.0.1:8000/v1/projects/d3f1b32d-60af-4481-b8d7-66a875013624/dynamips/vms/fba15df4-36cf-4476-80eb-ab87c966e858/stop {}
Response error: Operation canceled (error: 5)
Response error: Operation canceled (error: 5)
Connection to http://127.0.0.1:8000 closed
Connection to http://127.0.0.1:8000 closed
error while stopping MGT-Switch-11: Operation canceled
error while stopping MGT-Switch-11: Operation canceled
Server error from http://127.0.0.1:8000: MGT-Switch-11: Operation canceled

When taking a look into the corresponding .gns3:

                "properties": {
                    "auto_delete_disks": false,
                    "clock_divisor": 8,
                    "console": 2002,
                    "disk0": 1,
                    "disk1": 0,
                    "exec_area": 64,
                    "idlemax": 500,
                    "idlepc": "0x6026b804",
                    "idlesleep": 30,
                    "image": "c3725-adventerprisek9-mz.124-15.T10.image",
                    "image_md5sum": "1f95c093dd0da41cc28057d1c46461e0",
                    "iomem": 5,
                    "mac_addr": "c201.4f85.0000",
                    "mmap": true,
                    "name": "MGT-Switch-11",
                    "nvram": 256,
                    "platform": "c3725",
                    "ram": 128,
                    "slot0": "GT96100-FE",
                    "slot1": "NM-16ESW",
                    "sparsemem": true,
                    "startup_config": "configs/i1_startup-config.cfg",
                    "system_id": "FTX0945W0MY",
                    "wic0": "WIC-2T"
                },
                "server_id": 1,
                "symbol": ":/symbols/multilayer_switch.svg",
                "type": "C3725",
                "vm_id": "**fba15df4-36cf-4476-80eb-ab87c966e858**",
                "x": -610.5,
                "y": 227.5

However, there is no such vm_id in the project-files/qemu folder:

Distributed-Open-Source-Clouds/project-files/qemu#ll
total 76
drwxr-xr-x  2 actionmystique actionmystique  4096 Apr 15 16:46 13bcab20-6269-4293-b1d5-8fb73681dd10/
drwxr-xr-x  2 actionmystique actionmystique  4096 May  2 18:41 36d860b9-ae53-451b-a51e-49c0b4c003c6/
drwxr-xr-x  2 actionmystique actionmystique  4096 May  2 17:59 3a4ff153-4f4f-443f-a550-b2696136e006/
drwxr-xr-x  2 actionmystique actionmystique  4096 May  2 18:41 3f4bf1e4-2771-4de3-a4b6-3f33fee78d1e/
drwxr-xr-x  2 actionmystique actionmystique  4096 May  5 16:42 57f5e633-7977-48a6-bfa9-5fe920dab946/
drwxr-xr-x  2 actionmystique actionmystique  4096 May  5 16:42 6a2e081a-7296-476b-89a6-b75da600726e/
drwxr-xr-x  2 actionmystique actionmystique  4096 May  2 17:59 6ce48e25-b577-4160-ab47-c5d9630c5519/
drwxr-xr-x  2 actionmystique actionmystique  4096 Apr 11 18:45 7a7be774-8246-4600-9311-d3619e6d86a7/
drwxr-xr-x  2 actionmystique actionmystique  4096 Apr 15 16:47 d83ebbf8-84dd-42d9-847e-f91a01aa8a14/
drwxr-xr-x  2 actionmystique actionmystique  4096 May  2 17:59 df00cd8a-d1dc-46a0-af3d-7b886161c8ec/
drwxr-xr-x  2 actionmystique actionmystique  4096 May  2 17:59 dfd532ec-64e7-4385-a22c-6bbb6f9d411f/

I have no idea how exactly this happened; I may have renamed the node in the topology at some point in time, but I have definitely not touched in any way its id. Maybe it has been renamed, its id automatically modified in the folder but not automatically updated in the gns3 topology file.

jean-christophe-manciot commented 8 years ago

Actually, it is probably expected that there is no such vm id in the qemu folder, since this node is a Dynamips one. In that case, how is this id used?

grossmj commented 8 years ago
Response error: Connection refused (error: 1)
Connection to http://127.0.0.1:8000 closed
error while stopping MGT-Switch-11: Connection refused
Server error from http://127.0.0.1:8000: MGT-Switch-11: Connection refused
A device is still running, please stop it before closing your project

I noticed the connection is closed before stopping the devices. Apparently we are not waiting for the server to stop all devices. @noplay can you check?

This could be resolve by implementing https://github.com/GNS3/gns3-gui/issues/1212

@jean-christophe-manciot

I need the server log, you should find it in ~/.config/GNS3/gns3_server.log

Thanks,

grossmj commented 8 years ago

@noplay I think we are not waiting for tasks to complete (stop + save configs) before closing the connection to the Dynamips process.

julien-duponchelle commented 8 years ago

Something is strange all lines are duplicate in the debug log:

=> MGT-Switch-11 is stopping
MGT-Switch-11 is stopping
Connection to http://127.0.0.1:8000
Connection to http://127.0.0.1:8000
GET http://127.0.0.1:8000/v1/version {}
GET http://127.0.0.1:8000/v1/version {}
Decoding response from http://AnIIj0n3Bq3DOgGy065HevlfGu4DKKxMqGG64wysNHM21pC8AtEgpAGA92VASTez@127.0.0.1:8000/v1/version response 200
Decoding response from http://AnIIj0n3Bq3DOgGy065HevlfGu4DKKxMqGG64wysNHM21pC8AtEgpAGA92VASTez@127.0.0.1:8000/v1/version response 200
{
    "local": true,
    "version": "1.4.6"
}
{
    "local": true,
    "version": "1.4.6"
}
POST http://127.0.0.1:8000/v1/projects/d3f1b32d-60af-4481-b8d7-66a875013624/dynamips/vms/fba15df4-36cf-4476-80eb-ab87c966e858/stop {}
POST http://127.0.0.1:8000/v1/projects/d3f1b32d-60af-4481-b8d7-66a875013624/dynamips/vms/fba15df4-36cf-4476-80eb-ab87c966e858/stop {}
Response error: Operation canceled (error: 5)
Response error: Operation canceled (error: 5)
Connection to http://127.0.0.1:8000 closed
Connection to http://127.0.0.1:8000 closed
error while stopping MGT-Switch-11: Operation canceled
error while stopping MGT-Switch-11: Operation canceled
Server error from http://127.0.0.1:8000: MGT-Switch-11: Operation canceled

And the first thing GNS3 is doing is reopening a connection to the server.

jean-christophe-manciot commented 8 years ago

yes, there are errors in the log, for instance line 632+:

2016-05-06 20:26:57 INFO hypervisor.py:142 Stopping Dynamips process PID=13002
2016-05-06 20:26:57 INFO hypervisor.py:142 Stopping Dynamips process PID=15797
2016-05-06 20:26:57 INFO project.py:379 Project d3f1b32d-60af-4481-b8d7-66a875013624 with path '/media/actionmystique/SAMSUNG-Ext4-1/Labs/GNS3/CCNP-Cloud/CLDAUT/Distributed-Open-Source-Clouds' closed
2016-05-06 20:26:57 INFO run.py:243 Remove PID file /root/.config/GNS3/gns3_server.pid
2016-05-06 20:26:57 ERROR base_events.py:1134 Task was destroyed but it is pending!
source_traceback: Object created at (most recent call last):
  File "/usr/local/bin/gns3server", line 9, in <module>
    load_entry_point('gns3-server==1.4.6', 'console_scripts', 'gns3server')()
  File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.4.6-py3.5.egg/gns3server/main.py", line 72, in main
    run()
  File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.4.6-py3.5.egg/gns3server/run.py", line 232, in run
    server.run()
  File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.4.6-py3.5.egg/gns3server/server.py", line 256, in run
    self._loop.run_forever()
  File "/usr/lib/python3.5/asyncio/base_events.py", line 331, in run_forever
    self._run_once()
  File "/usr/lib/python3.5/asyncio/base_events.py", line 1290, in _run_once
    handle._run()
  File "/usr/lib/python3.5/asyncio/events.py", line 125, in _run
    self._callback(*self._args)
  File "/usr/lib/python3/dist-packages/aiohttp/web.py", line 50, in connection_made
    super().connection_made(transport)
  File "/usr/lib/python3/dist-packages/aiohttp/server.py", line 147, in connection_made
    self._request_handler = ensure_future(self.start(), loop=self._loop)
task: <Task cancelling coro=<ServerHttpProtocol.start() running at /usr/lib/python3/dist-packages/aiohttp/server.py:234> wait_for=<Future cancelled created at /usr/lib/python3/dist-packages/aiohttp/streams.py:431> created at /usr/lib/python3/dist-packages/aiohttp/server.py:147>

Whole log: gns3_server.log.txt

grossmj commented 8 years ago

@jean-christophe-manciot can you (privately) share your project please?

Thanks,

grossmj commented 8 years ago

I've pushed a fix. This will have to be tested.

grossmj commented 8 years ago

This should be fixed in 1.5.0 beta 1. Please let us know if you still have issues.

jean-christophe-manciot commented 8 years ago

[Linux/1.5.1] Still some sporadic infinite loops when trying to stop a dynamips node or close a project. An example of the latter is at the end of the log: "Task was destroyed but it is pending!" gns3_server.log.txt .

jean-christophe-manciot commented 8 years ago

Some errors in the dynamips node log:

Aug 11 17:40:59.067 ROM: unhandled syscall 0x0000003e at pc=0x60c00c34 (a1=0x80007de4,a2=0x639511d4,a3=0x000000f8)
Aug 11 17:40:59.067 ROM: unhandled syscall 0x00000047 at pc=0x60c00c34 (a1=0x80007dec,a2=0x639511f8,a3=0x0000011c)
Aug 11 17:41:00.621 ROM: trying to read bootvar 'BOOT'
Aug 11 17:41:00.621 ROM: trying to read bootvar 'CONFIG_FILE'
Aug 11 17:41:00.621 ROM: trying to read bootvar 'BOOTLDR'
Aug 11 17:41:00.621 ROM: trying to read bootvar 'RSHELF'
Aug 11 17:41:00.621 ROM: trying to read bootvar 'DSHELF'
Aug 11 17:41:00.621 ROM: trying to read bootvar 'DSHELFINFO'
Aug 11 17:41:00.621 ROM: trying to read bootvar 'RESET_COUNTER'
Aug 11 17:41:00.621 ROM: trying to read bootvar 'CHRG_LOCRECSN'
Aug 11 17:41:00.622 ROM: trying to read bootvar 'CHRG_ID'
Aug 11 17:41:00.622 ROM: trying to read bootvar 'SLOTCACHE'
Aug 11 17:41:00.622 ROM: trying to read bootvar 'OVERTEMP'
Aug 11 17:41:00.622 ROM: trying to read bootvar 'DIAG'
Aug 11 17:41:00.622 ROM: trying to read bootvar 'DIAMETER_ORIGIN_ID'
Aug 11 17:41:00.622 ROM: trying to read bootvar 'WARM_REBOOT'
Aug 11 17:41:00.776 CPU0: IO_FPGA: read from unknown addr 0x16, pc=0x60916cac (size=2)
Aug 11 17:41:00.776 CPU0: MTS: read  access to undefined address 0x3c080022 at pc=0x6091689c (size=1)
Aug 11 17:41:00.776 CPU0: MTS: read  access to undefined address 0x3c080023 at pc=0x60916924 (size=1)
Aug 11 17:41:00.776 CPU0: MTS: write access to undefined address 0x3c080023 at pc=0x60916930, value=0x00000000 (size=1)
Aug 11 17:41:00.776 CPU0: MTS: read  access to undefined address 0x3c080023 at pc=0x6091693c (size=1)
Aug 11 17:41:00.776 CPU0: MTS: write access to undefined address 0x3c080023 at pc=0x6091694c, value=0x00000080 (size=1)
Aug 11 17:41:00.777 CPU0: MTS: read  access to undefined address 0x3c000002 at pc=0x60918734 (size=1)
Aug 11 17:41:00.777 CPU0: MTS: write access to undefined address 0x3c000002 at pc=0x60918740, value=0x00000080 (size=1)
...
jean-christophe-manciot commented 8 years ago

Alongside the following strange message in the console:

System returned to ROM by unknown reload cause - suspect boot_data[BOOT_COUNT] 0x0, BOOT_COUNT 0, BOOTDATA 19
System image file is "tftp://255.255.255.255/unknown"
jean-christophe-manciot commented 8 years ago

Another type of log when things go south for the Dynamips node when trying to stop it leading to the operation being automatically cancelled

Aug 12 14:44:06.793 CPU0: IO_FPGA: read from unknown addr 0x16, pc=0x60916c6c (size=2)
Aug 12 14:44:06.793 CPU0: IO_FPGA: write to unknown addr 0x16, value=0x1, pc=0x60916c74 (size=2)
Aug 12 14:44:06.829 ROM: trying to read bootvar 'ROM_PERSISTENT_UTC'
Aug 12 14:44:06.957 CPU0: JIT: flushing data structures (compiled pages=1095)
Aug 12 14:44:07.504 ROM: trying to set bootvar 'BSI=0'
Aug 12 14:44:07.504 ROM: trying to read bootvar 'RET_2_RCALTS'
Aug 12 14:44:07.504 ROM: trying to set bootvar 'RET_2_RCALTS='
Aug 12 14:44:07.575 CPU0: JIT: partial JIT flush (count=794)
Aug 12 14:44:07.584 ROM: trying to read bootvar 'RANDOM_NUM'
Aug 12 14:44:16.083 VTTY: Console port is now connected (accept_fd=19,conn_fd=25)
Aug 12 14:44:56.935 CPU0: JIT: flushing data structures (compiled pages=1105)
Aug 12 14:46:43.082 ROM: trying to set bootvar 'RANDOM_NUM=830978794'
Aug 12 18:35:56.880 C3725_STOP: stopping simulation.
Aug 12 18:35:56.880 CPU0: CPU_STATE: Halting CPU (old state=0)...
Aug 12 18:35:56.932 DEVICE: Removal of device WIC-2T(0), fd=-1, host_addr=0x0, flags=0

Then GNS3 loops for a very long time when trying to close the project leading to the type of error described in this previous post with slightly different messages:

2016-08-12 18:43:06 INFO hypervisor.py:150 Stopping Dynamips process PID=31822
2016-08-12 18:45:08 WARNING server.py:110 Server has got signal SIGINT, exiting...
2016-08-12 18:45:08 ERROR route.py:194 Request canceled
2016-08-12 18:45:08 INFO run.py:266 Remove PID file /root/.config/GNS3/gns3_server.pid
2016-08-12 18:45:08 ERROR base_events.py:1148 Task was destroyed but it is pending!
source_traceback: Object created at (most recent call last):
  File "/usr/local/bin/gns3server", line 9, in <module>
    load_entry_point('gns3-server==1.5.1', 'console_scripts', 'gns3server')()
  File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/main.py", line 72, in main
    run()
  File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/run.py", line 255, in run
    server.run()
  File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/server.py", line 256, in run
    self._loop.run_forever()
  File "/usr/lib/python3.5/asyncio/base_events.py", line 345, in run_forever
    self._run_once()
  File "/usr/lib/python3.5/asyncio/base_events.py", line 1304, in _run_once
    handle._run()
  File "/usr/lib/python3.5/asyncio/events.py", line 125, in _run
    self._callback(*self._args)
  File "/usr/lib/python3.5/asyncio/tasks.py", line 307, in _wakeup
    self._step()
  File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
    result = coro.send(None)
  File "/usr/local/lib/python3.5/dist-packages/aiohttp/server.py", line 266, in start
    yield from self.handle_request(message, payload)
  File "/usr/local/lib/python3.5/dist-packages/aiohttp/web.py", line 90, in handle_request
    resp = yield from handler(request)
  File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/web/route.py", line 239, in vm_concurrency
    response = yield from control_schema(request)
  File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/web/route.py", line 179, in control_schema
    yield from func(request, response)
  File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/handlers/api/dynamips_vm_handler.py", line 167, in start
    yield from vm.start()
  File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/modules/dynamips/nodes/router.py", line 249, in start
    monitor_process(self._hypervisor.process, self._termination_callback)
  File "/usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/utils/asyncio/__init__.py", line 101, in monitor_process
    asyncio.async(_check_process(process, termination_callback))
  File "/usr/lib/python3.5/asyncio/tasks.py", line 532, in async
    return ensure_future(coro_or_future, loop=loop)
task: <Task pending coro=<_check_process() running at /usr/local/lib/python3.5/dist-packages/gns3_server-1.5.1-py3.5.egg/gns3server/utils/asyncio/__init__.py:91> wait_for=<Future cancelled created at /usr/lib/python3.5/asyncio/base_events.py:252> created at /usr/lib/python3.5/asyncio/tasks.py:532>
... repeated
reemaalwail commented 6 years ago

Hi, in the GNS3 when added router is view Server error from 127.0.0.1:8000: R1: Lost communication with 127.0.0.1:49788 :[WinError 64] The specified network name is no longer available, Dynamips process running: True ,,, please help me I do not understand of this Problem