Azure / iotedge


Unable to set network as host, azure-iot-edge GA #63

Closed mortslhawmit closed 6 years ago

mortslhawmit commented 6 years ago

I'm working on an IoT Edge project on an RPi3. One of my edge modules needs access to BlueZ on the RPi3. In order to access it, I need to set the IoT Edge environment with --net as host. (This is very similar to this case.)

My Docker container creation option is: "createOptions": "{ \"HostConfig\":{ \"NetworkMode\":\"host\" } }"

I've upgraded Edge to the GA release, but I still get an error:

"Docker.DotNet.DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"failed to add interface veth193ef62 to sandbox: error setting interface \"veth193ef62\" IP to 172.18.0.5/16: cannot program address 172.18.0.5/16 in sandbox interface because it conflicts with existing route"

According to the old issue, it should have been fixed in GA, so I'm not sure what I could have done wrong.

myagley commented 6 years ago

Hello. Thanks for the bug report. You should be able to get this working with the following createOptions:

"createOptions": {
  "NetworkingConfig": {
    "EndpointsConfig": {
      "host": {}
    }
  },
  "HostConfig": {
    "NetworkMode": "host"
  }
}

Please let me know if this doesn't work for you.

mortslhawmit commented 6 years ago

I have now updated my "createOptions", but I still get a similar error:

Docker.DotNet.DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"failed to add interface vethc600061 to sandbox: error setting interface "vethc600061" IP to 172.18.0.5/16: cannot program address 172.18.0.5/16 in sandbox interface because it conflicts with existing route"

"createOptions": {
    "NetworkingConfig": {
        "EndpointsConfig": {
            "host": {}
        }
    },
    "HostConfig": {
        "NetworkMode": "host"
    }
}

However, I noticed that the Edge Agent is apparently still running the preview image. Do I need to change something else to get the updated agent? My edgeHub runs the GA image, at least.

azureiotedge-agent:1.0-preview"

The systemModules section of my deployment.template.json is:

"systemModules": {
  "edgeAgent": {
    "type": "docker",
    "settings": {
      "image": "mcr.microsoft.com/azureiotedge-agent:1.0",
      "createOptions": ""
    }
  },

myagley commented 6 years ago

Ah yes, you will need to update all components to the GA bits in order for host network to work correctly. This includes the Edge Agent. We have a blog post that covers the migration steps here: https://azure.microsoft.com/en-us/blog/iot-edge-ga-migration/

mortslhawmit commented 6 years ago

I've missed quite a bit, then. I've been trying to get the upgrade through, but I'm struggling to get my images to work. Both of my images throw exceptions on "moduleClient.OpenAsync();", but not the same one: one module gets a timeout exception, while the other gets "Module not found, StatusCode: 404". Both worked fine before the update.

I saw this post about the timeout problem, so I tried swapping the transport protocol between MQTT and AMQP, but I still have the same issue. I'll keep searching for the cause, but any pointers would be much appreciated.

mortslhawmit commented 6 years ago

Hello again. I've now successfully updated to GA and got my modules up and running again. I no longer receive the original error, but I now get a timeout exception as soon as I call "OpenAsync" on my "ModuleClient". If I remove the createOptions mentioned above, the module runs just fine.

The error I receive now:

Unhandled Exception: System.AggregateException: One or more errors occurred. (Operation timeout expired.) ---> System.TimeoutException: Operation timeout expired.
   at Microsoft.Azure.Devices.Client.InternalClient.<>c.<ApplyTimeout>b__62_2(Task t)
   at System.Threading.Tasks.ContinuationTaskFromTask.InnerInvoke()
   at System.Threading.Tasks.Task.<>c.<.cctor>b__278_1(Object obj)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot)
--- End of stack trace from previous location where exception was thrown ---
   at FirstModule.Main.Program.InitAsync() in /app/Main/Program.cs:line 58
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at FirstModule.Main.Program.Main() in /app/Main/Program.cs:line 39

This is my code for starting up the ModuleClient connection (lines 53-58 in Program.cs):

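// Use AMQP over TCP only (no WebSocket fallback).
var amqpSetting = new AmqpTransportSettings(TransportType.Amqp_Tcp_Only);
ITransportSettings[] settings = { amqpSetting };
// Build the client from the environment variables injected by the IoT Edge runtime.
var moduleClient = await ModuleClient.CreateFromEnvironmentAsync(settings);
// This is the call that times out.
await moduleClient.OpenAsync();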

Friendly ping @myagley

mortslhawmit commented 6 years ago

Am I missing something? Please give some feedback. Friendly ping @myagley

myagley commented 6 years ago

Can you please post the complete createOptions for the Edge Hub and your module?

mortslhawmit commented 6 years ago

I had not understood that I needed to update the edgeHub as well. Once I set its createOptions to the same value as my custom module's, it works fine.

Both createOptions are now: "createOptions": "{\"NetworkingConfig\": {\"EndpointsConfig\": {\"host\": {}}}, \"HostConfig\": {\"NetworkMode\": \"host\" }}"

Sorry for that! And thanks for the help.
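
The working form above passes createOptions as a JSON string with escaped quotes rather than as a nested object. A minimal Python sketch of producing that string, assuming you generate the manifest from a script (the helper `module_settings` and the image name are hypothetical, not part of any IoT Edge tooling):

```python
import json

# Docker create options for host networking, as posted in this thread.
create_options = {
    "NetworkingConfig": {"EndpointsConfig": {"host": {}}},
    "HostConfig": {"NetworkMode": "host"},
}


def module_settings(image: str, options: dict) -> dict:
    """Return a module "settings" object with createOptions stringified."""
    # json.dumps turns the nested object into the single string the manifest holds.
    return {"image": image, "createOptions": json.dumps(options)}


# "example.azurecr.io/mymodule:1.0" is a hypothetical image name.
settings = module_settings("example.azurecr.io/mymodule:1.0", create_options)

# Serializing the settings escapes the inner quotes (\"HostConfig\" and so on).
print(json.dumps(settings, indent=2))
```

Building the options as a normal dictionary and stringifying once at the end avoids hand-escaping quotes, which is where the earlier attempts in this thread differ from the final working value.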

mortslhawmit commented 6 years ago

Just a follow-up question: is it required for all my modules to have the same createOptions as edgeHub for them to be able to connect?

(I accidentally closed the issue with my earlier reply, sorry.)

myagley commented 6 years ago

I think you have two options here. You either put the Edge Hub into the host network namespace (by updating the createOptions like you did), or you bind the Edge Hub's ports to the host network and have the module connect on the public interface.

By default, the Edge Hub is set up to bind ports 8883, 5671, and 443 through a default set of createOptions. You can of course override this if you'd like.
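
For the second option, here is a rough Python sketch of createOptions that publish those three ports on the host. It follows Docker's `HostConfig.PortBindings` shape; whether it matches the exact default in your deployment is an assumption, so compare it against your own manifest. The function name `edge_hub_create_options` is made up for illustration.

```python
import json


def edge_hub_create_options(ports=(5671, 8883, 443)) -> str:
    """Build stringified createOptions that publish the given TCP ports on the host."""
    # Docker expects "<port>/tcp" keys mapping to a list of host bindings.
    bindings = {f"{port}/tcp": [{"HostPort": str(port)}] for port in ports}
    return json.dumps({"HostConfig": {"PortBindings": bindings}})


print(edge_hub_create_options())
```

With the Edge Hub published this way, a module in the host network namespace would reach it through the host's own address rather than through the Docker bridge network.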

myagley commented 6 years ago

Closing this due to inactivity. Please feel free to reopen if you have questions/concerns.

LiangJy123 commented 5 years ago

Hi @myagley, is this different between Windows and Linux? I tried the createOptions on Windows and got this error in the edgeAgent log:

<3> 2019-09-02 20:37:55.924 +08:00 [ERR] - Edge agent plan execution failed.
System.AggregateException: One or more errors occurred. (Error calling start module GetDataModule: Could not start module GetDataModule
        caused by: Could not start module GetDataModule
        caused by: network host not found) (Error calling start module edgeHub: Could not start module edgeHub
        caused by: Could not start module edgeHub
        caused by: network host not found) ---> Microsoft.Azure.Devices.Edge.Agent.Edgelet.EdgeletCommunicationException: Error calling start module GetDataModule: Could not start module GetDataModule
        caused by: Could not start module GetDataModule
        caused by: network host not found
   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.Version_2019_01_30.ModuleManagementHttpClient.HandleException(Exception exception, String operation) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Edgelet\version_2019_01_30\ModuleManagementHttpClient.cs:line 194

The main problem is "network host not found".

So I checked my Docker network drivers: [image]

I think the same command run on Linux looks like this (picture copied from the Internet): [image]

My question is: how can I set the network to host on Windows?

Thank you.

efog commented 5 years ago

Hi,

I ran into the exact same issue:

2019-09-24 15:03:16.234 +00:00 [ERR] - Edge agent plan execution failed.
System.AggregateException: One or more errors occurred. (Error calling start module cameraedgemodule148: Could not start module cameraedgemodule148
        caused by: Could not start module cameraedgemodule148
        caused by: network host not found) ---> Microsoft.Azure.Devices.Edge.Agent.Edgelet.EdgeletCommunicationException: Error calling start module cameraedgemodule148: Could not start module cameraedgemodule148
        caused by: Could not start module cameraedgemodule148
        caused by: network host not found
   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.Version_2019_01_30.ModuleManagementHttpClient.HandleException(Exception exception, String operation) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Edgelet\version_2019_01_30\ModuleManagementHttpClient.cs:line 194
   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.Versioning.ModuleManagementHttpClientVersioned.Execute[T](Func`1 func, String operation) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Edgelet\versioning\ModuleManagementHttpClientVersioned.cs:line 124
   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.Version_2019_01_30.ModuleManagementHttpClient.StartModuleAsync(String name) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Edgelet\version_2019_01_30\ModuleManagementHttpClient.cs:line 149
   at Microsoft.Azure.Devices.Edge.Agent.Core.LoggingCommandFactory.LoggingCommand.ExecuteAsync(CancellationToken token) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Core\LoggingCommandFactory.cs:line 60
   at Microsoft.Azure.Devices.Edge.Agent.Core.Commands.GroupCommand.ExecuteAsync(CancellationToken token) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Core\commands\GroupCommand.cs:line 35
   at Microsoft.Azure.Devices.Edge.Agent.Core.LoggingCommandFactory.LoggingCommand.ExecuteAsync(CancellationToken token) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Core\LoggingCommandFactory.cs:line 60
   at Microsoft.Azure.Devices.Edge.Agent.Core.PlanRunners.OrderedRetryPlanRunner.ExecuteAsync(Int64 deploymentId, Plan plan, CancellationToken token) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Core\planrunners\OrdererdRetryPlanRunner.cs:line 87
   --- End of inner exception stack trace ---
   at Microsoft.Azure.Devices.Edge.Agent.Core.PlanRunners.OrderedRetryPlanRunner.<>c.b__7_0(List`1 f) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Core\planrunners\OrdererdRetryPlanRunner.cs:line 115
   at Microsoft.Azure.Devices.Edge.Agent.Core.PlanRunners.OrderedRetryPlanRunner.ExecuteAsync(Int64 deploymentId, Plan plan, CancellationToken token) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Core\planrunners\OrdererdRetryPlanRunner.cs:line 116
   at Microsoft.Azure.Devices.Edge.Agent.Core.Agent.ReconcileAsync(CancellationToken token) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Core\Agent.cs:line 134
---> (Inner Exception #0) Microsoft.Azure.Devices.Edge.Agent.Edgelet.EdgeletCommunicationException- Message:Error calling start module cameraedgemodule148: Could not start module cameraedgemodule148
        caused by: Could not start module cameraedgemodule148
        caused by: network host not found, StatusCode:404, at:
   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.Version_2019_01_30.ModuleManagementHttpClient.HandleException(Exception exception, String operation) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Edgelet\version_2019_01_30\ModuleManagementHttpClient.cs:line 194
   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.Versioning.ModuleManagementHttpClientVersioned.Execute[T](Func`1 func, String operation) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Edgelet\versioning\ModuleManagementHttpClientVersioned.cs:line 124
   at Microsoft.Azure.Devices.Edge.Agent.Edgelet.Version_2019_01_30.ModuleManagementHttpClient.StartModuleAsync(String name) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Edgelet\version_2019_01_30\ModuleManagementHttpClient.cs:line 149
   at Microsoft.Azure.Devices.Edge.Agent.Core.LoggingCommandFactory.LoggingCommand.ExecuteAsync(CancellationToken token) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Core\LoggingCommandFactory.cs:line 60
   at Microsoft.Azure.Devices.Edge.Agent.Core.Commands.GroupCommand.ExecuteAsync(CancellationToken token) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Core\commands\GroupCommand.cs:line 35
   at Microsoft.Azure.Devices.Edge.Agent.Core.LoggingCommandFactory.LoggingCommand.ExecuteAsync(CancellationToken token) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Core\LoggingCommandFactory.cs:line 60
   at Microsoft.Azure.Devices.Edge.Agent.Core.PlanRunners.OrderedRetryPlanRunner.ExecuteAsync(Int64 deploymentId, Plan plan, CancellationToken token) in C:\agent\_work\4\s\edge-agent\src\Microsoft.Azure.Devices.Edge.Agent.Core\planrunners\OrdererdRetryPlanRunner.cs:line 87<---

Create options have been added here:

"cameraedgemodule148": {
  "settings": {
    "image": "edgendaiotedgeacr.azurecr.io/edgendaazureiotcameraedgemodule:latest",
    "createOptions": "{\"NetworkingConfig\":{\"EndpointsConfig\":{\"host\":{}}},\"HostConfig\":{\"NetworkMode\":\"host\"}}"
  },
  "type": "docker",
  "version": "1.0",
  "status": "running",
  "restartPolicy": "always"
},

The Edge is running Windows. Is this a supported configuration?

keshava-hm commented 5 years ago

Does this solution work on Windows with Windows containers?

Is there a way to connect to the host from a module?

Thanks

davidzwa commented 4 years ago

Just a small tip: Linux containers on Windows apparently run within a Linux VM. This VM has an internal network interface that is not equivalent to the host network interface. I think it's called something like gateway.docker.internal.

So any Linux-on-Linux combination should work, as well as Windows-on-Windows (not sure about this one). I will come back once I've tried L-on-L.

kgalic commented 4 years ago

After setting the create options for the network as described, with the latest IoT Edge 1.0.9.3 and DeviceSDK 1.26.0, I get the following exception:

Unhandled exception. System.AggregateException: One or more errors occurred. (Transient network error occurred, please retry.) ---> Microsoft.Azure.Devices.Client.Exceptions.IotHubCommunicationException: Transient network error occurred, please retry.

---> System.Net.Internals.SocketExceptionFactory+ExtendedSocketException (00000005, 0xFFFDFFFF): Name or service not known
   at System.Net.Dns.InternalGetHostByName(String hostName)
   at System.Net.Dns.ResolveCallback(Object context)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw(Exception source)
   at System.Net.Dns.HostResolutionEndHelper(IAsyncResult asyncResult)
   at System.Net.Dns.EndGetHostAddresses(IAsyncResult asyncResult)
   at System.Net.Dns.<>c.<GetHostAddressesAsync>b__25_1(IAsyncResult asyncResult)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)

This happens when the module tries to connect via ModuleClient. When I remove the configuration, it works. What is the conflict here?

"createOptions": {
  "NetworkingConfig": {
    "EndpointsConfig": {
      "host": {}
    }
  },
  "HostConfig": {
    "NetworkMode": "host"
  }
}
amunozh commented 2 years ago

I'm facing the same problem. I have a set of modules on the IoT Edge bridge network, and one of the modules runs with host networking because I need access to the Bluetooth stack. But I'm getting the following error:

ERROR 2022/05/20 15:39:48 PM Subscribe for input failed.  Not enabling feature
INFO 2022/05/20 15:39:48 PM Callback completed with error ConnectionFailedError(None) caused by gaierror(-3, 'Temporary failure in name resolution')
INFO 2022/05/20 15:39:48 PM ["azure.iot.device.common.transport_exceptions.ConnectionFailedError: ConnectionFailedError(None) caused by gaierror(-3, 'Temporary failure in name resolution')\n"]
WARNING 2022/05/20 15:39:48 PM Unexpected error while creating client ConnectionFailedError('Could not connect to IoTHub') caused by ConnectionFailedError(None)
goncalog commented 8 months ago

Did you find a solution for your issue, @amunozh?
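
For anyone landing here with the same name-resolution failure under host networking ("Name or service not known" in .NET, gaierror(-3) in Python), a small check like the one below may help narrow it down. It is only a diagnostic sketch. It assumes the SDK connects to the host named in the `IOTEDGE_GATEWAYHOSTNAME` environment variable, and that one plausible cause is that this name is not resolvable from the host's network namespace. The `can_resolve` helper is hypothetical.

```python
import os
import socket


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname resolves; resolver is injectable for testing."""
    try:
        resolver(hostname, None)
        return True
    except socket.gaierror:
        # Same error family as "Temporary failure in name resolution".
        return False


if __name__ == "__main__":
    # Set by the IoT Edge runtime inside module containers.
    gateway = os.environ.get("IOTEDGE_GATEWAYHOSTNAME")
    if not gateway:
        print("IOTEDGE_GATEWAYHOSTNAME is not set; run this inside a module")
    elif can_resolve(gateway):
        print(f"{gateway} resolves from this network namespace")
    else:
        print(f"{gateway} does not resolve from this network namespace")
```

If the name does not resolve when the module runs with host networking but does resolve on the default bridge network, that points at name resolution rather than at the SDK or the transport protocol.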