Azure / Industrial-IoT

Azure Industrial IoT Platform
MIT License

Unable to add application per discovery #2275

Closed kappa-lhirsch closed 3 months ago

kappa-lhirsch commented 4 months ago

Describe the bug: I use the Industrial IoT web API and the publisher module to communicate with my OPC server.

Both are running version 2.9.9.

When I try to add an application via the API (POST https://{{OPC-SERVICEURL}}/registry/v2/applications) with the discoveryUrl parameter, the publisher module shows the following logs:

[24-07-01 08:34:43.1406] info: Azure.IIoT.OpcUa.Publisher.Discovery.ProgressPublisher[0]
      (null): Discovery operation started.
[24-07-01 08:34:43.2211] info: Azure.IIoT.OpcUa.Publisher.Discovery.ProgressPublisher[0]
      (null): Searching 1 discovery urls for endpoints...
[24-07-01 08:34:43.2352] info: Azure.IIoT.OpcUa.Publisher.Discovery.ProgressPublisher[0]
      (null): Trying to find endpoints on opc.tcp://192.168.11.12:4840/...
[24-07-01 08:34:44.8880] info: Azure.IIoT.OpcUa.Publisher.Discovery.ProgressPublisher[0]
      (null): Found 1 endpoints on opc.tcp://192.168.11.12:4840/.
[24-07-01 08:34:44.8948] info: Azure.IIoT.OpcUa.Publisher.Discovery.ProgressPublisher[0]
      (null): Found total of 1 servers ...
[24-07-01 08:34:44.9115] info: Azure.IIoT.OpcUa.Publisher.Discovery.NetworkDiscovery[0]
      Uploading 1 results...
[24-07-01 08:34:45.0740] info: Azure.IIoT.OpcUa.Publisher.Discovery.NetworkDiscovery[0]
      1 results uploaded.
[24-07-01 08:34:45.0756] info: Azure.IIoT.OpcUa.Publisher.Discovery.ProgressPublisher[0]
      (null): Discovery operation completed.

Although this output looks normal to me, the application does not show up when I list the applications in the registry.

To Reproduce Steps to reproduce the behavior:

  1. Install IIoT publisher module version 2.9.9
  2. Install IIoT web api version 2.9.9
  3. Send the POST request described above

Expected behavior: The application shows up in the registry.

marcschier commented 4 months ago

I called the POST API using the service CLI tool (apps add -u <url>) and it worked just fine. Have you looked at the server log to see if there are any errors reported?

kappa-lhirsch commented 4 months ago

Is there any documentation about the usage of the cli?

marcschier commented 4 months ago

The CLI, which we also publish as a container image (mcr.microsoft.com/iot/industrial-cli:latest), has built-in help. You can start it in console mode by passing the "console" keyword as the first argument. The code is in https://github.com/Azure/Industrial-IoT/blob/main/src/Azure.IIoT.OpcUa.Publisher.Service.Sdk/cli/Program.cs. In essence, it is just a CLI on top of the REST API; it is not meant for production use, but rather for testing and exercising the API.

kappa-lhirsch commented 4 months ago

I noticed some interesting behavior with the API. First of all, I'm still not able to discover servers via the API. However, when I make a PUT call to the endpoints route, https://{{OPC-SERVICEURL}}/registry/v2/endpoints, with the same discoveryUrl parameter, the endpoint and the application it belongs to are both discovered.

My second question: how can I connect to an endpoint, given that there is no connect route available anymore in version 2.9.9?
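For anyone comparing the two calls, here is a minimal sketch that builds both requests side by side without sending them. The routes, HTTP methods, and the discoveryUrl field come from this thread; the base URL is a hypothetical placeholder, and authentication is left out even though a real deployment will most likely require it.

```python
import json
import urllib.request


def build_registry_request(base_url: str, route: str, method: str,
                           discovery_url: str) -> urllib.request.Request:
    """Build (but do not send) a JSON request against the registry API."""
    body = json.dumps({"discoveryUrl": discovery_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/registry/v2/{route}",
        data=body,
        method=method,
        headers={"Content-Type": "application/json"},
    )


# Hypothetical service URL, standing in for {{OPC-SERVICEURL}}.
BASE_URL = "https://opc-service.example.com"
DISCOVERY_URL = "opc.tcp://192.168.11.12:4840"

# "Register Server": the call that did not add the application here.
register_server = build_registry_request(
    BASE_URL, "applications", "POST", DISCOVERY_URL)

# "Register single endpoint and associated server": the call that worked.
register_endpoint = build_registry_request(
    BASE_URL, "endpoints", "PUT", DISCOVERY_URL)

for request in (register_server, register_endpoint):
    print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

Either object could then be passed to urllib.request.urlopen once the required authorization header has been added.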

marcschier commented 4 months ago

Quick comment: Connect happens now automatically when you send your first request. No need to activate/deactivate in 2.9.

marcschier commented 4 months ago

Yes there are essentially 3 API calls that support discovery:

  1. Register Server
  2. Discover Servers
  3. Register single endpoint and associated server

(application == server)

You can use the endpoint ID (e.g. from List endpoints) in the OPC UA service calls.

kappa-lhirsch commented 4 months ago

In my case only the 3rd API call works:

  3. Register single endpoint and associated server

The other 2 won't add the server to the registry.

marcschier commented 4 months ago

Can you provide an example payload you used?

kappa-lhirsch commented 3 months ago

I just used the discoveryUrl parameter, since the other parameters are optional:

{
  "discoveryUrl": "opc.tcp://xxx.xxx.xxx.xxx:4840"
}

marcschier commented 3 months ago

... For the first API call, is that right? That would be correct. Are there any errors displayed in the logs of the publisher or returned as result?

kappa-lhirsch commented 3 months ago

Yes, for the first API call. No, I don't get any errors in the publisher logs; the output is as shown in the initial post. I don't get an error response back either, although I have never gotten a useful error back from the API.

kappa-lhirsch commented 3 months ago

> Quick comment: Connect happens now automatically when you send your first request. No need to activate/deactivate in 2.9.

I tried the new version and, to my disappointment, the publisher module establishes a new connection on every request, not only the first one. As a result, requests take an absurd amount of time. Browsing a node, for example, takes about 400 ms in V2.9.2. In V2.9.9 it takes over 5 s every time!

Because of the long response time, the request often simply times out after about 5.6 seconds. I have not seen any option to change this timeout.

Also, sometimes the request fails with the error 'Response 410 : Request was canceled by the client or after timeout.' or with 'Response 500 : BadRequestInterrupted'.

It seems pretty random to me which error occurs when. Simply by repeating the same request over and over, I get these two errors and sometimes a valid response.

This alone forces me to still use 2.9.2, because my application is unusable with V2.9.9.

marcschier commented 3 months ago

To cache the session between requests, set the --cl value to a number of seconds, e.g. 5; the next request within 5 seconds will then reuse the previous client/connection. The default is 0, i.e. no lingering for connections. This was also the case in 2.9.2, so maybe there is another issue at play here.

Browse and HistoryRead automatically linger, using the continuation point as a ref-count token, so doing browse, next, next, next will reuse the same connection. A problem with this, which occurred when the continuation token was the same from call to call (kepserver), was recently fixed.
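To illustrate what a linger value changes, here is a toy model, not the publisher's actual code. It assumes each request restarts the linger window (a sliding expiry), which the thread does not confirm; the class and field names are made up for the example.

```python
import time
from typing import Callable, Dict


class LingeringSessionCache:
    """Toy model of connection lingering; not the publisher's implementation."""

    def __init__(self, linger_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.linger_seconds = linger_seconds
        self.clock = clock
        self.connects = 0  # counts simulated (expensive) connects
        self._expiry: Dict[str, float] = {}  # endpoint -> time the session closes

    def request(self, endpoint: str) -> bool:
        """Handle one request; return True if an open session was reused."""
        now = self.clock()
        expiry = self._expiry.get(endpoint)
        reused = expiry is not None and now < expiry
        if not reused:
            self.connects += 1
        if self.linger_seconds > 0:
            # Assumption: every request restarts the linger window.
            self._expiry[endpoint] = now + self.linger_seconds
        else:
            # Linger of 0: the session closes as soon as the request is done.
            self._expiry.pop(endpoint, None)
        return reused


class FakeClock:
    """Manually advanced clock so the demo does not need to sleep."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


clock = FakeClock()
no_linger = LingeringSessionCache(0, clock)
with_linger = LingeringSessionCache(5, clock)

# Requests at t = 0 s, 1 s, 2 s and 10 s against the same endpoint.
for t in (0.0, 1.0, 2.0, 10.0):
    clock.now = t
    no_linger.request("endpoint-1")
    with_linger.request("endpoint-1")

print(no_linger.connects, with_linger.connects)  # prints "4 2"
```

With no linger every request pays for a connect, while with 5 seconds only the first request and the one after the long pause do.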

kappa-lhirsch commented 3 months ago

Interesting, because in 2.9.2 I had to make a request to the /connect endpoint, and after that the /browse endpoint used the same connection.

marcschier commented 3 months ago

Yes, that was possible. But essentially all connections are ref-counted: --cl sets a timer that decreases the ref count, while connect/disconnect takes a ref count without any timeout, so the connection stays open forever. Handling connect/disconnect with ref counting correctly is also difficult over a network (e.g. the first connect succeeds but its response is lost, so you retry, and now the ref count has gone up by 2). The best and simplest option is --cl and just letting the publisher manage the connection.
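The retry problem can be shown with a small simulation. This is only an illustrative model of ref counting, with hypothetical class and function names, not how the publisher is implemented.

```python
class RefCountedConnection:
    """Toy ref-counted connection: open while the count is above zero."""

    def __init__(self) -> None:
        self.ref_count = 0

    @property
    def is_open(self) -> bool:
        return self.ref_count > 0

    def connect(self) -> None:
        self.ref_count += 1

    def disconnect(self) -> None:
        if self.ref_count > 0:
            self.ref_count -= 1


def connect_over_network(conn: RefCountedConnection, response_lost: bool) -> None:
    """Simulate a remote connect whose response may get lost on the way back."""
    conn.connect()  # the server-side effect always happens
    if response_lost:
        raise TimeoutError("connect succeeded, but the response never arrived")


conn = RefCountedConnection()
try:
    connect_over_network(conn, response_lost=True)
except TimeoutError:
    # The client cannot tell that the first call worked, so it retries.
    connect_over_network(conn, response_lost=False)

conn.disconnect()  # the client believes it is releasing its only reference
print(conn.ref_count, conn.is_open)  # prints "1 True"
```

The client made one logical connect and one disconnect, yet the connection is still held open, and nothing ever releases it.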

marcschier commented 3 months ago

I cannot reproduce the original issue. It looks like the issue is at the web API level and that the onboarding information is stuck on the event hub endpoint (possibly it has a lot of messages to process before it receives the onboarding message with the application in it). Use the other APIs, as they are more reliable (they use a request/response model, not the streaming discovery via messages).
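Since the message-based onboarding is asynchronous, a client can poll the registry for a while before concluding that the application never arrived. The sketch below is a generic polling helper with hypothetical names; it takes the listing function as a parameter because the exact shape of the list applications response is not shown in this thread.

```python
import time
from typing import Callable, Iterable


def wait_for_application(list_discovery_urls: Callable[[], Iterable[str]],
                         discovery_url: str,
                         attempts: int = 5,
                         delay_seconds: float = 2.0,
                         sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until an application with the given discovery URL is listed.

    list_discovery_urls is a caller-supplied function (hypothetical here)
    that returns the discovery URLs of all applications currently in the
    registry.
    """
    wanted = discovery_url.rstrip("/")
    for attempt in range(attempts):
        if any(url.rstrip("/") == wanted for url in list_discovery_urls()):
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # give the onboarding message time to arrive
    return False


# Demo with canned registry snapshots: the application appears on the third poll.
snapshots = iter([[], [], ["opc.tcp://192.168.11.12:4840/"]])
found = wait_for_application(lambda: next(snapshots),
                             "opc.tcp://192.168.11.12:4840",
                             sleep=lambda _seconds: None)
print(found)  # prints "True"
```

If the helper returns False after a generous number of attempts, that would be consistent with the onboarding message being stuck, and one of the request/response calls is the better choice.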