microsoft / ApplicationInsights-ServiceFabric

ApplicationInsights SDK for ServiceFabric projects
MIT License
63 stars 26 forks source link

ExclusiveProcess tracing of remote calls does not pass operationId #65

Closed mkomorowsk closed 6 years ago

mkomorowsk commented 6 years ago

Hi we have a setup with several services that all live in their separate processes (ExclusiveProcess) and they communicate using remoting V1. I have used the setup described but can't get the operationId passed to the service being called. I log the operationId from the caller and then from callee - it comes as some hashed value. Are we going to be happy with this, or exclusive processes are not supported?

I do: traces | project timestamp, message, operation_Id | top 40 by timestamp desc

and see:

2018-05-16T10:17:08.860 Stateful1 is running: HelloWorld: Param=2018-05-16 12:17:08 - 122   P1TwdX7esgk=    
2018-05-16T10:17:08.854 Stateless1 is running: 122  2018-05-16 12:17:08 - 122   
2018-05-16T10:17:03.845 Stateful1 is running: HelloWorld: Param=2018-05-16 12:17:03 - 121   T3iCzQ4FoAs=    
2018-05-16T10:17:03.840 Stateless1 is running: 121  2018-05-16 12:17:03 - 121   
2018-05-16T10:16:58.825 Stateful1 is running: HelloWorld: Param=2018-05-16 12:16:58 - 120   sFtv6Fsv1No=    
2018-05-16T10:16:58.821 Stateless1 is running: 120  2018-05-16 12:16:58 - 120   
2018-05-16T10:16:53.812 Stateful1 is running: HelloWorld: Param=2018-05-16 12:16:53 - 119   uNxyYsZeLuU=    
2018-05-16T10:16:53.808 Stateless1 is running: 119  2018-05-16 12:16:53 - 119   

every 'hashy' value of operationId should be the one from the Stateless1 that makes the call. Thanks Maciej

yantang-msft commented 6 years ago

@mkomorowsk What's the AI SDK version? It should be at least 2.4.0. Can you also try running this query? dependencies | where cloud_RoleName == "Stateless1" | distinct id You should see the id in the format like here, they will let us know if the correlation is working correctly on the caller side.

mkomorowsk commented 6 years ago

Thanks. I have AI in version 2.6.1. I see that no data gets collected in 'dependencies' but I could not find a reason for it (spent one day). Can you point me to most often seen reasons? Could lack of that data be a reason for traceability not to work?

yantang-msft commented 6 years ago

If you set it up correctly, it should track the dependency telemetry. If no dependency telemetry is sent, most likely there is something wrong with the setup and thus the correlation is not working.

So you added the CorrelatingServiceProxyFactory in to the Stateless1 service as described here? https://github.com/Microsoft/ApplicationInsights-ServiceFabric#remoting-v1

Can you also check the Interface of the Stateful1 and make sure it's not using Remoting V2 attribute like [assembly: FabricTransportServiceRemotingProvider(RemotingListener = RemotingListener.V2Listener, RemotingClient = RemotingClient.V2Client)]? Latest SF tools will by default create project using Remoting V2.

mkomorowsk commented 6 years ago

I did all as described for V1 and then too for V2 of remoting. I don't get dependecies tracked in any of the solutions. The attribute for remoting v2 I had to add by hand, coz it was not added by VS. Still trying to find a reason for dependencies not to work. I created a new AppInsights resource but no change.

yantang-msft commented 6 years ago

@mkomorowsk I created a simple example for you to reference using Remoting V2. https://github.com/yantang-msft/service-fabric-application-insights-example/tree/master

You can find more examples if the above one is not enough https://github.com/yantang-msft/service-fabric-dotnet-getting-started/tree/TestRemoting

mkomorowsk commented 6 years ago

Thanks a lot for help with this. Taken your sample and it nicely logs the dependencies but no traces, it is me maybe? :) Happy that my local cluster CAN log dependencies. Anyway I followed the sample by adding the attribute to the interface assembly as well (was missing) and now keep getting in traces things like:

AI (Internal): Operation to stop does not match the current operation. Details: Telemetry Id '|OA82iJwiVMs=.2b4c6274_' does not match current Activity '|OA82iJwiVMs=.2b4c6274_1.'

Remoting calls go through, no exceptions. I think I missed the point by creating my own TelemetryClient and logging stuff with it (at that time I had traces and no dependencies). Now, I tried to replace that code with using the built-in client logging like this:

var config = TelemetryConfiguration.Active;
config.TelemetryChannel.Send(new EventTelemetry("call is in STATEFULL 2 !!!!!!!!!!!!!!"));

but as I said I have no entries in AI in traces like that. I don't want to have the ServiceEventSource as we're going to add more services automatically to SFC and want no static names of event sources.

How do I log an entry into 'traces' so I could see the operation_id coming from the caller (we want to use our own GUID there) and showing up in the method that was called over remoting? Thanks again!

yantang-msft commented 6 years ago

What kind of traces are you expecting? Your own traces or metrics telemetry that are generated automatically? For you own traces, you will still need a telemetryClient and call TrackTrace(). And the traces will be correlated automatically with the operation_id coming from the caller. It would be tricky if you want to use you own GUID, and I'm not sure if it's a good idea to use your own GUID since it will break the correlation.

For the AI (Internal) exception, it has been reported at here https://github.com/Microsoft/ApplicationInsights-ServiceFabric/issues/64, we are investigating and making progress.

karolz-ms commented 6 years ago

@mkomorowsk Do not create the telemetry object imperatively and do not use the channel directly like in the example you gave above. This bypasses pretty much the whole AppInsights pipeline, which usually has telemetry processors that you want, enriching telemetry with context, doing sampling/filtering etc. Use TelemetryClient, like Yanming said

mkomorowsk commented 6 years ago

Thanks, both of you for help. I am back to tracing with TelemetryClient then.

Maybe some explanation: The thing with GUID as an operation Id is that we're not using any web API facade to let the requests enter the SFC services. We have a service that picks the messages from Service Bus and does processing of them with help of several other SFC services we have. If there's any failure in the processing we would like to know where exactly the flow stucked. That is why we want to use the MessageId (GUID) as Operation Id to be tracked. Right now, with all setup as you helped me, I get the tracing working but without the first service that initiates the whole call chain. The one that picks up from SB should too log info "got Msg, starting to process" and that is why I thought it could too set the operation_Id at that point. I am guessing that you made some magic to set the initial operation_id for the web API calls going into the SFC, can you share what kind of? :) I would then do the same to get the operation_id inserted, so it flows the way through the remoting calls. I call this to get it collected:

requests | union traces | where operation_Id == "6mtKWXibZc0=" | order by timestamp asc

much more preferred to get the results by calling something like:

traces |where operation_Id == "<Service Bus Message ID Guid>" | order by timestamp asc

Thanks!

yantang-msft commented 6 years ago

@mkomorowsk So the "magic" is when webApi receives a request, it will start an Activity. the Activity is where the operationId is initialized, and service remoting set the telemetry's correlation info (operation_id, parentId and its own telemetry id) from Activity.Current.
You can check out the Activity UserGuide and some other docs for more info.

In your case, you want to create an Activity manually, initialize the operation id by the SB Message ID Guid and start the Activity. Then everything will be correlated by the SB Message ID afterwards.

karolz-ms commented 6 years ago

Also take a look at https://docs.microsoft.com/en-us/azure/application-insights/application-insights-custom-operations-tracking