alibaba / clusterdata

cluster data collected from production clusters in Alibaba for cluster management research

Microservice traces: issues with "services"? #158

Open MaxLanLiu opened 1 year ago

MaxLanLiu commented 1 year ago

Hi, I have a few questions about identifying services in the traces. Thanks in advance; any help is sincerely appreciated!

1) When you anonymized the data, did you consistently map each service name to the same string? Is the example in the picture below correct?

[Image: eg]

2) Service vs. online service? In the README's Discussion, the answer to "How to identify a specific service in the traces?" is:

"In practice, microservice architecture adopts proxy modules like Nginx to forward users' requests to an entering MS. The entering MS, e.g., interface of entering MS A in Fig.2, containing multiple interfaces, and each interface provides a specific service. As such, each interface of MS called by proxy MS could be labelled as an online service. As revealed in the paper, we classify the call graphs of each online service into different classes when analyzing the dynamics of microservices. It is worth noting that, some call graphs could even contain two proxy modules, and the name of the second proxy MS are usually recorded as '(?)' or ''."

I am super confused by how you use different terms.

2a) Services: What is the difference between an "online service" and a "service"?

2b) Interface: By interface, do you mean a function/method, or an entire service?

2c) Proxy: Is a proxy always a microservice? Can we tell which um/dm in the data are the proxies? Is the set of first services the proxies? How do we know whether a "(?)" in the um/dm is the result of missing data or of the service being the second proxy?
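One way to explore the README's suggestion on the data is to group traces by the interface of the entering MS. The sketch below is only an illustration under stated assumptions: it assumes each row has `traceid`, `rpcid`, `um`, `dm`, and `interface` fields, and it assumes the call with rpcid "0" is the one made by the proxy, which is exactly what questions 2c and 3a ask the maintainers to confirm. The sample rows and names such as `ms_a` and `if_1` are hypothetical.

```python
from collections import defaultdict


def group_traces_by_entry(rows, root_rpcid="0"):
    """Group trace IDs by the (dm, interface) pair of their root call.

    Assumption (not confirmed by the maintainers): the call whose rpcid
    equals root_rpcid is the call from the proxy to the entering MS, so
    its (dm, interface) pair is a candidate "online service" label.
    """
    groups = defaultdict(set)
    for row in rows:
        if row["rpcid"] == root_rpcid:
            groups[(row["dm"], row["interface"])].add(row["traceid"])
    return groups


# Hypothetical rows, not taken from the real traces.
rows = [
    {"traceid": "t1", "rpcid": "0", "um": "", "dm": "ms_a", "interface": "if_1"},
    {"traceid": "t1", "rpcid": "0.1", "um": "ms_a", "dm": "ms_b", "interface": "if_9"},
    {"traceid": "t2", "rpcid": "0", "um": "", "dm": "ms_a", "interface": "if_1"},
    {"traceid": "t3", "rpcid": "0", "um": "", "dm": "ms_a", "interface": "if_2"},
]

groups = group_traces_by_entry(rows)
for (dm, interface), trace_ids in sorted(groups.items()):
    print(dm, interface, sorted(trace_ids))
```

If the second-proxy case from the README applies, the root call may not be the right anchor, so the `root_rpcid` parameter is left configurable.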

3) I found some consistent patterns in the calls that have rpcid as "0" or rpcid "0.1".

3a) Almost always, the um of a call with rpcid "0" is an empty string. What does this mean? Is the dm of that call always a proxy, or does it count as an entering service or online service?

[Image: table of calls with rpcid "0"]

3b) Usually, but not always, calls with rpcid "0.1" have um as "(?)". Do you have any intuition for why this is the case?

[Image: table of calls with rpcid "0.1"]
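For anyone who wants to reproduce the patterns in 3a and 3b, a minimal sketch along these lines counts the um values seen for each rpcid of interest. It assumes the call graph files are CSV with `rpcid` and `um` columns; the inline sample data is hypothetical and stands in for a real file.

```python
import csv
import io
from collections import Counter, defaultdict


def um_counts_by_rpcid(rows, rpcids=("0", "0.1")):
    """Count how often each um value appears for the selected rpcids."""
    counts = defaultdict(Counter)
    for row in rows:
        if row["rpcid"] in rpcids:
            counts[row["rpcid"]][row["um"]] += 1
    return counts


# Hypothetical sample standing in for one call graph file.
sample = io.StringIO(
    "traceid,rpcid,um,dm\n"
    "t1,0,,ms_a\n"
    "t1,0.1,(?),ms_b\n"
    "t2,0,,ms_a\n"
    "t2,0.1,ms_a,ms_c\n"
)

counts = um_counts_by_rpcid(csv.DictReader(sample))
for rpcid, counter in sorted(counts.items()):
    print(rpcid, dict(counter))
```

To run it on a real file, replace `sample` with an opened file handle, for example `open(path, newline="")`.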

Thank you so much!

kapilagrawal95 commented 1 year ago

I had the same doubt as (1).

Furthermore, in (3) you show two tables where the um_id is either NULL or "(?)". There are many such instances in the call_graphs files where the um name has one of these values.

@MaxLanLiu: Were you able to find answers to your questions?