Subject: Question about 'UNAVAILABLE' and 'UNKNOWN' values in MSCallGraph data
Description:
Hello,
I hope this message finds you well. Firstly, I would like to express my sincere gratitude for open-sourcing the 'cluster-trace-microservices-v2022' dataset. It has been instrumental in my research efforts.
I have been exploring the MSCallGraph data, and I noticed that the 'um' column contains numerous 'UNAVAILABLE' and 'UNKNOWN' entries. Specifically, I have a few questions regarding this, illustrated with the following dataset snippet:

| timestamp | traceid | service | rpc_id | rpctype | um | uminstanceid | interface | dm | dminstanceid | rt |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 666816332 | T_20087312301 | S_38528029 | 0 | http | USER | USER | 7H3KUQVcpx | MS_3386 | MS_3386_POD_1 | 3.0 |
| 666816333 | T_20087312301 | S_38528029 | 0.1 | rpc | UNAVAILABLE | UNAVAILABLE | 7H3KUQVcpx | MS_70648 | MS_70648_POD_170 | 1.0 |
| 666816333 | T_20087312301 | S_38528029 | 0.1.1 | http | UNKNOWN | UNAVAILABLE | 7H3KUQVcpx | MS_38945 | MS_38945_POD_107 | 0.0 |
| 666816333 | T_20087312301 | S_38528029 | 0.1.1.1 | http | MS_38945 | MS_38945_POD_107 | MaC4sWx7iJ | MS_30732 | MS_30732_POD_33 | 0.0 |

My questions are:
What is the reason for the presence of 'UNAVAILABLE' and 'UNKNOWN' in the 'um' column? Specifically, how are records like the one above generated where 'rpc_id' seems complete, but 'um' is 'UNKNOWN' or 'UNAVAILABLE'?
Is it possible to infer 'um' based on 'dm'? For example, can we repair the data by substituting 'um' based on the corresponding 'dm'? The repaired data might look like this:

| timestamp | traceid | service | rpc_id | rpctype | um | uminstanceid | interface | dm | dminstanceid | rt |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 666816332 | T_20087312301 | S_38528029 | 0 | http | USER | USER | 7H3KUQVcpx | MS_3386 | MS_3386_POD_1 | 3.0 |
| 666816333 | T_20087312301 | S_38528029 | 0.1 | rpc | MS_3386 | MS_3386_POD_1 | 7H3KUQVcpx | MS_70648 | MS_70648_POD_170 | 1.0 |
| 666816333 | T_20087312301 | S_38528029 | 0.1.1 | http | MS_70648 | MS_70648_POD_170 | 7H3KUQVcpx | MS_38945 | MS_38945_POD_107 | 0.0 |
| 666816333 | T_20087312301 | S_38528029 | 0.1.1.1 | http | MS_38945 | MS_38945_POD_107 | MaC4sWx7iJ | MS_30732 | MS_30732_POD_33 | 0.0 |

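To make the proposed repair concrete, here is a minimal sketch of what I have in mind. It rests on assumptions I cannot confirm from the dataset documentation: that the parent of a call is found by dropping the last dotted segment of 'rpc_id', and that the pair ('traceid', 'rpc_id') identifies a single call. The names `PLACEHOLDERS`, `parent_rpc_id`, and `repair_um` are hypothetical helpers, not part of any dataset tooling.

```python
# Values treated as "missing" in the um / uminstanceid columns.
PLACEHOLDERS = {"UNKNOWN", "UNAVAILABLE"}


def parent_rpc_id(rpc_id):
    """Drop the last dotted segment: '0.1.1' -> '0.1'; a root id like '0' -> None."""
    head, sep, _ = rpc_id.rpartition(".")
    return head if sep else None


def repair_um(rows):
    """Fill placeholder um / uminstanceid from the parent call's dm / dminstanceid.

    rows: list of dicts keyed by the MSCallGraph column names.
    Returns new dicts; a row whose parent record is absent is left unchanged.
    """
    # ASSUMPTION: (traceid, rpc_id) is unique per call. If it is not,
    # later rows silently overwrite earlier ones in this index.
    callee = {
        (r["traceid"], r["rpc_id"]): (r["dm"], r["dminstanceid"]) for r in rows
    }
    repaired = []
    for r in rows:
        r = dict(r)  # copy so the input rows are not mutated
        parent = parent_rpc_id(r["rpc_id"])
        info = callee.get((r["traceid"], parent)) if parent is not None else None
        if info is not None:
            if r["um"] in PLACEHOLDERS:
                r["um"] = info[0]
            if r["uminstanceid"] in PLACEHOLDERS:
                r["uminstanceid"] = info[1]
        repaired.append(r)
    return repaired
```

Applied to the four records of trace T_20087312301, this substitution yields the 'um' and 'uminstanceid' values shown in the repaired table. Whether that substitution is actually valid is exactly what I am asking.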
How is the entry point MS determined in a trace? If a trace contains a record with 'rpc_id=0', is the entry point considered to be 'USER'? When there is no 'rpc_id=0' record and 'rpc_id' starts from '0.1' with 'um' being 'UNAVAILABLE' or 'UNKNOWN', as in the two examples below, is the 'rpc_id=0.1' record considered the entry point MS?

| timestamp | traceid | service | rpc_id | rpctype | um | uminstanceid | interface | dm | dminstanceid | rt |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 666803251 | T_14180572390 | S_38528029 | 0.1 | rpc | UNAVAILABLE | UNAVAILABLE | 7H3KUQVcpx | MS_70648 | MS_70648_POD_162 | 20.0 |
| 666803254 | T_14180572390 | S_38528029 | 0.1.1 | http | UNKNOWN | UNAVAILABLE | 7H3KUQVcpx | MS_38945 | MS_38945_POD_97 | 3.0 |
| 666803254 | T_14180572390 | S_38528029 | 0.1.1.1 | http | MS_38945 | MS_38945_POD_97 | ihrQqyYug4 | MS_30732 | MS_30732_POD_0 | 3.0 |

| timestamp | traceid | service | rpc_id | rpctype | um | uminstanceid | interface | dm | dminstanceid | rt |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 666796398 | T_1445790167 | S_156482560 | 0.1 | http | UNKNOWN | UNAVAILABLE | bHiJXTZtx1 | MS_40912 | MS_40912_POD_290 | 2.0 |
| 666796398 | T_1445790167 | S_156482560 | 0.1.1 | mc | MS_40912 | MS_40912_POD_290 | ZaYxnA3U_f | MS_58269 | MS_58269_POD_25 | 1.0 |
| 666796399 | T_1445790167 | S_156482560 | 0.1.2 | mc | MS_40912 | MS_40912_POD_290 | ZaYxnA3U_f | MS_58269 | MS_58269_POD_5 | 0.0 |

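For reference, the rule I am tentatively using to pick an entry point is sketched below. It is only my guess, not documented behavior: take the record with the fewest dotted segments in 'rpc_id' (ties broken by 'timestamp') and treat its 'dm' as the entry point MS. The function name `find_entry_point` and the returned keys are hypothetical.

```python
def rpc_depth(rpc_id):
    """Nesting depth of a call: '0' -> 0, '0.1' -> 1, '0.1.1' -> 2."""
    return rpc_id.count(".")


def find_entry_point(rows):
    """Pick the shallowest call of ONE trace and report it as the entry point.

    rows: list of dicts for a single traceid, keyed by MSCallGraph column names.
    ASSUMPTION: the shallowest rpc_id marks the entry; this is the rule being
    asked about, not a confirmed property of the dataset.
    """
    if not rows:
        return None
    first = min(rows, key=lambda r: (rpc_depth(r["rpc_id"]), r["timestamp"]))
    return {
        "rpc_id": first["rpc_id"],
        "caller": first["um"],      # 'USER' when an rpc_id=0 record exists
        "entry_ms": first["dm"],    # candidate entry point microservice
        "has_root_record": first["rpc_id"] == "0",
    }
```

Under this rule, trace T_14180572390 has no 'rpc_id=0' record, so the 'rpc_id=0.1' record is selected and MS_70648 is reported as the entry point MS, with 'UNAVAILABLE' as its caller.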
I would appreciate any insights you can provide on these matters. Thank you for your time and for maintaining this valuable dataset.
Best regards