yuanlii / SAND-project

Working documents of SAND project.

Weekly document #19

Open yuanlii opened 4 years ago

yuanlii commented 4 years ago

Oct. 14th, 2019: Started to expand the set of hosts running traceroutes to include latency hosts. Account for Open Science Grid?

GitHub page: https://github.com/orgs/sand-ci/people (ask Shawn to add me as a member)

TODO: Add GitHub repo (*ask Shawn) | Improve on statistics | ML front: wait till fix

yuanlii commented 4 years ago

Oct.27,2019: Finished plotting the heatmap and the distribution plot for owd_mean. Check the code => also include the rationale for using a log scale

owd suggestions: as far as possible, look at how owd changes for a given src-dest pair over time (several weeks, several months)

about traces: a good trace is complete and has no self-loop
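The "good trace" rule above could be checked with something like the sketch below. The data layout is an assumption, not something the notes specify: `hops` is the ordered list of hop addresses, `None` marks a hop that did not answer, "complete" means every hop answered and the last hop is the destination, and "self-loop" means the same address appears more than once. `is_good_trace` is a hypothetical name.

```python
from typing import Optional, Sequence


def is_good_trace(hops: Sequence[Optional[str]], dest: str) -> bool:
    """Return True when a trace is complete and has no self-loop.

    Assumed layout (hypothetical): `hops` is the ordered hop list and
    None stands for a hop that did not respond.
    """
    if not hops or any(hop is None for hop in hops):
        return False  # at least one hop is missing: incomplete trace
    if hops[-1] != dest:
        return False  # the trace never reached the destination
    # A repeated address is treated as a self-loop.
    return len(set(hops)) == len(hops)
```

Example: `is_good_trace(["10.0.0.1", "10.0.0.2", "10.0.0.9"], "10.0.0.9")` passes, while a trace with a `None` hop or a repeated address is rejected.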

yuanlii commented 4 years ago

Nov.3rd: Questions:

feedback: BML - CREN => run conclusions

labeled: *drifted: negative values (labeled these) => wrong measurement | synchronization or other problem => problems

Possible directions: 1. 10 sites: plot positive and negative values together

  1. find the ones that go up vs. go down (packets going opposite ways)
  2. BIP => geometric distance vs. one-way delay (3 times must be error, 1.25 times)

-> write the function (find the negatives): a labeling function, well explained
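A minimal sketch of such a labeling function, assuming each measurement is a dict carrying an `owd_mean` value. The rationale is the one noted above: a one-way delay cannot physically be negative, so a negative value points to a wrong measurement, a synchronization problem, or similar. The function name `label_owd` and the label strings are hypothetical.

```python
from typing import Dict, Iterable, List


def label_owd(records: Iterable[Dict], field: str = "owd_mean") -> List[Dict]:
    """Return copies of the records with an added "label" key.

    Labels (hypothetical names):
      "drifted" - negative delay, which a real packet cannot have
      "missing" - the record has no value for `field`
      "ok"      - zero or positive delay
    """
    labeled = []
    for rec in records:
        value = rec.get(field)
        if value is None:
            label = "missing"
        elif value < 0:
            label = "drifted"
        else:
            label = "ok"
        # Copy the record so the caller's data is left untouched.
        labeled.append({**rec, "label": label})
    return labeled
```

Filtering on `label == "drifted"` then gives the negative measurements to plot next to the positive ones.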

yuanlii commented 4 years ago

Nov.11,2019

“simulated annealing”: a common physics algorithm (measure which part of the route is drifted, then find a time delta to correct the src_host and dest_host) <- on the physicists’ side
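This is not simulated annealing, but a much simpler pairwise estimate of the time delta can be sketched when measurements exist in both directions between two hosts. It assumes the path delay is the same both ways, so that the forward reading is `delay + offset` and the reverse reading is `delay - offset`. The name `estimate_offset` is hypothetical.

```python
from statistics import median
from typing import Sequence, Tuple


def estimate_offset(
    fwd_owd: Sequence[float], rev_owd: Sequence[float]
) -> Tuple[float, float]:
    """Estimate (clock offset, true one-way delay) for a host pair.

    fwd_owd: delays measured src -> dest; rev_owd: dest -> src.
    Assumption: symmetric path delay, so
        fwd = delay + offset   and   rev = delay - offset.
    Medians are used so a few outliers do not dominate.
    """
    fwd = median(fwd_owd)
    rev = median(rev_owd)
    offset = (fwd - rev) / 2.0  # dest clock minus src clock
    delay = (fwd + rev) / 2.0   # delay with the offset removed
    return offset, delay
```

If the forward delay reads 15 ms and the reverse 5 ms, this attributes 5 ms to clock offset and 10 ms to the actual delay. When the route really is asymmetric the estimate is biased, which is one reason a global fit over many pairs may still be needed.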

pick one source => check the top 10 dest_host associated with that source -> able to find out which part of the route is drifted. *Can also check those src_host vs. dest_host pairs with ups and downs

one server can have multiple hosts (source_host, dest_host)

main goal: understand the general cases + understand the exceptional cases

General understanding: each site can have several perfSONAR instances configured (NTP -> tells which path is drifted, and by how much)

IPv6 and IPv4 should be treated differently

yuanlii commented 4 years ago

Nov.17,2019: Questions: automation => write a function (src_host, dest_host, from_date, to_date, interval): it can currently get trends for each specified src_host, dest_host and time range (1 day); what else is needed or would be good to add? e.g., a more precise time range (seconds, minutes, hours?)
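For reference, a function with that signature could look roughly like the sketch below. It is not the code from the notebook: it assumes the measurements are already loaded as dicts with `src_host`, `dest_host`, `owd_mean`, and a `timestamp` key (a hypothetical field name holding a `datetime`), and it takes `interval` as a `timedelta` so seconds, minutes, or hours all work.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def owd_trend(
    records: Iterable[Dict],
    src_host: str,
    dest_host: str,
    from_date: datetime,
    to_date: datetime,
    interval: timedelta = timedelta(hours=1),
) -> List[Tuple[datetime, float]]:
    """Mean owd_mean per time bucket for one src/dest pair.

    Returns (bucket_start, mean) pairs sorted by time; buckets with no
    measurements are left out.
    """
    buckets = defaultdict(list)
    for rec in records:
        if rec["src_host"] != src_host or rec["dest_host"] != dest_host:
            continue
        ts = rec["timestamp"]
        if not (from_date <= ts < to_date):
            continue
        # Whole number of intervals elapsed since from_date.
        index = (ts - from_date) // interval
        buckets[from_date + index * interval].append(rec["owd_mean"])
    return [(start, mean(values)) for start, values in sorted(buckets.items())]
```

Calling it with `interval=timedelta(minutes=5)` or `timedelta(days=1)` changes the resolution without touching the rest of the code.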


[Feedback]: 1. Hours would be better as the scale

  1. pick the src_host and its ten most frequent dest hosts (the ElasticSearch API can do this directly). The goal is to have dest_hosts with enough measurements. We want the more general cases: assume NTP is working just fine

  2. think of some correlation measures (check out what NumPy offers) --> if the source_host is in common, we can find out which dest_hosts and clocks are drifted
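The two feedback points can be sketched as follows. `top_dest_query` only builds a request body using Elasticsearch's `terms` aggregation; it assumes `src_host` and `dest_host` are indexed as keyword fields, and the aggregation name `top_dests` is made up. `top_dest_hosts` does the same count locally on records that are already loaded. `pearson` is a plain-Python version of the usual correlation coefficient (the quantity `numpy.corrcoef` reports). All three function names are hypothetical.

```python
from collections import Counter
from math import sqrt
from typing import Dict, Iterable, List, Sequence


def top_dest_query(src_host: str, n: int = 10) -> Dict:
    """Request body asking for the n most frequent dest_host values.

    Assumption: src_host and dest_host are keyword fields in the index.
    """
    return {
        "size": 0,  # only the aggregation is wanted, not the hits
        "query": {"term": {"src_host": src_host}},
        "aggs": {"top_dests": {"terms": {"field": "dest_host", "size": n}}},
    }


def top_dest_hosts(records: Iterable[Dict], src_host: str, n: int = 10) -> List[str]:
    """Local equivalent: the n most frequent dest hosts for one source."""
    counts = Counter(r["dest_host"] for r in records if r["src_host"] == src_host)
    return [host for host, _ in counts.most_common(n)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long, time-aligned series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / (sx * sy)
```

One way to use this: bucket the OWD series from one source to each of its top dest hosts on the same hourly grid, then correlate the series pairwise. If the series to all destinations move together, the shared source clock is the likelier culprit; if a single destination stands apart, that destination is. This reading is a heuristic, not something the notes establish.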