ovis-hpc / ldms

OVIS/LDMS High Performance Computing monitoring, analysis, and visualization project.
https://github.com/ovis-hpc/ovis-wiki/wiki
Other
100 stars 52 forks source link

store_sos: Replace assertion with type mismatch error message #1528

Closed nichamon closed 1 day ago

nichamon commented 2 days ago

Address Issue #1524

tom95858 commented 2 days ago

@nichamon, could you set up a test that configures the filesingle sampler and stores that data. Then change the filesingle configuration on the sampler and demonstrate that we get reasonable log messages?

baallan commented 2 days ago

Store-metric also needs to check set metric name == expected name in existing store schema; so far we usually got lucky there was also a type conflict.

nichamon commented 2 days ago

@tom95858 I can do that. However, I did some testing before I created the pull request using the test_sampler plugin. The new log messages got reported when: A sampler daemon created a set of two metrics of U64 type using test_sampler. An aggregator collected and stored the set using store_sos. Then, I killed both daemons. I changed the type of a metric and restarted the sampler daemon and the aggregator. store_sos reported the log messages from the patch.

Besides the above case, I also tested another scenario where I only restarted the sampler daemon and let the aggregator running. The storage policy logic caught the schema change and reported an error. This case won't hit the code in store_sos.

The tests I made cover the case of a type change and verify that store_sos reported the log message. Do you think I should test with the filesingle plugin? Are there any specific changes to the filesingle plugin configuration I should try?

tom95858 commented 1 day ago

This has been superseded by PR# 1532. @nichamon, let's leave the asserts in because theoretically, those issues conditions should never happen unless the SHA256 happens to collide. I wonder if we can remove those checks altogether for efficiency's sake.