IBMStreams / streamsx.utility

(Incubation) Contains utilities for IBM Streams

Add utility function to set thread name from within SPL and add a tuple counting utility operator #37

Closed jlkaus closed 7 years ago

jlkaus commented 7 years ago

Also, add a .gitignore file to avoid seeing toolkit.xml changes, and remove toolkit.xml from the repository, since it gets regenerated every build.

jlkaus commented 7 years ago

I put the AdaptiveTupleCounter in here too. Unrelated to the thread naming, but added basically at the same time and tested together.

scotts commented 7 years ago

I'm a little unclear on what AdaptiveTupleCounter does. Can you provide a brief explanation?

jlkaus commented 7 years ago

AdaptiveTupleCounter is just a simple way to count tuples on a stream; it calculates and traces throughput for the stream. It tries to be pretty lightweight, incurring clock checks, throughput calculations, and tracing costs only very infrequently, while still being fairly regular (time-wise) in how often the metric is traced, by roughly adapting to the current tuple rate.
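
To illustrate the idea described above, here is a minimal Python sketch of an adaptive counter. It is not the operator's actual implementation: the class layout, the `period`, `clock`, and `report` parameters, and the batch-size heuristic are all assumptions made for illustration. The point it shows is that the per-tuple path is just an increment and a comparison, while the clock is read only once per batch, and the batch size is re-estimated from the observed rate so that reports land roughly once per period.

```python
import time
from typing import Callable, Optional


class AdaptiveTupleCounter:
    """Hypothetical sketch: count tuples, report throughput about once per period."""

    def __init__(self, period: float = 1.0,
                 clock: Callable[[], float] = time.monotonic,
                 report: Optional[Callable[[float], None]] = None):
        self.period = period      # target seconds between reports (assumed parameter)
        self.clock = clock        # injectable so the sketch can be tested
        self.report = report or (lambda rate: print(f"throughput: {rate:.1f} tuples/s"))
        self.batch = 1            # tuples to count before the next clock check
        self.pending = 0          # tuples seen since the last clock check
        self.since_report = 0     # tuples seen since the last report
        self.last_report = clock()

    def on_tuple(self) -> None:
        self.pending += 1
        if self.pending < self.batch:
            return                # hot path: no clock read, no arithmetic
        self.since_report += self.pending
        self.pending = 0
        now = self.clock()
        elapsed = now - self.last_report
        if elapsed >= self.period:
            rate = self.since_report / elapsed
            self.report(rate)
            # Next clock check after about one period's worth of tuples.
            self.batch = max(1, int(rate * self.period))
            self.since_report = 0
            self.last_report = now
        elif elapsed > 0:
            # Too early: estimate how many tuples remain until the period ends.
            rate = self.since_report / elapsed
            self.batch = max(1, int(rate * (self.period - elapsed)))
        else:
            # Clock did not advance; back off by doubling the batch.
            self.batch *= 2
```

In this sketch, one `on_tuple()` call per incoming tuple is all the caller does; if the tuple rate drops sharply, a report is simply late by however long the current batch takes to fill, after which the batch size shrinks again.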

While the internal streams metrics stuff can get me roughly the same data, it's a lot more work. This, I don't have to think about, and the metrics can be pulled out of the PE traces live, or later, and graphed over time, etc., without having to muck about with periodic capturestates and XML parsing, let alone keeping a manual eye on the web console, or trying to use some interface to the view service. I've found this pretty useful in my exploratory work improving performance on several different applications, and it was suggested I put it in here, rather than keep dragging it around manually.

scotts commented 7 years ago

Agreed that is useful. Can you add an spldoc comment explaining this? Please also include that users can find the metrics in the application logs at the error level. After we merge, I'd also like to open an issue about printing/tracing level. But I don't want that issue to prevent the initial merge.

jlkaus commented 7 years ago

SPLDOC should be added now, with an example use, etc. Agree on the tracing level thing, I think. In my particular case, DEBUG or INFO has too much other stuff on to keep good performance overall, so I picked ERROR, but it would be nice to be configurable somehow. Wasn't sure how to choose a trace level dynamically like that.

scotts commented 7 years ago

Excellent, thanks. And agreed on difficulty with DEBUG and INFO. The rule I adhere to - but not all operators and toolkits do - is that anything logged at the INFO level must not be on the standard data path. That way INFO should be usable in most applications without a performance hit, and without generating an unmanageable amount of logging info. We may be able to get around this by just applying the logging level to that operator only.