IRC-SPHERE / HyperStream

HyperStream
https://irc-sphere.github.io/HyperStream/
MIT License

Tool Audit #27

Open tdiethe opened 6 years ago

tdiethe commented 6 years ago

We currently have a long list of (poorly documented) tools:

aggregate, aggregate_into_dict_and_apply, aggregate_plate, aligned_merge, aligning_window, apply, asset_plate_generator, asset_splitter, asset_writer, clock, component, component_filter, component_set, dict_argmax, dict_values_to_const, histogram_from_list, histograms_to_csv, index_of, index_of_by_stream, jsonify, list_dict_mean, list_dict_sum, list_length, list_mean, list_sum, meta_instance, meta_instance_from_list, percentiles_from_list, percentiles_to_csv, plate_sink, product, relative_apply, relative_apply2, relative_window, sink, slice, sliding_apply, sliding_listify, sliding_sink, sliding_window, splitter, splitter_from_stream, splitter_of_dict, splitter_of_list, splitter_time_aware, splitter_time_aware_from_stream, splitter_with_unlist, stream_broadcaster, stream_broadcaster_from_stream

We should attach usage numbers to each of these for sphere-hyperstream, and determine whether any should be pruned or renamed. For those that remain, the documentation should be improved.

perellonieto commented 6 years ago

I see in BaseTool (https://github.com/IRC-SPHERE/HyperStream/blob/master/hyperstream/tool/base_tool.py) that at least one debug-level log message is generated every time one of the tools is created.

Would it be enough to set the logging level to DEBUG and grep the standard output to count the occurrences of each tool?
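
As a rough sketch of that idea, the snippet below counts how often each tool name appears in DEBUG lines. The log lines are made up for illustration; I have not checked the exact message format BaseTool emits, so the function only assumes that the tool name appears somewhere in the line as a whole identifier. The name `count_tool_mentions` is hypothetical.

```python
import re
from collections import Counter

# Identifier-like tokens, so "relative_apply" is not counted inside "relative_apply2".
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def count_tool_mentions(log_lines, tool_names):
    """Count whole-identifier mentions of each tool name in DEBUG log lines."""
    names = set(tool_names)
    counts = Counter({name: 0 for name in names})
    for line in log_lines:
        if "DEBUG" not in line:
            continue  # only debug-level lines are of interest here
        for token in TOKEN.findall(line):
            if token in names:
                counts[token] += 1
    return counts


# Made-up log lines; the real message format is an assumption.
sample_log = [
    "DEBUG created tool sliding_window",
    "INFO executing workflow",
    "DEBUG created tool relative_apply2",
    "DEBUG created tool sliding_window",
]
log_counts = count_tool_mentions(
    sample_log, ["sliding_window", "relative_apply", "relative_apply2", "clock"]
)
```

Tools with generic names such as apply, slice or product could be over-counted if those words appear in unrelated debug messages, so the numbers would need a sanity check.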

Would it be better to augment BaseTool so that the count is stored in MongoDB or in memory?
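
For the in-memory option, something along these lines might work. This is only a sketch: `CountingTool`, `SlidingWindow` and `Clock` are hypothetical stand-ins, not HyperStream's actual BaseTool or tool classes, and the real constructor signature is not assumed.

```python
from collections import Counter


class CountingTool:
    """Hypothetical stand-in for a tool base class (not HyperStream's BaseTool)."""

    # Shared across all subclasses: class name -> number of instances created.
    usage = Counter()

    def __init__(self, **kwargs):
        CountingTool.usage[type(self).__name__] += 1
        self.parameters = kwargs


class SlidingWindow(CountingTool):
    pass


class Clock(CountingTool):
    pass


# Creating tools updates the shared counter as a side effect.
SlidingWindow(lower=-10, upper=0)
SlidingWindow(lower=-30, upper=0)
Clock(stride=5)
```

Persisting the counter to MongoDB instead would let the numbers accumulate across runs, at the cost of a database write each time a tool is created.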

tdiethe commented 6 years ago

Interesting - hadn't thought of automating it - was just thinking to go through all of the SPHERE workflow scripts by hand. Trouble is you'd have to execute every workflow to be sure?
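
One way to avoid executing anything might be a static scan of the scripts for tool names. The sketch below is an assumption about how that could look: it treats any identifier-like token (including the contents of string literals) that equals a tool name as a use, without knowing how the workflow scripts actually reference tools. `scan_scripts` and the demo file are hypothetical.

```python
import re
import tempfile
from collections import Counter
from pathlib import Path

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def scan_scripts(root, tool_names):
    """Count tokens equal to a tool name in every .py file under root."""
    names = set(tool_names)
    counts = Counter()
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        counts.update(t for t in IDENTIFIER.findall(text) if t in names)
    return counts


# Demo on a throwaway directory containing one made-up script.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "example_workflow.py"
    script.write_text(
        'tool_a = "sliding_window"\ntool_b = "clock"\ntool_c = "sliding_window"\n',
        encoding="utf-8",
    )
    static_counts = scan_scripts(tmp, ["sliding_window", "clock", "jsonify"])
```

This would over-count generic names like apply, slice and product, which also occur in ordinary Python code, and it says nothing about how often a tool runs, only how often it is mentioned.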