vsoch / watchme

Reproducible watchers for research
https://vsoch.github.io/watchme/
Mozilla Public License 2.0
851 stars 32 forks source link

[WIP] Add/exporters #32

Open vsoch opened 5 years ago

vsoch commented 5 years ago

This is a work in progress to add (optional) exporters for a watcher. The PR does the following:

I still need to:

vsoch commented 5 years ago

Updates - I have done most of the work to get pushgateway working and documented as an exporter. What remains to be done is more thinking about push. Specifically, I think it should be used to push complete data (meaning commits, dates, and results) to some exporter. This does mean, however, that some exporters that don't handle dictionary / json (akin to pushgateway) wouldn't be best suited. However, pushgateway works fine when coinciding with a run, so the user probably wouldn't use push.

@SCHKN I'll do more thinking about push (don't test / use it, it's not written) and in the meantime, here are docs for overall exporters and pushgateway - could you take a look?

watchme-exporters.pdf watchme-exporter-pushgateway.pdf

Also, I'm curious about your feedback to #33. I like the idea of exporters, but I'm worried if there aren't a significant of other ones we can think of (that people would actually want to use) adding them might be overkill (feature bloat). On the other hand, if there are (other?) reasonable places that are hard to stream data to, it would be super cool if watchme could make that easy. I'm not super experienced with streaming data but I'm going to poke around the internets tonight :) I'll do more work on push later this week and update you when there is something there too.

vsoch commented 5 years ago

Ah and before I forget - @SCHKN we need to have a way to know if an exporter fails (and then we can return failed value and disable). I'm looking into that now, along with why / if the pushgateway can handle a dictionary (any reason why not?)

vsoch commented 5 years ago

okay, I've added the push command for pushgateway. Unfortunately, since the two values are already added (or something else?) it returns an error.

vanessa@vanessa-ThinkPad-T460s:~/Documents/Dropbox/Code/Python/watchme$ watchme push temp-watcher task-luxembourg exporter-pushgateway result-0.txt
git log --all --oneline --pretty=tformat:"%H" --grep "ADD results" 3bc9033184776d6ccce4eb1dd76b00bea7ae05d8..da08a65478216d5afeb4362066d1c28ecd178173 -- task-luxembourg/result-0.txt
ERROR An exception occurred while trying to export data using exporter-pushgateway

specifically:

    105     def labels(self, *labelvalues, **labelkwargs):

~/anaconda3/lib/python3.6/site-packages/prometheus_client/registry.py in register(self, collector)
     27                 raise ValueError(
     28                     'Duplicated timeseries in CollectorRegistry: {0}'.format(
---> 29                         duplicates))
     30             for name in names:
     31                 self._names_to_collectors[name] = collector

ValueError: Duplicated timeseries in CollectorRegistry: {'exporter:pushgateway'}

I think I've done some good changes in this PR (for example, moving the task functions to be with the client) but I'm not convinced this is a good feature to add overall. Let's please discuss!

SCHKN commented 5 years ago

Thank you @vsoch for the amazing work. I'll try to go through it as soon as possible and give you some feedback on it to move further.

vsoch commented 5 years ago

Just resolved merged conflicts - hopefully I didn't miss anything but please poke if I did! (and I'll watch the CI too).

vsoch commented 5 years ago

Hey! I just had a bit of a nutty idea that I think has feet - one that would allow for exporting results to the push gateway without adding an entire new exporters group (which feels like it has a lot of duplication for watchers, and makes the library very heavy).

What if we just added a Prometheus watcher group, and it would have a task to (given some other watcher and tasks as variables) export some derivation of results and push to the gateway? The only real difference would be that we collect the users preferences for the watchers and tasks from variables, and the results would be read from get or from the present files directly. If a user wanted to have a run coincide with the push, they would just run the commands to run the two tasks one after the other. What do you think? It definitely brings up some interesting ideas to about orchestration or interaction between tasks :-)

vsoch commented 5 years ago

@SCHKN see here - https://github.com/vsoch/watchme/issues/38 I wrote out a dummy example of how it might look!