NSLS-II / nslsii

NSLS-II related devices
BSD 3-Clause "New" or "Revised" License

MNT: Create loggers.py and add Elasticsearch handler #49

Open ke-zhang-rd opened 5 years ago

ke-zhang-rd commented 5 years ago

Create loggers for caproto, bluesky, and ophyd that send their records to Elasticsearch.
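
Roughly, the handler is a logging.Handler subclass that POSTs each record to the Elasticsearch HTTP API. A minimal sketch of the idea (the port, index name, and document fields here are placeholders, not necessarily what loggers.py ends up using):

import logging

import requests


class ElasticHandler(logging.Handler):
    """Send each log record to an Elasticsearch index over HTTP."""

    def __init__(self, url='http://elasticsearch.cs.nsls2.local:9200', index='test'):
        super().__init__()
        self.url = url
        self.index = index

    def emit(self, record):
        try:
            doc = {'name': record.name,
                   'levelname': record.levelname,
                   'message': self.format(record),
                   'created': record.created}
            # POST and let Elasticsearch assign a unique ID to each entry.
            requests.post(f'{self.url}/{self.index}/_doc', json=doc)
        except Exception:
            self.handleError(record)


# Attach a handler to each library we want to capture.
for name in ('caproto', 'bluesky', 'ophyd'):
    logging.getLogger(name).addHandler(ElasticHandler())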

codecov-io commented 5 years ago

Codecov Report

Merging #49 into master will decrease coverage by 0.09%. The diff coverage is 30.55%.


@@            Coverage Diff            @@
##           master      #49     +/-   ##
=========================================
- Coverage   42.46%   42.37%   -0.1%     
=========================================
  Files          10       11      +1     
  Lines         584      616     +32     
=========================================
+ Hits          248      261     +13     
- Misses        336      355     +19
Impacted Files                       Coverage Δ
nslsii/common/ipynb/animation.py     18.75% <0%> (ø)
nslsii/common/ipynb/info.py          55% <100%> (ø)
nslsii/loggers.py                    23.33% <23.33%> (ø)
nslsii/__init__.py                   6.72% <50%> (+0.73%)
nslsii/_version.py                   44.4% <0%> (+1.8%)

Continue to review full report at Codecov.


danielballan commented 5 years ago

To test, we need access to a beamline machine (so it can see elasticsearch.cs.nsls2.local). I would do the following.

ssh xf23id1-srv1
# Create and activate a new conda environment.
conda create -n test-logger python=3.6
conda activate test-logger
# Install the packages we need with pip, including a development version of nslsii.
pip install bluesky ophyd caproto ipython
git clone https://github.com/ke-zhang-rd/nslsii
cd nslsii
git checkout logger
pip install -e .
cd ..
# Start IPython.
ipython

Then in IPython:

import nslsii.loggers
from bluesky import RunEngine
RE = RunEngine({})
RE([])  # generates log messages that should go to Elasticsearch

Notice that I have not used bsui or any IPython profiles, so this will not interfere with any beamline operations.

Next steps:

danielballan commented 5 years ago

Also, we wondered during our Slack call about the difference between PUT and POST. If you use PUT, you must specify a unique ID for that log entry. Thus, PUT is idempotent, because PUT-ing it a second time with the same ID is redundant. If you use POST, you leave it up to Elastic to generate a unique ID for you. Thus, POST is not idempotent, because POST-ing a second time will generate a second entry with a new, separate unique ID.

I learned that from this documentation page: https://www.elastic.co/guide/en/elasticsearch/guide/current/index-doc.html

It suggests that you should use PUT if and only if there is a "natural," obvious unique ID. Otherwise, use POST and let Elastic generate an internal unique ID for you. For these log messages*, I think there is no natural unique ID, so we should use POST. Do you agree?

*There is one possible exception to this statement. The log messages from bluesky that say, "A Document has been emitted with uid=..." could use the document's uid as part of the unique ID for the log message. I am not sure that's the right thing to do here...maybe something to come back to later.
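
For concreteness, here is what the two variants look like with requests (the host, port, index, and document body below are just placeholders):

import requests

doc = {'name': 'bluesky', 'levelname': 'INFO', 'message': 'an example log entry'}

# PUT: we choose the ID; repeating the request with the same ID overwrites the same document (idempotent).
requests.put('http://elasticsearch.cs.nsls2.local:9200/test/_doc/my-unique-id', json=doc)

# POST: Elastic generates the ID; repeating the request creates a second document (not idempotent).
requests.post('http://elasticsearch.cs.nsls2.local:9200/test/_doc', json=doc)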

danielballan commented 5 years ago

The handler itself looks done to me. Next steps:

  1. A little thing: we should add requests to requirements.txt.
  2. We should get the index name (currently 'test') set to something real. I suggest moving all the code below the ElasticHandler definition into a function (configure_elastic, perhaps, to follow the pattern set by configure_base and configure_olog) and having that function accept a beamline acronym (like 'csx') as a string. Then the index can be csx.bluesky, as planned in the Kafka topics document. (A sketch follows this list.)
  3. We need to set the levels of each of our handlers. I think bluesky should be set to DEBUG, and ophyd and caproto should be set to INFO (because DEBUG generates more information than we can afford to store, and because POST-ing all of that data to Elastic would cause data acquisition to slow down significantly).
  4. We should then set the levels of all the default handlers, such as bluesky.current_handler, to 'WARNING'.
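
Putting items 2–4 together, a rough sketch of what configure_elastic might look like (this assumes the ElasticHandler from this PR with an index argument, and uses one index per library, e.g. csx.bluesky; the exact signature and index naming are up for discussion):

import logging

from nslsii.loggers import ElasticHandler  # the handler added in this PR


def configure_elastic(beamline):
    """Attach Elasticsearch handlers for bluesky, ophyd, and caproto at one beamline."""
    levels = {'bluesky': logging.DEBUG, 'caproto': logging.INFO, 'ophyd': logging.INFO}
    for name, level in levels.items():
        handler = ElasticHandler(index=f'{beamline}.{name}')
        handler.setLevel(level)
        logger = logging.getLogger(name)
        logger.setLevel(level)  # the logger itself must emit records at this level
        logger.addHandler(handler)
    # Item 4: quiet the default handlers; bluesky.current_handler is the one named above,
    # and the ophyd and caproto equivalents would be treated the same way.
    import bluesky
    bluesky.current_handler.setLevel('WARNING')
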
mrakitin commented 4 years ago

@jklynch, could this be useful and complementary to your recent work in #80? @ke-zhang-rd, what was the blocking issue here?

jklynch commented 4 years ago

We decided this is out of scope for #80. I do not want to do any further logging development until we have had time to evaluate the new state of logging on the floor.

mrakitin commented 4 years ago

OK, I see.