NSLS-II / nslsii

NSLS-II related devices
BSD 3-Clause "New" or "Revised" License

MNT: Create loggers.py and add Elasticsearch handler #49

Open ke-zhang-rd opened 5 years ago

ke-zhang-rd commented 5 years ago

Create loggers for caproto, bluesky, and ophyd that send their records to Elasticsearch.
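
Roughly, the handler is a logging.Handler subclass that POSTs each record to the Elasticsearch HTTP API. A minimal sketch of the idea (the port, index name, and document fields here are placeholders, not necessarily what loggers.py ends up using):

import logging

import requests


class ElasticHandler(logging.Handler):
    """Send each log record to an Elasticsearch index over HTTP."""

    def __init__(self, url='http://elasticsearch.cs.nsls2.local:9200', index='test'):
        super().__init__()
        self.url = url
        self.index = index

    def emit(self, record):
        try:
            doc = {'name': record.name,
                   'levelname': record.levelname,
                   'message': self.format(record),
                   'created': record.created}
            # POST and let Elasticsearch assign a unique ID to each entry.
            requests.post(f'{self.url}/{self.index}/_doc', json=doc)
        except Exception:
            self.handleError(record)


# Attach a handler to each library we want to capture.
for name in ('caproto', 'bluesky', 'ophyd'):
    logging.getLogger(name).addHandler(ElasticHandler())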

codecov-io commented 5 years ago

Codecov Report

Merging #49 into master will decrease coverage by 0.09%. The diff coverage is 30.55%.


@@            Coverage Diff            @@
##           master      #49     +/-   ##
=========================================
- Coverage   42.46%   42.37%   -0.1%     
=========================================
  Files          10       11      +1     
  Lines         584      616     +32     
=========================================
+ Hits          248      261     +13     
- Misses        336      355     +19
Impacted Files                       Coverage Δ
nslsii/common/ipynb/animation.py     18.75% <0%> (ø)
nslsii/common/ipynb/info.py          55% <100%> (ø)
nslsii/loggers.py                    23.33% <23.33%> (ø)
nslsii/__init__.py                   6.72% <50%> (+0.73%)
nslsii/_version.py                   44.4% <0%> (+1.8%)

Continue to review full report at Codecov.


danielballan commented 5 years ago

To test, we need access to a beamline machine (so it can see elasticsearch.cs.nsls2.local). I would do the following.

ssh xf23id1-srv1
# Create and activate a new conda environment.
conda create -n test-logger python=3.6
conda activate test-logger
# Install the packages we need with pip, including a development version of nslsii.
pip install bluesky ophyd caproto ipython
git clone https://github.com/ke-zhang-rd/nslsii
cd nslsii
git checkout logger
pip install -e .
cd ..
# Start IPython.
ipython

Then in IPython:

import nslsii.loggers
from bluesky import RunEngine
RE = RunEngine({})
RE([])  # generates log messages that should go to Elasticsearch

Notice that I have not used bsui or any IPython profiles, so this will not interfere with any beamline operations.

Next steps:

danielballan commented 5 years ago

Also, we wondered during our Slack call about the difference between PUT and POST. If you use PUT, you must specify a unique ID for that log entry. Thus, PUT is idempotent, because PUT-ing it a second time with the same ID is redundant. If you use POST, you leave it up to Elastic to generate a unique ID for you. Thus, POST is not idempotent, because POST-ing a second time will generate a second entry with a new, separate unique ID.

I learned that from this documentation page: https://www.elastic.co/guide/en/elasticsearch/guide/current/index-doc.html

It suggests that you should use PUT if and only if there is a "natural," obvious unique ID. Otherwise, use POST and let Elastic generate an internal unique ID for you. For these log messages*, I think there is no natural unique ID, so we should use POST. Do you agree?

*There is one possible exception to this statement. The log messages from bluesky that say, "A Document has been emitted with uid=..." could use the document's uid as part of the unique ID for the log message. I am not sure that's the right thing to do here...maybe something to come back to later.
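
For concreteness, here is what the two variants look like with requests (the host, port, index, and document body below are just placeholders):

import requests

doc = {'name': 'bluesky', 'levelname': 'INFO', 'message': 'an example log entry'}

# PUT: we choose the ID; repeating the request with the same ID overwrites the same document (idempotent).
requests.put('http://elasticsearch.cs.nsls2.local:9200/test/_doc/my-unique-id', json=doc)

# POST: Elastic generates the ID; repeating the request creates a second document (not idempotent).
requests.post('http://elasticsearch.cs.nsls2.local:9200/test/_doc', json=doc)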

danielballan commented 5 years ago

The handler itself looks done to me. Next steps:

  1. A little thing: we should add requests to requirements.txt.
  2. We should get the index name (currently 'test') set to something real. I suggest moving all the code below the ElasticHandler definition into a function (configure_elastic, perhaps, to follow the pattern set by configure_base and configure_olog) and having that function accept a beamline acronym (like 'csx') as a string. Then the index can be csx.bluesky, as planned in the Kafka topics document. (A sketch follows this list.)
  3. We need to set the levels of each of our handlers. I think bluesky should be set to DEBUG, and ophyd and caproto should be set to INFO (because DEBUG generates more information than we can afford to store, and because POST-ing all of that data to Elastic would cause data acquisition to slow down significantly).
  4. We should then set the levels of all the default handlers, such as bluesky.current_handler, to 'WARNING'.
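
Putting items 2–4 together, a rough sketch of what configure_elastic might look like (this assumes the ElasticHandler from this PR with an index argument, and uses one index per library, e.g. csx.bluesky; the exact signature and index naming are up for discussion):

import logging

from nslsii.loggers import ElasticHandler  # the handler added in this PR


def configure_elastic(beamline):
    """Attach Elasticsearch handlers for bluesky, ophyd, and caproto at one beamline."""
    levels = {'bluesky': logging.DEBUG, 'caproto': logging.INFO, 'ophyd': logging.INFO}
    for name, level in levels.items():
        handler = ElasticHandler(index=f'{beamline}.{name}')
        handler.setLevel(level)
        logger = logging.getLogger(name)
        logger.setLevel(level)  # the logger itself must emit records at this level
        logger.addHandler(handler)
    # Item 4: quiet the default handlers; bluesky.current_handler is the one named above,
    # and the ophyd and caproto equivalents would be treated the same way.
    import bluesky
    bluesky.current_handler.setLevel('WARNING')
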
mrakitin commented 4 years ago

@jklynch, could this be useful and complementary to your recent work in #80? @ke-zhang-rd, what was the blocking issue here?

jklynch commented 4 years ago

We decided this is out of scope for #80. I do not want to do any further logging development until we have had time to evaluate the new state of logging on the floor.

mrakitin commented 4 years ago

OK, I see.