NSLS-II / Bug-Reports

Unified issue-tracker for bugs in the data acquisition, management, and analysis software at NSLS-II
BSD 3-Clause "New" or "Revised" License

Log timeout problem #126

Open cmazzoli opened 8 years ago

cmazzoli commented 8 years ago

@tacaswell this is the issue I was trying to recall...


In [612]: RE(ct())
Transient Scan ID: 66210
Persistent Unique Scan ID: '8e485e10-f098-4f9d-b02f-163518052b16'
logit Flat field 66210 @ 778eV
/home/xf23id1/conda_envs/collection/lib/python3.4/site-packages/bluesky/run_engine.py:1821: UserWarning: A ReadTimeout(ReadTimeoutError("HTTPSConnectionPool(host='xf23id1-log.cs.nsls2.local', port=8181): Read timed out. (read timeout=30)",),) was raised during the processing of a start Document. The error will be ignored to avoid interrupting data collection. To investigate, set RunEngine.ignore_callback_exceptions = False and run again.
  "and run again." % (exc, name.name))

Any idea why this happens and what the impact on our data is? Please let me know.

tacaswell commented 8 years ago

That is olog timing out (it used to completely hang the collection process). Data collection should go fine, but you will not have an olog entry to go with it.

cmazzoli commented 8 years ago

Hi Tom @tacaswell, thanks. I would just like to add that:

I hope that this might help in finding the cause. Please let me know.

Last but not least: are the missing entries lost or can they be recovered?

Thanks.

tacaswell commented 8 years ago

There is nothing in the olog entry that is not also in the runstart.

If you just pass the start document to the logbook callback, it will re-create your missing entry.

danielballan commented 8 years ago

This has come up a couple times; I should write a working example of this. Leave this issue open until I get around to that.

cmazzoli commented 8 years ago

Hi guys (@tacaswell @danielballan), this problem is repeatedly plaguing our operations.

I appreciate very much the warning instead of a measurement crash. I would like to ask about:

Please let me know what you think.

danielballan commented 8 years ago

attn @shroffk Would you be willing to get more involved in the integration between the NSLS-II data acquisition stack and Olog? We could use some more hands on this.

tacaswell commented 8 years ago

You can customize the formatting of the olog entry however you want via jinja2 templates; see https://github.com/NSLS-II/bluesky/blob/master/bluesky/callbacks/olog.py#L43

There should be a function called configured_logbook_func created in your 00-startup.py (it is the result of calling logbook_cb_factory), which you can call with:

configured_logbook_func('start', hdr.start)

and it will create the missing entry (or create a duplicate entry if it is already there).
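
Since replaying can create duplicates, a small helper that remembers which runs it has already sent may be useful. This is only a sketch: `replay_start_documents` and its arguments are hypothetical names, not part of bluesky. It only guards against sending the same start document twice through this helper; it does not check what is already in Olog. It assumes `logbook_func` takes `(name, doc)` as in the call above and that each start document has a `uid` key.

```python
def replay_start_documents(logbook_func, start_docs, already_replayed=None):
    """Send each start document to the logbook callback at most once.

    Hypothetical helper, not part of bluesky.

    Returns (replayed, failed):
      replayed -- list of uids that were sent without error
      failed   -- dict mapping uid -> the exception raised for that document
    """
    # Reuse the caller's set so repeated calls keep skipping finished uids.
    seen = set() if already_replayed is None else already_replayed
    replayed = []
    failed = {}
    for start in start_docs:
        uid = start["uid"]
        if uid in seen:
            # Already sent through this helper; skip to avoid a duplicate entry.
            continue
        try:
            logbook_func("start", start)
        except Exception as exc:
            # Olog may time out again; keep going and report the failure.
            failed[uid] = exc
        else:
            seen.add(uid)
            replayed.append(uid)
    return replayed, failed
```

Passing the same `already_replayed` set on every call means a uid that failed can be retried later, while a uid that succeeded is never sent again by this helper.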

cmazzoli commented 8 years ago

Hi Tom, thanks. I checked the 00-startup but it doesn't contain anything about the logbook. Instead we have 01-olog-integration and it contains something similar to what you report (logbook_cb_factory). You can find it under: /home/xf23id1/xf23id1_profiles/startup

Do you mean that if I run configured_logbook_func() from Bluesky it should rebuild the missing scans? Sorry, but I am not sure I understand what this function is supposed to do, as it just calls SimpleOlogClient()... Thanks!

danielballan commented 8 years ago

That is the correct file. If you know the missing scan ids, do something like:

for scan_id in missing_scan_ids:
    hdr = db[scan_id]
    configured_logbook_func('start', hdr.start)

It is theoretically possible to detect missing entries automatically, but that would take some effort. Maybe @shroffk has the bandwidth for that?
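
Short of querying Olog, one cheap approximation is to note which start documents failed at the moment they fail. The sketch below is only an illustration under assumptions: `FailureRecordingCallback` is a hypothetical wrapper, not something bluesky provides, and it assumes the wrapped function has the same `(name, doc)` signature as configured_logbook_func. It re-raises the error, so the RunEngine still applies its own ignore_callback_exceptions policy.

```python
class FailureRecordingCallback:
    """Wrap a logbook callback and remember start documents it failed on.

    Hypothetical wrapper, not part of bluesky.
    """

    def __init__(self, logbook_func):
        self.logbook_func = logbook_func
        self.failed_starts = []  # start documents whose logging raised

    def __call__(self, name, doc):
        try:
            self.logbook_func(name, doc)
        except Exception:
            if name == "start":
                self.failed_starts.append(doc)
            # Re-raise so the caller decides whether to warn or to stop.
            raise

    def retry_failed(self):
        """Retry every recorded start document; return how many still fail."""
        still_failing = []
        for doc in self.failed_starts:
            try:
                self.logbook_func("start", doc)
            except Exception:
                still_failing.append(doc)
        self.failed_starts = still_failing
        return len(still_failing)
```

The idea would be to subscribe an instance of this wrapper in place of configured_logbook_func, then call `retry_failed()` once Olog is responsive again. The list lives only in the current session, so failures from earlier sessions would still need the manual loop above.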