odin-detector / odin-data

DAQ software libraries for capturing and storing data from parallel detector systems
https://odin-detector.github.io/odin-data/
Apache License 2.0
8 stars 11 forks source link

Improve frameReceiver initial configuration #335

Closed timcnicholls closed 6 months ago

timcnicholls commented 6 months ago

This PR improves the frameReceiver initial configuration behaviour to avoid early default configuration of the frame ready and frame release notification endpoints, and to make the overall behaviour more consistent with the frameProcessor.

The ready and release endpoints no longer get bound to default endpoints before any configuration file has been parsed. This avoids a potential clash with a second instance on the same host and allows multiple frame receivers to be started with minimal command-line options prior to client configuration. Note that this change requires the frame_ready_endpoint and frame_release_endpoint parameters to be explicitly configured.

Internally, the default FR configuration object has been delegated from the main application to the FrameReceiverController instance and no longer sets up default values for endpoints.

timcnicholls commented 6 months ago

@gdy - wondering if this PR would also be a good opportunity to fix #323 ??

GDYendell commented 6 months ago

wondering if this PR would also be a good opportunity to fix https://github.com/odin-detector/odin-data/issues/323 ??

I have just tried this against 1.9.0 and it doesn't seem to be a problem. If the file doesn't exist then it errors with a sensible message (which I think is reasonable and better than failing and carrying on):

❯ /dls_sw/prod/tools/RHEL7-x86_64/odin-data/1.9.0+dls2/prefix/bin/frameReceiver --config not.json
0 [0x7f1327160b00] ERROR FR.App null - Unable to open configuration file not.json for parsing
❯ /dls_sw/prod/tools/RHEL7-x86_64/odin-data/1.9.0+dls2/prefix/bin/frameProcessor --config not.json
0 [0x7ff64933eb00] ERROR FP.App null - Unable to open configuration file not.json for parsing

If file exists and is empty then it just doesn't configure anything, which is probably OK. It would be more complicated to test that we didn't manage to find any valid configs in the file.

timcnicholls commented 6 months ago

I have just tried this against 1.9.0 and it doesn't seem to be a problem. If the file doesn't exist then it errors with a sensible message (which I think is reasonable and better than failing and carrying on):

Hmm, that's odd. I can't directly test 1.9.0 on my system as Ubuntu needs later PRs (e.g. #316) to fix boost placeholder issues. But if I run the FP build from the head of master I still get tracebacks:

$ bin/frameProcessor --config not.json
0 [0x7f8ae81b5e80] INFO FP.App null - frameProcessor version 1.9.0-43-g36284dd starting up
2 [0x7f8ae81b5e80] ERROR FP.App null - Caught unhandled exception in FrameProcessor, application will terminate: Incorrect or empty JSON configuration file specified
terminate called after throwing an instance of 'OdinData::OdinDataException'
  what():  Incorrect or empty JSON configuration file specified
Caught signal 6 (SIGABRT)
stack trace:
bin/frameProcessor(_ZN8OdinData17print_stack_traceEP8_IO_FILEj+0x11c)[0x55b3551ca5a6]
bin/frameProcessor(_ZN8OdinData13abort_handlerEiP9siginfo_tPv+0xfa)[0x55b3551ca717]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f8ae87e0520]
/lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7f8ae88349fc]
/lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7f8ae87e0476]
/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7f8ae87c67f3]
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xa2b9e)[0x7f8ae8a89b9e]

I agree it should terminate, not try and carry on though. The FR does this - prints an error message and shuts down cleanly.

GDYendell commented 6 months ago

Yes I do find the same in my dev version, but for some reason the version we have in prod doesn't...

In any case, this

what(): Incorrect or empty JSON configuration file specified

is an exception we explicitly raise here. So I think we just need to either add a catch for OdinDataException and print a slightly nicer message that isn't "unhandled", or log an error, return 1, and handle the return code from run().

timcnicholls commented 6 months ago

Ah yes, I was just looking at the same thing on your 1.9.0+dls2 version on /dls_sw . That version predates #317 when the old ini file config was removed so your invocation above is erroring cleanly on failing to parse an old config file, not JSON.

I agree, we should catch the OdinDataException and print a cleaner message. As you say, neither FR nor FP currently change the exit code based on failures in app->run() in such circumstances . I'm happy to clean this up on both if you like? Might be good to do on this PR?

GDYendell commented 6 months ago

Yep I think that would be good to do here.

GDYendell commented 6 months ago

Looks good. I am happy to merge this and close #323 . Could we then have a 1.9.1 release?

timcnicholls commented 6 months ago

Thanks. I was thinking of a 1.10.0 release as it's a feature and requires config changes (i.e. FR endpoints in config file). If possible I'd also like to have PR #334 approved and rolled into the next release.