DUNE-DAQ / listrev

list reversal -- example software that demonstrates how a user would create DAQModules that run in the App Fwk.

Some questions on the use of the commtest_gen script on the feature/iomanager branch #27

Closed bieryAtFnal closed 2 years ago

bieryAtFnal commented 2 years ago

Some questions that I have, based on the output below:

  1. If the --host option has a default of localhost, why does commtest_gen complain when I don't specify a --host option?
  2. Could the help text mention that two --host instances are required? And, that more than two has a special behavior?
  3. Why does a run with a configuration generated with --host localhost --host localhost fail? (There are messages in the logs about failures to resolve hostnames.)
(dbt-pyvenv) [biery@mu2edaq13 rundir]$ commtest_gen --help
Usage: commtest_gen [OPTIONS] JSON_DIR

Options:
  -p, --partition-name TEXT       Name of the partition to use, for ERS and
                                  OPMON  [default: global]
  --host TEXT                     Hosts to run test programs on  [default:
                                  localhost]
  --ints-per-list INTEGER         Number of integers in the list  [default: 4]
  --wait-ms INTEGER               Number of ms to wait between list sends
                                  [default: 1000]
  --opmon-impl [json|cern|pocket]
                                  Info collector service implementation to use
                                  [default: json]
  --ers-impl [local|cern|pocket]  ERS destination (Kafka used for cern and
                                  pocket)  [default: local]
  --pocket-url TEXT               URL for connecting to Pocket services
                                  [default: 127.0.0.1]
  -h, --help                      Show this message and exit.
(dbt-pyvenv) [biery@mu2edaq13 rundir]$ commtest_gen basic_commtest
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│                                                                                                  │
│ /home/biery/dunedaq/28AprIOMgrTest/install/listrev/bin/commtest_gen:115 in <module>              │
│                                                                                                  │
│   112                                                                                            │
│   113 if __name__ == '__main__':                                                                 │
│   114 │   try:                                                                                   │
│ ❱ 115 │   │   cli(show_default=True, standalone_mode=True)                                       │
│   116 │   except Exception as e:                                                                 │
│   117 │   │   console.print_exception()                                                          │
│   118                                                                                            │
│ /home/biery/dunedaq/28AprIOMgrTest/dbt-pyvenv/lib/python3.8/site-packages/click/core.py:1130 in  │
│ __call__                                                                                         │
│                                                                                                  │
│   1127 │                                                                                         │
│   1128 │   def __call__(self, *args: t.Any, **kwargs: t.Any) -> t.Any:                           │
│   1129 │   │   """Alias for :meth:`main`."""                                                     │
│ ❱ 1130 │   │   return self.main(*args, **kwargs)                                                 │
│   1131                                                                                           │
│   1132                                                                                           │
│   1133 class Command(BaseCommand):                                                               │
│                                                                                                  │
│ /home/biery/dunedaq/28AprIOMgrTest/dbt-pyvenv/lib/python3.8/site-packages/click/core.py:1055 in  │
│ main                                                                                             │
│                                                                                                  │
│   1052 │   │   try:                                                                              │
│   1053 │   │   │   try:                                                                          │
│   1054 │   │   │   │   with self.make_context(prog_name, args, **extra) as ctx:                  │
│ ❱ 1055 │   │   │   │   │   rv = self.invoke(ctx)                                                 │
│   1056 │   │   │   │   │   if not standalone_mode:                                               │
│   1057 │   │   │   │   │   │   return rv                                                         │
│   1058 │   │   │   │   │   # it's not safe to `ctx.exit(rv)` here!                               │
│                                                                                                  │
│ /home/biery/dunedaq/28AprIOMgrTest/dbt-pyvenv/lib/python3.8/site-packages/click/core.py:1404 in  │
│ invoke                                                                                           │
│                                                                                                  │
│   1401 │   │   │   echo(style(message, fg="red"), err=True)                                      │
│   1402 │   │                                                                                     │
│   1403 │   │   if self.callback is not None:                                                     │
│ ❱ 1404 │   │   │   return ctx.invoke(self.callback, **ctx.params)                                │
│   1405 │                                                                                         │
│   1406 │   def shell_complete(self, ctx: Context, incomplete: str) -> t.List["CompletionItem"]:  │
│   1407 │   │   """Return a list of completions for the incomplete value. Looks                   │
│                                                                                                  │
│ /home/biery/dunedaq/28AprIOMgrTest/dbt-pyvenv/lib/python3.8/site-packages/click/core.py:760 in   │
│ invoke                                                                                           │
│                                                                                                  │
│    757 │   │                                                                                     │
│    758 │   │   with augment_usage_errors(__self):                                                │
│    759 │   │   │   with ctx:                                                                     │
│ ❱  760 │   │   │   │   return __callback(*args, **kwargs)                                        │
│    761 │                                                                                         │
│    762 │   def forward(                                                                          │
│    763 │   │   __self, __cmd: "Command", *args: t.Any, **kwargs: t.Any  # noqa: B902             │
│                                                                                                  │
│ /home/biery/dunedaq/28AprIOMgrTest/install/listrev/bin/commtest_gen:45 in cli                    │
│                                                                                                  │
│    42 │   │   raise RuntimeError(f"Directory {json_dir} already exists")                         │
│    43 │                                                                                          │
│    44 │   if len(host) < 2:                                                                      │
│ ❱  45 │   │   raise RuntimeError("More than one host must be tested!")                           │
│    46 │                                                                                          │
│    47 │   console.log('Loading listrev config generator')                                        │
│    48 │   from listrev import listrevapp_gen                                                     │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: More than one host must be tested!
(dbt-pyvenv) [biery@mu2edaq13 rundir]$ 
(dbt-pyvenv) [biery@mu2edaq13 rundir]$ 
(dbt-pyvenv) [biery@mu2edaq13 rundir]$ 
(dbt-pyvenv) [biery@mu2edaq13 rundir]$ commtest_gen --host localhost --host localhost localhost_commtest
[11:19:23] Loading listrev config generator                                                                               commtest_gen:47
[11:19:23] Adding endpoint to_reverse, app listrev_app_rv_localhost, direction Direction.IN                             conf_utils.py:202
           Adding endpoint reversed, app listrev_app_rv_localhost, direction Direction.OUT                              conf_utils.py:202
           Adding endpoint original, app listrev_app_rv_localhost, direction Direction.IN                               conf_utils.py:202
           Adding endpoint reversed, app listrev_app_rv_localhost, direction Direction.IN                               conf_utils.py:202
           Adding endpoint original, app listrev_app_g_1_localhost, direction Direction.OUT                             conf_utils.py:202
           Adding endpoint to_reverse, app listrev_app_g_1_localhost, direction Direction.OUT                           conf_utils.py:202
           dict_items([('to_reverse', [{'app': 'listrev_app_rv_localhost', 'endpoint': to_reverse/lr.input}, {'app':    conf_utils.py:205
           'listrev_app_g_1_localhost', 'endpoint': to_reverse/rdlg.q2}]), ('reversed', [{'app':                                         
           'listrev_app_rv_localhost', 'endpoint': reversed/lr.output}, {'app': 'listrev_app_rv_localhost', 'endpoint':                  
           reversed/lrv.reversed_data_input}]), ('original', [{'app': 'listrev_app_rv_localhost', 'endpoint':                            
           original/lrv.original_data_input}, {'app': 'listrev_app_g_1_localhost', 'endpoint': original/rdlg.q1}])])                     
           Processing to_reverse with defined endpoints [{'app': 'listrev_app_rv_localhost', 'endpoint':                conf_utils.py:207
           to_reverse/lr.input}, {'app': 'listrev_app_g_1_localhost', 'endpoint': to_reverse/rdlg.q2}]                                   
           Direction is Direction.IN                                                                                    conf_utils.py:218
           Direction is Direction.OUT                                                                                   conf_utils.py:218
           Connection to_reverse, Network                                                                               conf_utils.py:159
           Processing reversed with defined endpoints [{'app': 'listrev_app_rv_localhost', 'endpoint':                  conf_utils.py:207
           reversed/lr.output}, {'app': 'listrev_app_rv_localhost', 'endpoint': reversed/lrv.reversed_data_input}]                       
           Direction is Direction.OUT                                                                                   conf_utils.py:218
           Direction is Direction.IN                                                                                    conf_utils.py:218
           Connection reversed, SPSC Queue                                                                              conf_utils.py:151
           Processing original with defined endpoints [{'app': 'listrev_app_rv_localhost', 'endpoint':                  conf_utils.py:207
           original/lrv.original_data_input}, {'app': 'listrev_app_g_1_localhost', 'endpoint': original/rdlg.q1}]                        
           Direction is Direction.IN                                                                                    conf_utils.py:218
           Direction is Direction.OUT                                                                                   conf_utils.py:218
           Connection original, Network                                                                                 conf_utils.py:159
           module, name= lr, input, endpoint.external_name=to_reverse, endpoint.direction=Direction.IN                  conf_utils.py:293
           module, name= lr, output, endpoint.external_name=reversed, endpoint.direction=Direction.OUT                  conf_utils.py:293
           module, name= lrv, original_data_input, endpoint.external_name=original, endpoint.direction=Direction.IN     conf_utils.py:293
           module, name= lrv, reversed_data_input, endpoint.external_name=reversed, endpoint.direction=Direction.IN     conf_utils.py:293
           module, name= rdlg, q1, endpoint.external_name=original, endpoint.direction=Direction.OUT                    conf_utils.py:293
           module, name= rdlg, q2, endpoint.external_name=to_reverse, endpoint.direction=Direction.OUT                  conf_utils.py:293
           Generating system init command                                                                               conf_utils.py:587
           Generating system conf command                                                                               conf_utils.py:587
           Generating system start command                                                                              conf_utils.py:587
           Generating system stop command                                                                               conf_utils.py:587
           Generating system pause command                                                                              conf_utils.py:587
           Generating system resume command                                                                             conf_utils.py:587
           Generating system scrap command                                                                              conf_utils.py:587
           Generating boot json file                                                                                    conf_utils.py:601
────────────────────────────────────────────────────────── JSON file creation ───────────────────────────────────────────────────────────
           make_app_json for app listrev_app_rv_localhost                                                               conf_utils.py:572
           make_app_json for app listrev_app_g_1_localhost                                                              conf_utils.py:572
           System configuration generated in directory 'localhost_commtest'                                             conf_utils.py:627
           Listrev app config generated in localhost_commtest                                                            commtest_gen:109
[11:19:23] Generating metadata file                                                                                        metadata.py:10
(dbt-pyvenv) [biery@mu2edaq13 rundir]$ nanorc localhost_commtest boot init conf start 101 wait 5 stop scrap terminate
╭──────────────────────────────────────────────────────────────────────────╮
│                              Shonky NanoRC                               │
│  This is an admittedly shonky nano RC to control DUNE-DAQ applications.  │
│    Give it a command and it will do your biddings,                       │
│    but trust it and it will betray you!                                  │
│  Use it with care, user!                                                 │
╰──────────────────────────────────────────────────────────────────────────╯
FSM available states: ['none', 'booted', 'initialised', 'configured', 'running', 'paused', 'error']
FSM available transitions: {'pause', 'start', 'resume', 'terminate', 'stop', 'scrap', 'conf', 'boot', 'init'}
[11:20:46] INFO     Using partition: "global"                                                                               cfgmgr.py:169
Extra commands are []
           INFO     Using filelogbook                                                                                          core.py:85
Running on the apparatus localhost_commtest:
╭────────────────────────╮
│ localhost_commtest     │
│ └── localhost_commtest │
╰────────────────────────╯
           INFO     localhost_commtest received command 'boot'                                                         statefulnode.py:94
           INFO     Propagating to children nodes in the figured out order: ['localhost_commtest']                    statefulnode.py:102
           INFO     Subsystem localhost_commtest is booting                                                                   node.py:117
  # apps started            ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:01
  listrev_app_g_1_localhost ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:01
  listrev_app_rv_localhost  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:01
[11:20:47] INFO     ResponseListener Flask lives on PID: 133178                                                             appctrl.py:70
           INFO     Application listrev_app_g_1_localhost booted                                                               node.py:38
           INFO     Application listrev_app_rv_localhost booted                                                                node.py:38
                                       localhost_commtest apps                                        
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ name                              ┃ state          ┃ host      ┃ pings ┃ last cmd ┃ last succ. cmd ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ localhost_commtest                │ booted         │           │       │          │                │
│ └── localhost_commtest            │ booted         │           │       │          │                │
│     ├── listrev_app_g_1_localhost │ booted - alive │ mu2edaq13 │ True  │ None     │ None           │
│     └── listrev_app_rv_localhost  │ booted - alive │ mu2edaq13 │ True  │ None     │ None           │
└───────────────────────────────────┴────────────────┴───────────┴───────┴──────────┴────────────────┘
           INFO     localhost_commtest received command 'init'                                                         statefulnode.py:94
           INFO     Propagating to children nodes in the figured out order: ['localhost_commtest']                    statefulnode.py:102
           INFO     Sending init to the subsystem localhost_commtest                                                          node.py:212
           INFO     Sending init to listrev_app_g_1_localhost (http://mu2edaq13:3334)                                      appctrl.py:234
           INFO     Ack: <Response [202]>                                                                                  appctrl.py:242
           INFO     Sending init to listrev_app_rv_localhost (http://mu2edaq13:3333)                                       appctrl.py:234
           INFO     Ack: <Response [202]>                                                                                  appctrl.py:242
[11:20:48] INFO     Received reply from listrev_app_g_1_localhost to init                                                  appctrl.py:261
           ERROR    while trying to "init" "listrev_app_g_1_localhost"                                                 statefulnode.py:89
           INFO     Received reply from listrev_app_rv_localhost to init                                                   appctrl.py:261
           ERROR    while trying to "init" "listrev_app_rv_localhost"                                                  statefulnode.py:89
           ERROR    while trying to "init" "localhost_commtest"                                                        statefulnode.py:89
           ERROR    while trying to "init" "localhost_commtest"                                                        statefulnode.py:89
           WARNING  NanoRC context cleanup: Terminating RC before exiting                                                      cli.py:256
           INFO     ResponseListener: Flask joined                                                                         appctrl.py:105
           INFO     listrev_app_g_1_localhost process exited with exit code 255                                              sshpm.py:147
           INFO     listrev_app_rv_localhost process exited with exit code 255                                               sshpm.py:147
(dbt-pyvenv) [biery@mu2edaq13 rundir]$ egrep 'ERROR|WARNING' log*
log_listrev_app_g_1_localhost_3334.txt:2022-Apr-28 11:20:47,989 ERROR [std::vector<std::__cxx11::basic_string<char> > dunedaq::utilities::get_ips_from_hostname(std::__cxx11::string) at /home/biery/dunedaq/28AprIOMgrTest/sourcecode/utilities/src/Resolver.cpp:23] The hostname {host_listrev_app_rv_localhost} could not be resolved: Name or service not known
log_listrev_app_g_1_localhost_3334.txt:2022-Apr-28 11:20:47,991 ERROR [void dunedaq::cmdlib::CommandFacility::handle_command(const cmdobj_t&, dunedaq::cmdlib::cmd::CommandReply) at /__w/daq-release/daq-release/dev-c7/sourcecode/cmdlib/src/CommandFacility.cpp:64] Execution of command failed: Caught ers::Issue
log_listrev_app_g_1_localhost_3334.txt: was caused by: 2022-Apr-28 11:20:47,991 ERROR [virtual void dunedaq::listrev::RandomDataListGenerator::init(const json&) at /home/biery/dunedaq/28AprIOMgrTest/sourcecode/listrev/plugins/RandomDataListGenerator.cpp:64] The q1 queue was not successfully created. DAQModule: rdlg
log_listrev_app_g_1_localhost_3334.txt: was caused by: 2022-Apr-28 11:20:47,990 ERROR [virtual void dunedaq::ipm::ZmqSenderImpl::connect_for_sends(const json&) at /home/biery/dunedaq/28AprIOMgrTest/sourcecode/ipm/plugins/ZmqSenderImpl.hpp:86] An exception occured while calling resolve connection_string on the ZMQ send socket: Unable to resolve connection_string (connection_string: tcp://{host_listrev_app_rv_localhost}:12347)
log_listrev_app_rv_localhost_3333.txt:2022-Apr-28 11:20:47,997 ERROR [std::vector<std::__cxx11::basic_string<char> > dunedaq::utilities::get_ips_from_hostname(std::__cxx11::string) at /home/biery/dunedaq/28AprIOMgrTest/sourcecode/utilities/src/Resolver.cpp:23] The hostname {host_listrev_app_rv_localhost} could not be resolved: Name or service not known
log_listrev_app_rv_localhost_3333.txt:2022-Apr-28 11:20:47,998 ERROR [void dunedaq::cmdlib::CommandFacility::handle_command(const cmdobj_t&, dunedaq::cmdlib::cmd::CommandReply) at /__w/daq-release/daq-release/dev-c7/sourcecode/cmdlib/src/CommandFacility.cpp:64] Execution of command failed: Caught ers::Issue
log_listrev_app_rv_localhost_3333.txt:  was caused by: 2022-Apr-28 11:20:47,998 ERROR [virtual void dunedaq::listrev::ListReverser::init(const json&) at /home/biery/dunedaq/28AprIOMgrTest/sourcecode/listrev/plugins/ListReverser.cpp:52] The input queue was not successfully created. DAQModule: lr
log_listrev_app_rv_localhost_3333.txt:  was caused by: 2022-Apr-28 11:20:47,998 ERROR [virtual void dunedaq::ipm::ZmqReceiver::connect_for_receives(const json&) at /home/biery/dunedaq/28AprIOMgrTest/sourcecode/ipm/plugins/ZmqReceiver.cpp:70] An exception occured while calling resolve connection_string on the ZMQ receive socket: Unable to resolve connection_string (connection_string: tcp://{host_listrev_app_rv_localhost}:12346)
eflumerf commented 2 years ago

I have updated the help text in 4d983da, and changed the default to two copies of localhost.
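
For readers following along, here is a minimal sketch of why a single-entry default trips the check shown in the traceback (`if len(host) < 2`), and why a two-entry default passes. It assumes the repeated `--host` option arrives as a tuple of strings; `check_hosts`, `old_default`, and `new_default` are hypothetical names used only for illustration and are not taken from commtest_gen or from commit 4d983da.

```python
# Sketch only: mirrors the length check visible in the traceback
# (commtest_gen lines 44-45). `check_hosts` is a hypothetical helper,
# not a function that exists in listrev.

def check_hosts(host):
    """Return the hosts as a list, or raise if fewer than two were given."""
    if len(host) < 2:
        raise RuntimeError("More than one host must be tested!")
    return list(host)

# Assumption: the original default amounted to a single "localhost" entry.
old_default = ("localhost",)
# The default described in the comment above: two copies of localhost.
new_default = ("localhost", "localhost")

try:
    check_hosts(old_default)
except RuntimeError as err:
    print(f"single-entry default rejected: {err}")

print(check_hosts(new_default))
```

With a two-entry default, running the generator without any `--host` option would satisfy the same check that an explicit `--host localhost --host localhost` does.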

I believe the run-time errors you experienced could have been caused by not having the feature/iomanager version of NanoRC; please check and confirm.
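
The logs above show connection strings such as `tcp://{host_listrev_app_rv_localhost}:12347` reaching the resolver with the braces still in place, which is consistent with a placeholder that was never substituted before the applications received their configuration. The following is a rough sketch of that kind of substitution and of a check for leftovers. It is not NanoRC's actual code: the helper names, the regular expression, and the `host_map` contents are all assumptions made for illustration.

```python
import re

# Placeholder syntax copied from the log lines; everything else here
# (helper names, host_map) is hypothetical and for illustration only.
PLACEHOLDER = re.compile(r"\{(host_[A-Za-z0-9_]+)\}")

def substitute_hosts(connection_string, host_map):
    """Replace each {host_<app>} placeholder found in host_map.

    Placeholders with no entry in host_map are left untouched.
    """
    return PLACEHOLDER.sub(
        lambda m: host_map.get(m.group(1), m.group(0)),
        connection_string,
    )

def unresolved_placeholders(connection_string):
    """Return the names of any placeholders still present in the string."""
    return PLACEHOLDER.findall(connection_string)

raw = "tcp://{host_listrev_app_rv_localhost}:12347"
host_map = {"host_listrev_app_rv_localhost": "localhost"}  # assumed mapping

resolved = substitute_hosts(raw, host_map)
print(resolved)                           # tcp://localhost:12347
print(unresolved_placeholders(raw))       # ['host_listrev_app_rv_localhost']
print(unresolved_placeholders(resolved))  # []
```

A check like `unresolved_placeholders` run on a generated configuration before `init` would flag this condition earlier than the ZMQ resolver error does.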

bieryAtFnal commented 2 years ago

Thanks for the help text updates.

Yes, I was using the wrong version of nanorc, but that was what I got by following the existing software area instructions. I have found that one needs to run dbt-workarea-env before pip install -U ./nanorc for the pip install to work correctly. May I try modifying the instructions to see if that helps? (In related news, it looks like a git merge in nanorc from develop into feature/iomanager could be useful.)

eflumerf commented 2 years ago

Please and thank you! I have updated the feature/iomanager branch of nanorc

bieryAtFnal commented 2 years ago

Done