Seagate / cortx-hare

CORTX Hare configures Motr object store, starts/stops Motr services, and notifies Motr of service and device faults.
https://github.com/Seagate/cortx
Apache License 2.0

Problem: collection of Hare forensics is not automated #801

Closed vvv closed 4 years ago

vvv commented 4 years ago

Solution: add hctl reportbug command that will collect the data required to investigate Hare failures.

Data to collect:

The data collected with hctl reportbug may overlap with what Mero's m0reportbug collects; let us not worry about that for now.

Related issue: #781

azheregelya commented 4 years ago

closed via commit 5609219859ac587355aef6cb986580227c1b5e2f

d7db9192-e4d5-44d3-9604-8f1dd29777a2 commented 4 years ago

closed via merge request #812

typeundefined commented 4 years ago

@vvv

To make hctl reportbug command, we only need to

  • create utils/hare-reportbug script with # :help: <help message>;
  • [optionally] update hctl's help message.
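
For illustration only, here is a minimal Python sketch of how a dispatcher could find `utils/hare-*` scripts and pull out their `# :help:` text. It is not Hare's actual hctl; the function names, the regular expression, and the exact shape of the marker line are assumptions.

```python
#!/usr/bin/env python3
"""Hypothetical sketch: list subcommands found as utils/hare-* scripts."""
import os
import re
import tempfile

# Assumed marker format: a comment line `# :help: <help message>`.
HELP_RE = re.compile(r'^#\s*:help:\s*(.*)$')


def read_help(path):
    """Return the text of the first `# :help:` line in `path`, or ''."""
    with open(path, encoding='utf-8', errors='replace') as f:
        for line in f:
            m = HELP_RE.match(line.rstrip('\n'))
            if m:
                return m.group(1).strip()
    return ''


def discover(utils_dir, prefix='hare-'):
    """Map subcommand name -> help message for every `<prefix>*` file."""
    commands = {}
    for name in sorted(os.listdir(utils_dir)):
        path = os.path.join(utils_dir, name)
        if name.startswith(prefix) and os.path.isfile(path):
            commands[name[len(prefix):]] = read_help(path)
    return commands


if __name__ == '__main__':
    # Demo on a throwaway directory instead of a real Hare checkout.
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, 'hare-reportbug'), 'w') as f:
            f.write('#!/bin/bash\n# :help: collect forensic data\n')
        for cmd, text in discover(d).items():
            print(f'{cmd:12} {text}')
```

With this kind of lookup, adding a subcommand stays a matter of dropping one more script into the directory.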

Addition:

vvv commented 4 years ago

mentioned in merge request #812

azheregelya commented 4 years ago

MR with solution is ready for review: http://gitlab.mero.colo.seagate.com/mero/hare/merge_requests/491

vvv commented 4 years ago

@konstantin.nekrasov Or you mean that no fancy CLI subcommands parsing is needed at all? Hmm... :thinking_face:

Indeed. I see it now. @konstantin.nekrasov Thanks for disrupting my attempts at over-engineering! (And kudos to @dmitriy.chumak, who wrote that nice and simple hctl implementation!)

To make hctl reportbug command, we only need to

I'm sure @azheregelya has figured this out already. Sorry for the noise.

vvv commented 4 years ago

If we use argparse subparsers instead, all the hctl subcommands will be required to be written in Python only and there will be no such easy way to extend the CLI API.

Good point.

Though it's not necessary for hctl subcommands to be implemented in Python. They can call os.system().

```python
#!/usr/bin/env python

import argparse
import os
import sys


def parse_args(argv):
    p = argparse.ArgumentParser(
        description='Example program using subcommands.',
        usage='%(prog)s COMMAND [ARGUMENT]...')
    subs = p.add_subparsers(title='Supported commands')

    p_foo = subs.add_parser('foo', help='Prints foo.')
    p_foo.add_argument('--length', type=int, default=0, help='Length of foo.')
    p_foo.set_defaults(command=cmd_foo)

    p_bar = subs.add_parser('bar', help='Prints bar.')
    p_bar.add_argument('--capacity', type=int, help='Bar capacity.')
    p_bar.set_defaults(command=cmd_bar)

    return p.parse_args(argv)


def cmd_foo(args):
    os.system(f'set -x; echo foo length={args.length}')


def cmd_bar(args):
    os.system(f'set -x; echo bar capacity={args.capacity}')


def main(argv=None):
    args = parse_args(argv)
    args.command(args)


if __name__ == '__main__':
    sys.exit(main())
```

```
$ ./example.py --help
usage: example.py COMMAND [ARGUMENT]...

Example program using subcommands.

optional arguments:
  -h, --help  show this help message and exit

Supported commands:
  {foo,bar}
    foo       Prints foo.
    bar       Prints bar.
$ ./example.py foo --length 18
+ echo foo length=18
foo length=18
$ ./example.py bar
+ echo bar capacity=None
bar capacity=None
```


@konstantin.nekrasov To be honest, I doubt that a Bash implementation of the hctl CLI tool (the dispatcher that calls subcommand implementations) will be simpler than a Python implementation of the same.

AFAIK, the getopt command doesn't let one express a CLI with subcommands. I implemented CLIs with subcommands for Mero's and Halon's helper scripts. This is doable, but I didn't much enjoy the process or the resulting code. Bash doesn't go to great lengths to make the developer's experience pleasant in this regard. The command-line argument parsers provided by Python, Go, and Rust are much more developer-friendly in comparison.

But. If you would rather implement hctl in Bash — go ahead. It's your patch after all, not mine. :slightly_smiling_face:

typeundefined commented 4 years ago

@vvv

[Konstantin] rewrites hctl in Python;

Do we really need that piece of work? The current approach allows writing an extensible hctl interface just by following a script naming convention. If we use argparse subparsers instead, all the hctl subcommands will be required to be written in Python only and there will be no such easy way to extend the CLI API. I can imagine a hand-made framework that dynamically searches for all hctl subcommand implementations and registers them within hctl's subparsers, but I doubt we need it now. The Bash-written hctl is dumb but pretty easy.
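
To make the "hand-made framework" idea concrete, here is a rough sketch, not Hare code. It looks for `hare-*` files in a directory and forwards to them. Note the simplification: instead of real argparse subparsers it uses a plain positional with `choices` plus `argparse.REMAINDER`, so the scripts themselves can stay in any language. The function names, the `utils` default, and the `--flag` option in the demo are made up for the example.

```python
#!/usr/bin/env python3
"""Hypothetical sketch: argparse front end that forwards to external scripts."""
import argparse
import glob
import os
import subprocess


def find_plugins(utils_dir, prefix='hare-'):
    """Return {subcommand: script path} for files named `<prefix><subcommand>`."""
    paths = glob.glob(os.path.join(glob.escape(utils_dir), prefix + '*'))
    return {os.path.basename(p)[len(prefix):]: p
            for p in sorted(paths) if os.path.isfile(p)}


def build_parser(plugins):
    """Build a parser whose valid commands are exactly the discovered scripts."""
    p = argparse.ArgumentParser(
        prog='hctl', usage='%(prog)s COMMAND [ARGUMENT]...')
    p.add_argument('command', choices=sorted(plugins), metavar='COMMAND',
                   help='one of: ' + ', '.join(sorted(plugins)))
    # REMAINDER leaves the rest of the command line unparsed, so each
    # script keeps full control over its own options.
    p.add_argument('arguments', nargs=argparse.REMAINDER, metavar='ARGUMENT')
    return p


def main(argv=None, utils_dir='utils'):
    """Parse the command line and run the matching script."""
    plugins = find_plugins(utils_dir)
    args = build_parser(plugins).parse_args(argv)
    # The script may be Bash, Python, or anything else executable.
    return subprocess.call([plugins[args.command]] + args.arguments)


if __name__ == '__main__':
    # Demo with a made-up plugin table; nothing is executed here.
    demo = build_parser({'reportbug': 'utils/hare-reportbug'})
    print(demo.parse_args(['reportbug', '--flag', 'value']))
```

Whether this is worth having over the existing Bash dispatcher is exactly the open question in this thread; the sketch only shows that the naming convention and argparse are not mutually exclusive.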

vvv commented 4 years ago

It makes sense to use the hctl- prefix for the file names of hctl subcommand implementations. This would make it easy to discern utils/hctl-* scripts from the rest of utils/* and highlight their relation to the hctl CLI tool.

Compare with git-* scripts in the Git source repository.

cc @dmitriy.chumak

vvv commented 4 years ago

I propose that we distribute the work like this:

@konstantin.nekrasov I guess subcommands of argparse would be a good fit for hctl CLI. What do you think?

typeundefined commented 4 years ago

My proposal: allow some of the hctl plugins (utils/hare-* scripts) to be written in something other than Bash. In my particular case (see my branch) I'm going to:

  1. keep the hctl plugin as a whole Python module (i.e. something you can install into the current virtualenv)
  2. make setuptools generate the CLI wrapper for the module (similar to how we get hax executable in HaX, see setup.py file)
  3. add "proxy" script to utils/hare-* that actually forwards to the executable from [2].
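
As a sketch of step 3 only, and under stated assumptions: suppose setuptools has installed a wrapper executable (for instance through a `console_scripts` entry point) under the made-up name `hare-plugin-example`. A proxy could then look roughly like this. The target name and function names are hypothetical, and the proxy is written in Python here just to keep one language across the examples; a two-line shell script would do the same job.

```python
#!/usr/bin/env python3
"""Hypothetical utils/hare-<name> proxy: forward to a setuptools-made CLI."""
import os
import shutil
import sys

# Assumed name of the wrapper generated by setuptools; not a real
# Hare executable.
TARGET = 'hare-plugin-example'


def build_command(argv, target=TARGET, which=shutil.which):
    """Return the argv list to exec, or None if `target` is not on PATH."""
    exe = which(target)
    if exe is None:
        return None
    return [exe] + list(argv)


def main(argv=None):
    """Replace this process with the real executable, passing arguments on."""
    cmd = build_command(sys.argv[1:] if argv is None else argv)
    if cmd is None:
        print(f'{TARGET}: not found; is the virtualenv active?',
              file=sys.stderr)
        return 127
    # exec keeps the exit status and signal handling of the real program.
    os.execv(cmd[0], cmd)

# A real proxy script would finish with: sys.exit(main())
```

The lookup is split out into `build_command` so that it can be checked without actually replacing the running process.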

vvv commented 4 years ago

Solution: add hctl reportbug command that will collect the data required to investigate Hare failures.

@konstantin.nekrasov was going to extend hctl CLI; see #794. @azheregelya, you may want to contact Konstantin and agree on how to extend the CLI. (Could it be the right time to rewrite hctl in Python?)

vvv commented 4 years ago

changed the description

vvv commented 4 years ago

assigned to @azheregelya

vvv commented 4 years ago

I propose the following structure of an archive file (hare_<hostname>.tar.xz) collected at node <hostname>:

<hostname>/
├── build-ees-ha-args.yaml
├── cluster.sls
├── cluster.yaml
├── consul
│   ├── consul-agents.json
│   ├── consul-elect-rc-leader.log
│   ├── consul-kv.json
│   ├── consul-proto-rc.log
│   ├── consul-server-c1-conf.json
│   ├── consul-server-c2-conf.json
│   ├── consul-server-conf.json
│   └── consul-watch-service.log
├── ees-ha-csm-args.yaml
├── syslog.txt
└── systemctl-status.txt

1 directory, 14 files
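
A minimal sketch of the packing step, assuming the files have already been gathered into a working directory. The function name and the placeholder files in the demo are made up; only the archive name pattern and the top-level `<hostname>/` directory come from the layout above.

```python
#!/usr/bin/env python3
"""Hypothetical sketch: pack collected files into hare_<hostname>.tar.xz."""
import os
import socket
import tarfile
import tempfile


def make_archive(src_dir, dest_dir, hostname=None):
    """Pack `src_dir` as `<dest_dir>/hare_<hostname>.tar.xz`.

    Inside the archive everything sits under a top-level `<hostname>/`
    directory.
    """
    hostname = hostname or socket.gethostname()
    archive = os.path.join(dest_dir, f'hare_{hostname}.tar.xz')
    with tarfile.open(archive, 'w:xz') as tar:
        # arcname renames the root of the tree inside the archive.
        tar.add(src_dir, arcname=hostname)
    return archive


if __name__ == '__main__':
    # Demo with placeholder files in temporary directories.
    with tempfile.TemporaryDirectory() as work, \
         tempfile.TemporaryDirectory() as out:
        os.mkdir(os.path.join(work, 'consul'))
        for rel in ('cluster.yaml', 'syslog.txt', 'consul/consul-kv.json'):
            with open(os.path.join(work, rel), 'w') as f:
                f.write('placeholder\n')
        path = make_archive(work, out, hostname='node1')
        with tarfile.open(path) as tar:
            print('\n'.join(sorted(tar.getnames())))
```

Keeping the hostname as the top-level directory means archives from several nodes can be unpacked side by side without clobbering each other.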

vvv commented 4 years ago

files from the primary node

The “primary node” will contain /opt/seagate/eos-prvsnr/pillar/components/cluster.sls and /var/lib/hare/cluster.yaml.

vvv commented 4 years ago

changed title from Problem: {-the -}collection of Hare forensics is not automated to Problem: collection of Hare forensics is not automated

vvv commented 4 years ago

mentioned in issue #781

vvv commented 4 years ago

cc @azheregelya, @andriy.tkachuk