MatrixAI / Polykey

Polykey Core Library
https://polykey.com
GNU General Public License v3.0
31 stars 4 forks source link

Nodes CLI integration #217

Closed joshuakarp closed 3 years ago

joshuakarp commented 3 years ago

Specification

Now that gestalt discovery and identities CLI has been merged in, nodes is the next domain that needs to be integrated into the CLI.

Additional context

Previous work on the nodes domain can be found in the following MRs:

Tasks

  1. Elicit relevant nodes commands
  2. Create proto and GRPC foundation
  3. Link with commands
  4. Tests
CMCDragonkai commented 3 years ago

With respect to #211. The tests for Nodes CLI and all other CLI tests except session-specific commands in agent subcommands should just use a fixture that creates the session, and then share that session token for all command tests that end up using the grpc commands calling into client service. So you should plan out a common fixture mechanism for this.

CMCDragonkai commented 3 years ago

This is done in client-refactoring but it is going to take time for me to fully review and create new issues for things to polish on. I'll post links.

CMCDragonkai commented 3 years ago

Post-merge review for subsequent fix (this will be updated and posted onto new issue/issues):

  1. Use DI for NodeGraph, this should allow us to maintain the try except style of start in all of our domains and the started.
  2. Use a lock mutex in NodeManager, there are multiple in-memory states that you are managing that may need to be kept consistent. Either way it will future proof future enhancements of NodeManager, also make sure you're doing this against the other stateful classes.
  3. Networking errors are strangely managed here. The networking domain works a bit differently from the rest of the other domains. At any point in time, an exception may be thrown if the connection has timed out. In the networking domain this is managed by exception handlers that are attached to the relevant connections. The NodeManager cannot expect to "catch" these networking exceptions on a single control flow branch. They must attach handlers to each NodeConnection. This means the "handling" of exceptions itself is a stateful live operation. You must then design and consider the NodeManager as a state machine managing multiple connection state machines. Ensure that all errors caught are network domain level errors, you shouldn't be catching naked UTP_ETIMEDOUT errors.
  4. Based on the naming standard of #218, the errors in src/bin/errors.ts are not properly written, they should be named in such a way that they can be grouped into the relevant domain commands. For example ErrorPingNodeFailed vs ErrorNodePingFailed.
  5. The src/bin/errors.ts should be kept pretty minimal. Most of the errors there should just be handled immediately in the relevant CLI code that uses the commander, I can see errors that relate to validation of the command line arguments for specific commands. They should not even be exceptions at all in this case. Only when the exception is general across all CLI commands should they go there. In particular ErrorPingNodeFailed does not make sense to be in src/bin/errors, that sounds like a nodes domain error. I believe errors coming from ErrorCLI should be "final errors" that is caught by the src/bin/polykey main function and given a message format of last resort.
  6. Remove all uses of console.log. They should be replaced with the logger if debug/info statements, and stdout output if they are legitimate output. Use ag console.log to find them. They are only meant to be used during development.
  7. The usage of GRPC functions nodesGetLocalDetails and nodesGetDetails should be refactored. Getting information about the current node should be composed of individual commands to each domain. There isn't a need for a "one stop shop" of getting node details. See comment about this: https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/198#note_635099080 This should propagate to the GUI.
  8. Any node that is added that turns out to be invalid, should be expunged from the any state in the nodes.
  9. The createMetadata function seems rather useless. It just does new grpc.Metadata(). Is that necessary to wrap it? It's not even being used by anything else. It should be removed.
  10. Don't combine promise combinators with await async syntax. This is better done with just a try and catch. This applies to src/nodes/NodeConnection.ts.
  11. Follow the "doc" standard. Only use // for inline comments. You can have block comments on top of classes and methods, and don't specify parameters, the typescript automatically takes care of them.
  12. Beware of the NotStarted exceptions. I sometimes forget to add them in. But technically they should exist on all the instance methods. This is mostly there to enforce state machine mechanics.
  13. Methods that return exceptions should actually be exceptional. Consider the function getRootCertChain, if it is expected that a NodeConnection represents a live connection, and therefore has a root certificate available, then it indeed an exception. If it is expected that there may be times where it doesn't have it, then it should be optionally return undefined and not an exception.
  14. The getRootCertChain should just use the getServerCertChain method provided by the GRPC client. You already have the GRPCClientClient object under this.client.
  15. Notice that I use ._x property when I want to expose the property to the external caller but not allow them to set it. This is combined with a getter get x() pattern. This is only needed when we cannot use the readable prefix which fixes it at construction and even prevents mutation in the methods. Combine with Readable type constructor when available.
CMCDragonkai commented 3 years ago

Regarding point 14:

Between agent to agent, tls is on the networking layer and the grpc is plaintext, but client (CLI & GUI) to agent, no networking layer is used, and therefore tls occurs on the grpc level. So I realised that therefore you ignore point 14.