4finance / micro-infra-spring

Repository containing default microservice infrastructure set up using Spring configuration
Apache License 2.0

Make it possible to draw diagrams of dependencies between services #79

Closed by marcingrzejszczak 9 years ago

marcingrzejszczak commented 10 years ago

We would have to visit each microservice registered in Zookeeper and check its collaborators, then repeat that recursively until we can draw the full dependency diagram.
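
The recursive walk described above could be sketched as a breadth-first traversal. This is only an illustration: `collect_dependencies`, `collaborators_of` and the `SAMPLE` data are hypothetical names, and the lookup function stands in for whatever actually queries Zookeeper and the services.

```python
from collections import deque
from typing import Callable, Iterable, Set, Tuple


def collect_dependencies(
    roots: Iterable[str],
    collaborators_of: Callable[[str], Iterable[str]],
) -> Set[Tuple[str, str]]:
    """Walk the collaborator graph breadth-first; return (caller, callee) edges."""
    edges: Set[Tuple[str, str]] = set()
    visited: Set[str] = set()
    queue = deque(roots)
    while queue:
        service = queue.popleft()
        if service in visited:
            continue  # already expanded; this also guards against cycles
        visited.add(service)
        for collaborator in collaborators_of(service):
            edges.add((service, collaborator))
            queue.append(collaborator)
    return edges


# Hypothetical sample data standing in for the real registry lookups.
SAMPLE = {
    "com/ofg/social-engine": ["com/ofg/twitter-places-analyzer"],
    "com/ofg/twitter-places-analyzer": ["com/ofg/twitter-places-collerator"],
    "com/ofg/twitter-places-collerator": [],
}

if __name__ == "__main__":
    found = collect_dependencies(["com/ofg/social-engine"], lambda s: SAMPLE.get(s, []))
    for caller, callee in sorted(found):
        print(f"{caller} -> {callee}")
```

The visited set matters: two services that call each other would otherwise keep the walk going forever.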

nurkiewicz commented 10 years ago

Ideas:

  1. Connect to Zookeeper in order to find all microservices in the environment (logical names and all instance URLs)
  2. Call /collaborators for each microservice to analyze availability and connectivity. We will get rich information, such as which services are up and which connections are working. Bonus: also check the health endpoint.
  3. Produce a JSON representation of the dependency graph and overall system status from the information above. Either each microservice exposes this data, or a specialized microservice does (?)
  4. Implement a simple JavaScript dashboard for drawing dependency graph, e.g. in D3.js. See: demo.
  5. JavaScript dashboard should update graph in real time.
  6. In JavaScript we can choose what type of information is shown (uptime, latencies, etc.)
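
Steps 1 to 3 above might look roughly like the sketch below. Everything here is an assumption made for illustration: the `build_system_status` name, the injected `fetch_collaborators` callable (standing in for an HTTP call to /collaborators that returns `None` when the instance cannot be reached), and the shape of the resulting JSON.

```python
import json
from typing import Callable, Dict, List, Optional

# Maps a collaborator's service path to {instance URL: "UP" or "DOWN"}.
Collaborators = Dict[str, Dict[str, str]]


def build_system_status(
    instances: Dict[str, List[str]],
    fetch_collaborators: Callable[[str], Optional[Collaborators]],
) -> dict:
    """Combine per-instance collaborator reports into one JSON-ready structure."""
    result: dict = {}
    for service, urls in instances.items():
        result[service] = {}
        for url in urls:
            collaborators = fetch_collaborators(url)
            result[service][url] = {
                # An instance that does not answer is reported as DOWN.
                "status": "DOWN" if collaborators is None else "UP",
                "collaborators": collaborators or {},
            }
    return result


if __name__ == "__main__":
    # Hypothetical registry content and canned responses instead of real HTTP calls.
    registry = {
        "analyzer": ["http://localhost:8095"],
        "collector": ["http://localhost:8096"],
    }
    canned = {
        "http://localhost:8095": {"collector": {"http://localhost:8096": "DOWN"}},
    }
    print(json.dumps(build_system_status(registry, canned.get), indent=2))
```

Passing the fetch function in as a parameter keeps the aggregation logic testable without a running Zookeeper or any live services.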

nurkiewicz commented 9 years ago

Currently open PR https://github.com/4finance/micro-infra-spring/pull/143 introduces several improvements:

  1. /collaborators includes all instances (URLs) of my collaborators, not just a random one chosen by Curator:

    curl -qs localhost:8095/collaborators | python -mjson.tool
    {
       "com/ofg/twitter-places-collerator": {
           "http://127.0.1.1:8096": "DOWN",
           "http://127.0.1.1:8097": "UP"
       }
    }

  2. The /collaborators/all endpoint aggregates /collaborators responses from all instances of all microservices:

    curl -qs localhost:8095/collaborators/all | python -mjson.tool
    {
       "com/ofg/twitter-places-analyzer": {
           "http://127.0.1.1:8095": {
               "collaborators": {
                   "com/ofg/twitter-places-collerator": {
                       "http://127.0.1.1:8096": "DOWN",
                       "http://127.0.1.1:8097": "UP"
                   }
               },
               "status": "UP"
           }
       },
       "com/ofg/twitter-places-collerator": {
           "http://127.0.1.1:8096": {
               "collaborators": {},
               "status": "DOWN"
           },
           "http://127.0.1.1:8097": {
               "collaborators": {},
               "status": "UP"
           }
       },
       "com/ofg/social-engine": {
           "http://127.0.1.1:8098": {
               "collaborators": {
                   "com/ofg/twitter-places-analyzer": {
                       "http://127.0.1.1:8095": "UP"
                   }
               },
               "status": "UP"
           }
       }
    }

  3. /collaborators/view.html is a simple JavaScript view of the data returned from /collaborators/all:

    Graph image

    A red arrow means that connectivity between the two given services is broken. A red circle represents an unresponsive service. Note that a microservice can be working (green) while some connections to it are broken. The opposite would be quite unusual.

  4. The API (especially ServiceResolver) was refined to always use the service path (e.g. com/ofg/twitter-places-collerator) rather than aliases (collerator), which are local to each microservice and should never escape.
  5. Because the changes to /collaborators aren't backward compatible, /collaborators/all has a fallback mode to work with older clients. If clients are so old that they don't support /collaborators at all (pre-0.5.3), a meta-fallback is used that calls just /ping.
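
The colouring rules described for /collaborators/view.html (item 3) can be sketched as a transformation of the /collaborators/all response into nodes and links. This is not the code from the PR: the `to_graph` name, the `nodes`/`links` output shape, and the choice to draw one node per instance rather than per service are all assumptions made for illustration.

```python
from typing import Dict, List


def to_graph(all_collaborators: dict) -> Dict[str, List[dict]]:
    """Turn a /collaborators/all style response into coloured nodes and links."""
    nodes: List[dict] = []
    links: List[dict] = []
    for service, instances in all_collaborators.items():
        for url, info in instances.items():
            # Red circle: the instance itself did not respond.
            colour = "green" if info["status"] == "UP" else "red"
            nodes.append({"id": url, "service": service, "color": colour})
            for targets in info["collaborators"].values():
                for target_url, status in targets.items():
                    # Red arrow: this instance cannot reach that collaborator.
                    link_colour = "green" if status == "UP" else "red"
                    links.append({"source": url, "target": target_url, "color": link_colour})
    return {"nodes": nodes, "links": links}


# The sample response shown in item 2, trimmed to two services.
SAMPLE_RESPONSE = {
    "com/ofg/twitter-places-analyzer": {
        "http://127.0.1.1:8095": {
            "collaborators": {
                "com/ofg/twitter-places-collerator": {
                    "http://127.0.1.1:8096": "DOWN",
                    "http://127.0.1.1:8097": "UP",
                }
            },
            "status": "UP",
        }
    },
    "com/ofg/twitter-places-collerator": {
        "http://127.0.1.1:8096": {"collaborators": {}, "status": "DOWN"},
        "http://127.0.1.1:8097": {"collaborators": {}, "status": "UP"},
    },
}

if __name__ == "__main__":
    graph = to_graph(SAMPLE_RESPONSE)
    for link in graph["links"]:
        print(link["source"], "->", link["target"], link["color"])
```

A nodes-and-links structure like this is a common input for force-directed graph drawings in D3.js, which is why it is used here.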

I will put the description above in the Wiki once this is merged.

Areas to improve:

  1. Use Hystrix to parallelize the /collaborators invocations for each instance (calling /collaborators/all can currently take a significant amount of time, as it traverses instances one by one). Hystrix will also apply a timeout automatically. Make sure each microservice uses a separate thread pool.
  2. Automatically refresh the JavaScript view when services come and go. Brute force: periodic refresh. Elegant: listening for ZK changes and pushing data from server to client.
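
The first improvement (parallel calls with a timeout) can be illustrated without Hystrix. The sketch below swaps in Python's standard `concurrent.futures` thread pool purely to show the idea. `fetch_all` and its parameters are hypothetical, and unlike the suggestion above it uses a single shared pool rather than one pool per microservice.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], dict],
    timeout_seconds: float = 1.0,
    max_workers: int = 8,
) -> Dict[str, Optional[dict]]:
    """Call fetch(url) for every URL in parallel; None marks a failure or timeout."""
    results: Dict[str, Optional[dict]] = {}
    pool = ThreadPoolExecutor(max_workers=max_workers)
    try:
        futures = {url: pool.submit(fetch, url) for url in urls}
        # One shared deadline, so the whole batch takes at most about timeout_seconds.
        deadline = time.monotonic() + timeout_seconds
        for url, future in futures.items():
            remaining = max(0.0, deadline - time.monotonic())
            try:
                results[url] = future.result(timeout=remaining)
            except Exception:
                # Covers both a timed-out wait and an error raised by fetch itself.
                results[url] = None
    finally:
        # Do not block on stragglers; timed-out calls finish in the background.
        pool.shutdown(wait=False)
    return results
```

With sequential calls the total time grows with the number of instances. Here it is bounded by the shared deadline, provided the pool has enough workers to start every call at once.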

mchmielarz commented 9 years ago

+1

marcingrzejszczak commented 9 years ago

Definitely +1 :)

nurkiewicz commented 9 years ago

I'm looking for a review, not applause :-).

marcingrzejszczak commented 9 years ago

Don't flatter yourself :P I think that those +1 are written after a review and not before it ;)