Superskyyy / sw-community

Using CHAOSS + Apache Kibble to analyze the Apache SkyWalking ecosystem
Apache License 2.0

SkyWalking Community Analysis

TODO List:

  1. Project config generator
  2. More organization domains
  3. Custom dashboard for missing visualizations

Introduction

A collection of settings and scripts that lets SkyWalking stakeholders set up CHAOSS GrimoireLab in seconds.

Collects Git and GitHub data from 30 SkyWalking ecosystem projects (please open a PR if more projects come in).

Apache Official                   SkyAPM                         Others
skywalking                        SkyAPM-dotnet                  sourceplusplus-SourceMarker
skywalking-rocketbot-ui           SkyAPM-go2sky
skywalking-website                SkyAPM-go2sky-plugins
skywalking-nginx-lua              SkyAPM-php-sdk
skywalking-kong                   SkyAPM-cpp2sky
skywalking-python                 SkyAPM-java-plugin-extensions
skywalking-nodejs                 SkyAPM-uranus
skywalking-client-js              SkyAPM-nodejs
skywalking-rust                   SkyAPM-mini-program-monitor
skywalking-satellite
skywalking-cli
skywalking-kubernetes
skywalking-swck
skywalking-docker
skywalking-data-collect-protocol
skywalking-query-protocol
skywalking-goapi
skywalking-agent-test-tool
skywalking-infra-e2e
skywalking-eyes
skywalking-java
skywalking-banyandb
skywalking-showcase
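A project config for the repositories listed above could be generated rather than written by hand. The following is a minimal sketch, not the script shipped in this repository: the function name `build_projects` is hypothetical, the output layout (a project name mapping to "git" and "github" URL lists) is an assumption about what GrimoireLab expects in projects.json, and the owner/repo slugs in the example are illustrative and must be checked against the real repository locations.

```python
import json


def build_projects(repos_by_group):
    """Build a projects.json-style mapping (assumed layout).

    repos_by_group maps a project name to a list of "owner/repo" slugs.
    Each project gets a "git" list and a "github" list of repository URLs.
    """
    projects = {}
    for group, slugs in repos_by_group.items():
        urls = [f"https://github.com/{slug}" for slug in slugs]
        projects[group] = {
            "git": list(urls),     # used for commit log collection
            "github": list(urls),  # used for issue and PR collection
        }
    return projects


# Illustrative input only; slugs are assumptions, not a verified list.
example = build_projects(
    {"Apache Official": ["apache/skywalking", "apache/skywalking-python"]}
)
print(json.dumps(example, indent=4))
```

The printed JSON can be saved as projects.json once every slug has been verified.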

Identified data sources

For the SkyWalking ecosystem, the major data sources are:

  1. Git for commit logs - timezone, code, developer
  2. GitHub for
    • GitHub repo metadata - star, fork, watcher
    • GitHub PRs - efficiency, timing, backlog
    • GitHub Issues - efficiency, backlog, timing
    • GitHub comments - collaboration network, count
    • GitHub events - issue labels
  3. Mailing list (RSS) - sentiment analysis, count
  4. Chat channels (Slack, QQ) - NLP tasks, count
  5. Social media - Twitter - social impact
  6. Q&A platforms - StackExchange - user generated data

From the above data sources, we can conduct quantitative and qualitative analysis.

CHAOSS Introduction

CHAOSS Metrics

GrimoireLab

Note that the affiliation analysis is based on email domains, so it wrongly identifies anyone with a qq.com or vip.qq.com address as a Tencent employee, even though these are public mailbox domains.

The provided version therefore removes the following chunk from organizations.json in default-grimoirelab-settings:

"Tencent": [
    {
        "domain": "qq.com",
        "is_top": true
    },
    {
        "domain": "vip.qq.com",
        "is_top": true
    }
],
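The same cleanup could be scripted so it can be repeated after upgrading the default settings. This is a minimal sketch under assumptions: `drop_domains` is a hypothetical helper, it operates on a mapping of organization name to domain entries shaped like the chunk above, and where that mapping sits inside organizations.json (for example under a top-level key) is not covered here. "Example Org" is made-up sample data.

```python
def drop_domains(organizations, domains):
    """Return a copy of the name -> entries mapping without the given domains.

    Organizations left with no domain entries are dropped entirely.
    The input mapping is not modified.
    """
    cleaned = {}
    for name, entries in organizations.items():
        kept = [entry for entry in entries if entry.get("domain") not in domains]
        if kept:
            cleaned[name] = kept
    return cleaned


sample = {
    "Tencent": [
        {"domain": "qq.com", "is_top": True},
        {"domain": "vip.qq.com", "is_top": True},
    ],
    # Hypothetical organization, present only to show that others are kept.
    "Example Org": [{"domain": "example.com", "is_top": True}],
}

result = drop_domains(sample, {"qq.com", "vip.qq.com"})
print(result)
```

Because both Tencent entries are public mailbox domains, the whole "Tencent" key disappears from the result in this sample.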

Example screenshots from the dashboard:

[Screenshot: Kibana dashboard at localhost:5601 (1)]

[Screenshot: Kibana dashboard at localhost:5601 (2)]

Current Usage

Manual:

  1. Download the latest release from the GrimoireLab repository and unzip/untar it.

  2. Copy the projects.json from this repository to the default-grimoirelab-settings folder.

  3. You may wish to set more than one GitHub access token in setup.cfg, since the GitHub API enforces hourly rate limits.

  4. Increase vm.max_map_count for Elasticsearch before bringing up the ES container; otherwise the container will fail to start.

WSL2 on a Windows machine

wsl ^
sudo sysctl -w vm.max_map_count=262144

Linux

sudo sysctl -w vm.max_map_count=262144

macOS

screen ~/Library/Containers/com.docker.docker/Data/vms/0/tty
# then, in the session that opens, run:
sysctl -w vm.max_map_count=262144

  5. Start the Docker containers:

docker-compose up -d

  6. After the initialization step, go to localhost:5601 to view the Kibana dashboard of GrimoireLab.

Note that data will take a long time to appear if your access to GitHub is slow or unstable. Try FastGitHub to accelerate the process; otherwise it could take hours before the projects are fully collected.
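For step 4 of the manual setup, it can help to confirm the kernel setting before starting the containers. The sketch below assumes a Linux host (or a WSL2 shell) where the value is readable at /proc/sys/vm/max_map_count; the helper names are hypothetical, and the threshold is simply the value used in the sysctl commands above.

```python
from pathlib import Path

REQUIRED_MAX_MAP_COUNT = 262144  # value used in the sysctl commands above


def max_map_count_ok(raw_value, required=REQUIRED_MAX_MAP_COUNT):
    """Return True if the raw file content holds a value >= required."""
    return int(raw_value.strip()) >= required


def check_host(proc_path="/proc/sys/vm/max_map_count"):
    """Check the current host; return None if the file is not available."""
    path = Path(proc_path)
    if not path.exists():
        return None  # not Linux, or /proc is not mounted
    return max_map_count_ok(path.read_text())


status = check_host()
if status is None:
    print("vm.max_map_count not readable on this host")
elif status:
    print("vm.max_map_count is high enough for Elasticsearch")
else:
    print("vm.max_map_count is too low; run the sysctl command first")
```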