X-lab2017 / open-digger

Open source analysis tools
https://open-digger.cn
Apache License 2.0

[Data] Add first batch of big data repos. #1050

Closed frank-zsy closed 2 years ago

frank-zsy commented 2 years ago

Related to #1047

I want to add some labeled data to OpenDigger to help with our community analysis.

These repos are used to generate the big data report: https://github.com/X-lab2017/open-digger/blob/master/cooperations/%E5%BC%80%E6%BA%90%E5%A4%A7%E6%95%B0%E6%8D%AE%E7%83%AD%E5%8A%9B%E6%8A%A5%E5%91%8A2022.pdf

Label: Big Data

Type: Tech-0

Here I use Tech-0, which denotes a top-level technical area label. Under Big Data, we may later have several Tech-1 labels such as streaming or visualization.
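To make the Tech-0 / Tech-1 relationship concrete, here is a minimal sketch of how such a label hierarchy could be modeled. The `Label` class and its fields are hypothetical and used only for illustration; they are not OpenDigger's actual label schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Label:
    """Hypothetical model of a technical-area label (not OpenDigger's schema)."""

    name: str
    level: int  # 0 for a Tech-0 label, 1 for a Tech-1 label, and so on
    children: list[Label] = field(default_factory=list)

    @property
    def type(self) -> str:
        # Render the level in the "Tech-N" form used in this issue.
        return f"Tech-{self.level}"

    def add_child(self, name: str) -> Label:
        # A child label always sits exactly one level below its parent.
        child = Label(name, self.level + 1)
        self.children.append(child)
        return child


# Big Data is the top-level area; streaming and visualization are the
# possible sub-areas mentioned above.
big_data = Label("Big Data", 0)
streaming = big_data.add_child("Streaming")
visualization = big_data.add_child("Visualization")
```

In this sketch a repo would be attached to the most specific label that applies, and the Tech-0 label would cover everything beneath it.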

Repos:

frank-zsy commented 2 years ago

/parse-github-id

github-actions[bot] commented 2 years ago

Get repo and org/user ids done.

"Related to #1047


Repos:

- 943149 # repo:d3/d3
- 9185792 # repo:apache/echarts
- 15111821 # repo:grafana/grafana
- 39464018 # repo:apache/superset
- 17165658 # repo:apache/spark
- 30203935 # repo:metabase/metabase
- 33884891 # repo:apache/airflow
- 60246359 # repo:ClickHouse/ClickHouse
- 2211243 # repo:apache/kafka
- 13926404 # repo:getredash/redash
- 20587599 # repo:apache/flink
- 49876476 # repo:apache/shardingsphere
- 1385122 # repo:matplotlib/matplotlib
- 85095608 # repo:airbnb/visx
- 45646037 # repo:plotly/plotly.js
- 5349565 # repo:prestodb/presto
- 84240850 # repo:timescale/timescaledb
- 149026292 # repo:cube-js/cube.js
- 23418517 # repo:apache/hadoop
- 6358188 # repo:apache/druid
- 117965972 # repo:alibaba/DataX
- 62117812 # repo:apache/pulsar
- 160999 # repo:apache/zookeeper
- 59737212 # repo:antvis/G2
- 51905353 # repo:apache/arrow
- 33653601 # repo:jupyter/notebook
- 139199684 # repo:PrefectHQ/prefect
- 173335706 # repo:apache/dolphinscheduler
- 4704710 # repo:mwaskom/seaborn
- 206424 # repo:apache/cassandra
- 11496279 # repo:c3js/c3
- 123345344 # repo:keplergl/kepler.gl
- 50205233 # repo:debezium/debezium
- 283046497 # repo:airbytehq/airbyte
- 32848140 # repo:apache/zeppelin
- 335164964 # repo:dataease/dataease
- 50904245 # repo:apache/beam
- 327859577 # repo:juicedata/juicefs
- 46398090 # repo:datahub-project/datahub
- 166515022 # repo:trinodb/trino
- 334274271 # repo:opensearch-project/OpenSearch
- 138754790 # repo:duckdb/duckdb
- 99919302 # repo:apache/doris
- 110222380 # repo:alibaba/BizCharts
- 14135470 # repo:apache/storm
- 53548867 # repo:dbt-labs/dbt-core
- 131619646 # repo:dagster-io/dagster
- 20089857 # repo:apache/hbase
- 206317 # repo:apache/camel
- 206444 # repo:apache/hive
- 50229487 # repo:apache/lucene-solr
- 19961085 # repo:apache/pinot
- 39979936 # repo:nhn/tui.chart
- 156293506 # repo:microsoft/SandDance
- 43158694 # repo:apache/incubator-heron
- 99412308 # repo:apache/incubator-seatunnel
- 76474200 # repo:apache/hudi
- 28738447 # repo:apache/kylin
- 23653453 # repo:pachyderm/pachyderm
- 21193524 # repo:apache/calcite
- 27911088 # repo:apache/nifi
- 158256479 # repo:apache/iceberg
- 102447494 # repo:edp963/davinci
- 282994686 # repo:ververica/flink-cdc-connectors
- 206417 # repo:apache/couchdb
- 7276954 # repo:Alluxio/alluxio
- 198368711 # repo:apache/incubator-linkis
- 402945349 # repo:StarRocks/starrocks
- 204164353 # repo:kestra-io/kestra
- 149626591 # repo:uber/aresdb
- 206459 # repo:apache/avro
- 158975124 # repo:apache/iotdb
- 141376301 # repo:apache/incubator-hugegraph
- 182849188 # repo:delta-io/delta
- 3786237 # repo:hazelcast/hazelcast
- 96424863 # repo:TalkingData/inmap
- 44781140 # repo:greenplum-db/gpdb
- 5683653 # repo:apache/drill
- 2442457 # repo:apache/ambari
- 384111310 # repo:apache/incubator-devlake
- 20675636 # repo:apache/parquet-mr
- 202483348 # repo:apache/incubator-kvrocks
- 1575956 # repo:apache/bookkeeper
- 50647838 # repo:apache/kudu
- 41712332 # repo:apache/incubator-pegasus
- 159273440 # repo:shzlw/poli
- 358917318 # repo:apache/arrow-datafusion
- 62117818 # repo:apache/carbondata
- 98013453 # repo:apache/atlas
- 114619105 # repo:apache/incubator-kyuubi
- 9342529 # repo:crate/crate
- 20675635 # repo:apache/parquet-format
- 341631350 # repo:apache/lucene
- 20473418 # repo:apache/phoenix
- 2153096 # repo:apache/sqoop
- 231533573 # repo:apache/inlong
- 56128733 # repo:apache/impala
- 59475316 # repo:eventql/eventql
- 32199982 # repo:apache/samza
- 41952293 # repo:apache/hawq
- 206357 # repo:apache/pig
- 2383782 # repo:apache/oozie
- 22305416 # repo:apache/ranger
- 8357227 # repo:biolab/orange3
- 37276906 # repo:keen/explorer
- 212382406 # repo:apache/ozone
- 341374920 # repo:apache/solr
- 2198510 # repo:apache/flume
- 2155500 # repo:apache/bigtop
- 9290699 # repo:apache/tez
- 54937496 # repo:PatMartin/Dex
- 25507371 # repo:apache/hadoop-hdfs
- 17310686 # repo:apache/knox
- 507775 # repo:elastic/elasticsearch
- 7833168 # repo:elastic/kibana
- 188779637 # repo:apache/incubator-streampark

"
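For anyone who wants to consume the list above programmatically, the following is a rough sketch of parsing lines of the form `- <id> # repo:<owner>/<name>` into a mapping from numeric repo ID to repo name. The function name `parse_repo_lines` and the regular expression are assumptions made for this example; they are not part of OpenDigger or of the `/parse-github-id` action.

```python
import re

# Matches lines such as "- 943149 # repo:d3/d3" (format taken from the list above).
LINE_RE = re.compile(r"^-\s*(\d+)\s*#\s*repo:(\S+/\S+)\s*$")


def parse_repo_lines(text):
    """Return {repo_id: "owner/name"} for every matching line in text."""
    repos = {}
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match:
            repos[int(match.group(1))] = match.group(2)
    return repos


# A three-line excerpt copied from the list in this issue.
sample = """\
- 943149 # repo:d3/d3
- 9185792 # repo:apache/echarts
- 17165658 # repo:apache/spark
"""
repos = parse_repo_lines(sample)
```

Keying on the numeric ID rather than the name is a deliberate choice in this sketch: a GitHub repository keeps its ID when it is renamed or transferred, so the mapping stays valid even if a name in the trailing comment becomes outdated.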