Giganto: Raw-Event Storage System for AICE

Giganto is a high-performance raw-event storage system designed specifically for AICE. It receives and stores raw events over QUIC channels and provides a flexible GraphQL API for querying the stored events, which lets AICE handle large-scale data processing and real-time analytics efficiently.

Features

Usage

You can run giganto by invoking the following command:

giganto <path to config file>

In the config file, you can specify the following options:

key = "key.pem"                            # path to private key file.
cert = "cert.pem"                          # path to certificate file.
root = "root.pem"                          # path to CA certificate file.
ingest_srv_addr = "0.0.0.0:38370"          # address to listen for ingest QUIC.
publish_srv_addr = "0.0.0.0:38371"         # address to listen for publish QUIC.
graphql_srv_addr = "127.0.0.1:8442"        # giganto's graphql address.
data_dir = "tests/data"                    # path to directory to store data.
retention = "100d"                         # retention period for data.
log_dir = "/data/logs/apps"                # path to giganto's syslog file.
export_dir = "tests/export"                # path to giganto's export file.
max_open_files = 8000                      # db options max open files.
max_mb_of_level_base = 512                 # db options max MB of rocksDB Level 1.
num_of_thread = 8                          # db options for background thread.
max_sub_compactions = 2                    # db options for sub-compaction.
ack_transmission = 1024                    # ack count for ingestion data.
addr_to_peers = "10.10.11.1:38383"          # address to listen for peers QUIC.
peers = [ { addr = "10.10.12.1:38383", hostname = "ai" } ]     # list of peer info.

By default, giganto reads the config file from the following directories:

For max_mb_of_level_base, the estimate assumes six levels, each ten times larger than the previous one. The last level then holds 100,000 times the capacity of the level base, which is about 90% of the total capacity, and the total is about 111,111 times the level base (1 + 10 + ... + 100,000). A value of about db_total_mb / 111111 is therefore appropriate: for example, 90 MB or less for a 10 TB database, or 900 MB or less for a 100 TB database.

These figures assume that the data fills every level up to level 6, so the appropriate value may differ if you want the data to grow further with the same level base. If the computed value is less than 512 MB, the default value of 512 MB is recommended.
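The rule of thumb above can be sketched as a small calculation. This is only an illustration: the function and constant names are hypothetical and not part of giganto, the level multiplier of 10 is inferred from the 100,000x figure in the text, and sizes are taken in decimal units (1 TB = 1,000,000 MB) to match the examples.

```rust
/// Default suggested in the text when the estimate comes out smaller.
const DEFAULT_LEVEL_BASE_MB: u64 = 512;
/// Assumption: each level is 10 times larger than the previous one.
const LEVEL_MULTIPLIER: u64 = 10;
/// Assumption from the text: data fills levels 1 through 6.
const LEVELS: u32 = 6;

/// Total capacity expressed as a multiple of the level base:
/// 1 + 10 + 100 + 1,000 + 10,000 + 100,000.
fn total_capacity_factor() -> u64 {
    (0..LEVELS).map(|i| LEVEL_MULTIPLIER.pow(i)).sum()
}

/// Raw estimate: db_total_mb / 111111 (integer division).
fn level_base_estimate_mb(db_total_mb: u64) -> u64 {
    db_total_mb / total_capacity_factor()
}

/// Estimate with the 512 MB default applied as a floor.
fn recommended_max_mb_of_level_base(db_total_mb: u64) -> u64 {
    level_base_estimate_mb(db_total_mb).max(DEFAULT_LEVEL_BASE_MB)
}
```

Under these assumptions, a 10 TB database (10,000,000 MB) gives a raw estimate of 90 MB, so the 512 MB default applies, while a 100 TB database gives 900 MB.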

If the configuration file has no addr_to_peers option, giganto runs in standalone mode; if the option is present, it runs in cluster mode for P2P.
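One way to picture this rule is as a function of an optional setting. The sketch below is not giganto's actual implementation; `RunMode` and `select_run_mode` are hypothetical names, and the only assumption is that the option value, when present, is a socket address such as "10.10.11.1:38383".

```rust
use std::net::{AddrParseError, SocketAddr};

/// Hypothetical type for this sketch; not taken from giganto's source.
#[derive(Debug, PartialEq)]
enum RunMode {
    /// No `addr_to_peers` in the config file.
    Standalone,
    /// `addr_to_peers` is set; peers connect to this address.
    Cluster { listen_addr: SocketAddr },
}

/// `addr_to_peers` is the raw option value, or `None` when the key is absent.
fn select_run_mode(addr_to_peers: Option<&str>) -> Result<RunMode, AddrParseError> {
    match addr_to_peers {
        None => Ok(RunMode::Standalone),
        Some(addr) => Ok(RunMode::Cluster {
            listen_addr: addr.parse()?,
        }),
    }
}
```

Modeling the option as `Option<&str>` keeps "absent" distinct from "present but invalid", so a malformed address surfaces as an error instead of silently falling back to standalone mode.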

Test

Run giganto with the prepared configuration file, which is set up to use the certificate and key from the tests folder.

cargo run -- tests/config.toml

License

Copyright 2022-2024 ClumL Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this crate except in compliance with the License.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See LICENSE for the specific language governing permissions and limitations under the License.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be licensed as above, without any additional terms or conditions.