apache / incubator-hugegraph-toolchain

HugeGraph toolchain - includes a series of useful graph modules
https://hugegraph.apache.org/
Apache License 2.0

[Enhancement] Improve user experience for loader #442

Open imbajin opened 1 year ago

imbajin commented 1 year ago

Bug Type

others (please comment below)

Before submit

Environment

Expected & Actual behavior

Some problems need to be solved:

Vertex/Edge example

No response

Schema [VertexLabel, EdgeLabel, IndexLabel]

No response

liuxiaocs7 commented 9 months ago

--help seems to work, and -h is the same as --host:

    -h, --host
      The host/IP of HugeGraphServer
      Default: localhost

./bin/hugegraph-loader.sh --help

output:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/root1/lx/Development/incubator-hugegraph-toolchain/apache-hugegraph-toolchain-incubating-1.2.0/apache-hugegraph-loader-incubating-1.2.0/lib/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/root1/lx/Development/incubator-hugegraph-toolchain/apache-hugegraph-toolchain-incubating-1.2.0/apache-hugegraph-loader-incubating-1.2.0/lib/apache-hugegraph-loader-incubating-1.2.0-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/root1/lx/Development/incubator-hugegraph-toolchain/apache-hugegraph-toolchain-incubating-1.2.0/apache-hugegraph-loader-incubating-1.2.0/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Usage: <main class> [options]
  Options:
    --batch-insert-threads
      The number of threads to execute batch insert
      Default: 16
    --batch-size
      The number of lines in each submit
      Default: 500
    --cdc-flush-interval
      The flush interval for flink cdc
      Default: 30000
    --cdc-sink-parallelism
      The sink parallelism for flink cdc
      Default: 1
    --check-vertex
      Check vertices exists while inserting edges
      Default: false
    --clear-all-data
      Whether to clear all old data before loading
      Default: false
    --clear-timeout
      The timeout waiting for clearing all data
      Default: 240
    --dry-run
      Dry run means that only parse but doesn't load
      Default: false
    --edge-partitions
      The number of partitions of the HBase edge table
      Default: 64
    --edge-table-name
      HBase edge table name
    --failure-mode
      Load data from the failure records, in this mode, only full load is 
      supported, any read or parsing errors will cause load task stop
      Default: false
  * -f, --file
      The path of the data mapping description file
  * -g, --graph
      The namespace of the graph to load into
    --hbase-zk-parent
      HBase zookeeper parent
    --hbase-zk-port
      HBase zookeeper port
    --hbase-zk-quorum
      HBase zookeeper quorum
    --help
      Print usage of HugeGraphLoader
    -h, --host
      The host/IP of HugeGraphServer
      Default: localhost
    --incremental-mode
      Load data from the breakpoint of last time
      Default: false
    --max-conn
      Max number of HTTP connections to server
      Default: 64
    --max-conn-per-route
      Max number of HTTP connections to each route
      Default: 32
    --max-insert-errors
      The maximum number of lines that insert error before exiting
      Default: 500
    --max-parse-errors
      The maximum number of lines that parse error before exiting
      Default: 1
    --max-read-errors
      The maximum number of lines that read error before exiting
      Default: 1
    --max-read-lines
      The maximum number of read lines, when reached this number, the load 
      task will stop
      Default: -1
    -p, --port
      The port of HugeGraphServer
      Default: 8080
    --print-progress
      Whether to print real-time load progress
      Default: true
    --protocol
      The protocol of HugeGraphServer, allowed values are: http or https
      Default: http
    --retry-interval
      Setting the interval time before retrying
      Default: 10
    --retry-times
      Setting the max retry times when loading timeout
      Default: 3
    -s, --schema
      The schema file path which to create manually
    --shutdown-timeout
      The timeout of awaitTermination in seconds
      Default: 10
    --single-insert-threads
      The number of threads to execute single insert
      Default: 8
    --sink-type
      Sink to different storage
      Default: true
    --test-mode
      Whether the hugegraph-loader work in test mode
      Default: false
    --timeout
      The timeout of HugeClient request
      Default: 60
    --token
      The token of graph for authentication
    --trust-store-file
      The path of client truststore file used when https protocol is enabled
    --trust-store-password
      The password of client truststore file used when https protocol is 
      enabled 
    --username
      The username of graph for authentication
    --vertex-partitions
      The number of partitions of the HBase vertex table
      Default: 64
    --vertex-table-name
      HBase vertex table name
imbajin commented 9 months ago

--help seems to work, and -h is the same as --host

@liuxiaocs7 yes, I do know that. What I mean is:

  1. We should also provide -help for users (not only --help). Since -h is rarely used for the host in command-line tools, renaming it to -i (ip) would be better, though we can keep it for now.
  2. Check the input: when fewer than 3 parameters are given (the required minimum) or the input is invalid, show the --help info directly.
  3. Mark the graph name as not required (the default value is fine to use).
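
The three points above could be sketched roughly as follows. This is only an illustration in plain Java, not the loader's actual code: the class `LoaderArgsCheck`, its method names, and the default graph name `hugegraph` are all assumptions. Instead of counting parameters, the sketch checks that the one option that stays required (`-f`/`--file`) is present with a value, which covers the "fewer than the required minimum" case once the graph name becomes optional.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical pre-check that could run before the real option parser. */
final class LoaderArgsCheck {

    // Assumption: the issue only says a default graph name exists; "hugegraph" is a guess.
    static final String DEFAULT_GRAPH = "hugegraph";

    // Point 1: accept -help in addition to --help.
    // "-h" is deliberately left out because it currently means --host.
    static final List<String> HELP_FLAGS = Arrays.asList("--help", "-help");

    /** Point 2: returns true when usage should be printed instead of starting a load. */
    static boolean shouldShowHelp(String[] args) {
        if (args.length == 0) {
            return true;
        }
        for (String arg : args) {
            if (HELP_FLAGS.contains(arg)) {
                return true;
            }
        }
        // The mapping file stays required, so a missing -f/--file value is an input error.
        return valueOf(args, "-f", "--file") == null;
    }

    /** Point 3: the graph name is optional and falls back to the default. */
    static String graphName(String[] args) {
        String graph = valueOf(args, "-g", "--graph");
        return graph != null ? graph : DEFAULT_GRAPH;
    }

    /** Returns the value following the given option, or null if the option or its value is absent. */
    static String valueOf(String[] args, String shortName, String longName) {
        for (int i = 0; i < args.length - 1; i++) {
            if (args[i].equals(shortName) || args[i].equals(longName)) {
                return args[i + 1];
            }
        }
        return null;
    }
}
```

With this check, `-f struct.json` alone would be enough to start a load against the default graph, while an empty command line, `-help`, or a dangling `-f` would all fall through to the usage text.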
liuxiaocs7 commented 9 months ago

Thanks, got it. Sorry, I forgot to set it as a draft PR :(