apache / incubator-hugegraph-toolchain

HugeGraph toolchain - includes a series of useful graph modules
https://hugegraph.apache.org/
Apache License 2.0

[Bug] Hubble: Chinese text in created properties/vertices is garbled in the database #621

Closed kouhangting closed 1 hour ago

kouhangting commented 1 month ago

Bug Type

others (please edit later)

Before submit

Environment

Expected & Actual behavior

Chinese text in a newly created vertex turns into ??

Docker image used: [image]

[image]

Vertex/Edge example

No response

Schema [VertexLabel, EdgeLabel, IndexLabel]

No response

dosubot[bot] commented 1 month ago

Hello, @kouhangting! I'm here to help you with any issues, bugs, or questions you have. Let me know how I can assist you with your repository.

To resolve the issue where creating a node with Chinese attributes in HugeGraph results in unexpected characters or behavior, you need to ensure that the property keys are correctly defined to handle text values, and that the input values are properly encoded.

Here is an example of how you can add a vertex with Chinese attributes:

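// Note: graph() and commitTx() are not defined in this snippet; they are
// assumed to be helpers of the enclosing class.
HugeGraph graph = graph();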

// Define the schema if not already defined
SchemaManager schema = graph.schema();
schema.propertyKey("name").asText().create();
schema.propertyKey("city").asText().create();
schema.vertexLabel("person").properties("name", "city").create();

// Add a vertex with Chinese attributes
graph.addVertex(T.label, "person", "name", "张三", "city", "北京");
this.commitTx();

Ensure that the text encoding is properly handled in your environment to avoid any issues with character representation. If you encounter any specific errors related to character encoding, you may need to check your database and application settings to ensure they support UTF-8 encoding [1].
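As a rough illustration of where literal `??` can come from, here is a minimal, self-contained sketch. It assumes the text passes through a charset that cannot represent Chinese (US-ASCII is used as a stand-in); whether that is what happens in Hubble is not confirmed here. The class and method names are illustrative only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {

    // Encode the text with the given charset, then decode the bytes
    // again with the same charset.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file independent of its own encoding.
        String name = "\u5f20\u4e09"; // 张三

        // UTF-8 can represent every character, so the text survives.
        System.out.println(name.equals(roundTrip(name, StandardCharsets.UTF_8)));

        // US-ASCII cannot represent these characters; String.getBytes
        // substitutes one '?' per unmappable character, so this prints "??".
        System.out.println(roundTrip(name, StandardCharsets.US_ASCII));
    }
}
```

Once a character has been replaced by `?` the original text is gone, so the fix has to happen before the data is written, not when it is read back.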

Additionally, the HugeGraph loader can handle non-ASCII property values such as Chinese text, provided the charset used to read an input file matches the charset it was written in. For example, the following test case writes and loads a CSV file using the GBK charset:

private static final Charset GBK = Charset.forName("GBK");

@Test
public void testMatchedEncodingCharset() {
    ioUtil.write("vertex_software.csv", GBK,
                 "name,lang,price",
                 "lop,中文,328");

    String[] args = new String[]{
            "-f", structPath("matched_encoding_charset/struct.json"),
            "-s", configPath("matched_encoding_charset/schema.groovy"),
            "-g", GRAPH,
            "-h", SERVER,
            "--batch-insert-threads", "2",
            "--test-mode", "true"
    };
    HugeGraphLoader.main(args);

    List<Vertex> vertices = CLIENT.graph().listVertices();
    Assert.assertEquals(1, vertices.size());
    Vertex vertex = vertices.get(0);
    Assert.assertEquals("lop", vertex.property("name"));
    Assert.assertEquals("中文", vertex.property("lang"));
    Assert.assertEquals(328.0, vertex.property("price"));
}

In this test, the GBK charset is used to correctly handle the Chinese characters in the vertex_software.csv file. The test verifies that the property values are correctly encoded and decoded, ensuring that non-ASCII characters are properly managed [2].
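The "matched charset" point can be sketched without a running server. The snippet below is only an illustration, not loader code: it encodes a string as GBK (as the test's CSV file would be) and decodes it once with GBK and once with UTF-8. It assumes the JDK in use ships the GBK charset; the class and method names are made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMatchDemo {

    // Same charset name as in the test above; throws an unchecked
    // exception if the running JDK does not provide GBK.
    static final Charset GBK = Charset.forName("GBK");

    // Simulate writing text to a file in one charset and reading the
    // resulting bytes back in another.
    static String writeThenRead(String text, Charset writeCharset, Charset readCharset) {
        byte[] fileBytes = text.getBytes(writeCharset);
        return new String(fileBytes, readCharset);
    }

    public static void main(String[] args) {
        String lang = "\u4e2d\u6587"; // 中文

        // Matched: written as GBK, read as GBK, so the text is preserved.
        System.out.println(lang.equals(writeThenRead(lang, GBK, GBK)));

        // Mismatched: written as GBK, read as UTF-8, so the text is corrupted.
        System.out.println(lang.equals(writeThenRead(lang, GBK, StandardCharsets.UTF_8)));
    }
}
```

The mismatched case produces replacement characters rather than the original text, which is why the reader has to be told the file's real encoding.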


kouhangting commented 1 month ago

[image]

The same thing happens when using a Gremlin statement.

imbajin commented 1 month ago

@kouhangting Could you try testing it with an HTTP RESTful tool? (That avoids any encoding problem in Hubble.)
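One possible way to run such a test from Java 11+ is sketched below. It bypasses Hubble and posts a vertex straight to the server with an explicit UTF-8 body. The base URL, the graph name `hugegraph`, and the `/graphs/{graph}/graph/vertices` path are assumptions about a default local deployment, and the `person` label with `name` and `city` properties is assumed to exist already; adjust all of them to your setup. The class and method names are illustrative.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class RestVertexCheck {

    // Assumed values for a default local deployment.
    static final String BASE_URL = "http://localhost:8080";
    static final String GRAPH = "hugegraph";

    // Build a POST request whose JSON body is explicitly encoded as UTF-8.
    // The JSON is concatenated naively, so the values must not contain
    // quotes or backslashes.
    static HttpRequest buildRequest(String name, String city) {
        String json = "{\"label\":\"person\",\"properties\":{"
                    + "\"name\":\"" + name + "\",\"city\":\"" + city + "\"}}";
        return HttpRequest.newBuilder()
                .uri(URI.create(BASE_URL + "/graphs/" + GRAPH + "/graph/vertices"))
                .header("Content-Type", "application/json; charset=UTF-8")
                .POST(HttpRequest.BodyPublishers.ofString(json, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // 张三 and 北京, written as escapes so the source encoding does not matter.
        HttpRequest request = buildRequest("\u5f20\u4e09", "\u5317\u4eac");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

If the vertex stored this way reads back with intact Chinese text, the server and backend are handling UTF-8 correctly and the problem is more likely on the Hubble side.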

shenhaitao010 commented 1 month ago

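Same problem here. With apache-hugegraph-hubble-incubating-1.3.0, creating properties with Chinese text or importing data turns all the Chinese into ???. Creating them through the RESTful API works fine. How can I fix Chinese turning into ??? when creating the schema and importing data through the Hubble page? Thanks!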

shenhaitao010 commented 1 month ago

@kouhangting Try to use HTTP-RESTful tool to test it? (Avoid encode problem in hubble)

If we can only use the API, then what is the Hubble page you provide for? Using Chinese is unavoidable.

insist93 commented 4 hours ago

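Same problem here. With apache-hugegraph-hubble-incubating-1.3.0, creating properties with Chinese text or importing data turns all the Chinese into ???. Creating them through the RESTful API works fine. How can I fix Chinese turning into ??? when creating the schema and importing data through the Hubble page? Thanks!

When starting hugegraph/hubble with Docker, add the environment variables -e LANG=C.UTF-8 -e LANGUAGE=C.UTF-8 -e LC_ALL=C.UTF-8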
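To check whether the locale variables took effect, a small diagnostic like the one below can be run with the same JVM and environment as the service. It is a minimal sketch: on JDKs before 18 the default charset is derived from the locale environment, while JDK 18 and later default to UTF-8. Whether an ASCII-only default is the actual cause inside the Hubble image is an assumption, not something confirmed in this thread. The class and helper names are illustrative.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {

    // True if the charset can encode every character of the text.
    static boolean canRepresent(Charset charset, String text) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        Charset charset = Charset.defaultCharset();
        String sample = "\u4e2d\u6587"; // 中文

        // Show what the JVM picked up from its environment.
        System.out.println("LANG            = " + System.getenv("LANG"));
        System.out.println("LC_ALL          = " + System.getenv("LC_ALL"));
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + charset);

        // false here means Chinese text cannot be represented in the
        // default charset and would be replaced when encoded with it.
        System.out.println("can encode sample = " + canRepresent(charset, sample));
    }
}
```

If the last line prints `false` without the variables and `true` with them, the default charset is a plausible culprit.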

imbajin commented 57 minutes ago

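Same problem here. With apache-hugegraph-hubble-incubating-1.3.0, creating properties with Chinese text or importing data turns all the Chinese into ???. Creating them through the RESTful API works fine. How can I fix Chinese turning into ??? when creating the schema and importing data through the Hubble page? Thanks!

When starting hugegraph/hubble with Docker, add the environment variables -e LANG=C.UTF-8 -e LANGUAGE=C.UTF-8 -e LC_ALL=C.UTF-8

@insist93 The latest PR already builds in some checks of this kind. Once the new hubble Docker image is published, please try again; the problem should be gone. If it persists, feel free to report back at any time. (The image should be updated automatically tomorrow.)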