apache / incubator-hugegraph-toolchain

HugeGraph toolchain - includes a series of useful graph modules
https://hugegraph.apache.org/
Apache License 2.0

[Improve] Loader support file name with prefix for hdfs source #570

Closed JackyYangPassion closed 3 months ago

JackyYangPassion commented 5 months ago

Search before asking

Feature Description

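The loader should support selecting files by name prefix for HDFS sources.

When reading files produced by a data warehouse, the output directory often contains empty marker files such as _SUCCESS alongside the data files, so you need to be able to specify a file name prefix (for example part-*) to load only the data files.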

struct.json

{
    "vertices":[
        {
            "label":"user",
            "input":{
                "type":"hdfs",
                "path":"hdfs:///vertex/louvain_user_vertex/part-*",
                "compression":"PARQUET",
                "core_site_path":"/conf/core-site.xml",
                "hdfs_site_path":"/conf/hdfs-site.xml",
                "header":[
                    "user_id"
                ],
                "charset":"UTF-8",
                "skipped_line":{
                    "regex":"(^#|^//).*"
                }
            },
            "id":"user_id",
            "null_values":[
                "NULL",
                "null",
                ""
            ]
        }
    ]
}
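
To illustrate what the part-* pattern in "path" is meant to achieve, here is a minimal sketch of glob-based file name filtering. It is written in Python for brevity and is not the loader's implementation; the helper name select_data_files and the directory listing are hypothetical.

```python
from fnmatch import fnmatchcase
from posixpath import basename


def select_data_files(paths, pattern="part-*"):
    """Keep only the paths whose file name matches the glob pattern."""
    return [p for p in paths if fnmatchcase(basename(p), pattern)]


# Hypothetical listing of a warehouse output directory.
listing = [
    "/vertex/louvain_user_vertex/_SUCCESS",
    "/vertex/louvain_user_vertex/part-00000.parquet",
    "/vertex/louvain_user_vertex/part-00001.parquet",
]

# The empty _SUCCESS marker is skipped; only the part-* files remain.
print(select_data_files(listing))
```

Matching on the file name rather than the full path keeps the pattern independent of the directory being read.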

Are you willing to submit a PR?

Code of Conduct

dosubot[bot] commented 5 months ago

That's great to hear that you're willing to submit a PR! Your proposed improvement sounds very useful for handling file names with prefixes in the HDFS source. We appreciate your contribution to the project and look forward to reviewing your PR. Thank you for agreeing to follow the project's Code of Conduct. If you have any questions or need assistance while working on the PR, feel free to ask. Good luck with your contribution!

