apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0

java.util.LinkedHashMap cannot be cast to java.util.ArrayList #5700

Open wachoo opened 1 year ago

wachoo commented 1 year ago

What happened

When I click this button (screenshot) to start running a task in the SeaTunnel Web server, the log fills with this exception:

ERROR [qtp1175146719-7472] [GlobalExceptionHandler.logError():83] - java.util.LinkedHashMap cannot be cast to java.util.ArrayList
java.lang.ClassCastException: java.util.LinkedHashMap cannot be cast to java.util.ArrayList
    at org.apache.seatunnel.core.starter.utils.ConfigShadeUtils.processConfig(ConfigShadeUtils.java:143)

(The full stack trace is listed under "Error Exception" below.)

SeaTunnel Version

seatunnel 2.2.3 seatunnel-web 1.0.0

SeaTunnel Config

seatunnel:
  engine:
    history-job-expire-minutes: 1440
    backup-count: 1
    queue-type: blockingqueue
    print-execution-info-interval: 60
    print-job-metrics-info-interval: 60
    slot-service:
      dynamic-slot: true
    checkpoint:
      interval: 10000
      timeout: 60000
      storage:
        type: hdfs
        max-retained: 3
        plugin-config:
          namespace: /tmp/seatunnel/checkpoint_snapshot
          storage.type: hdfs
          fs.defaultFS: file:///tmp/ # Ensure that the directory has write permission

Running Command

No command line: I click the run button to start a task in the SeaTunnel Web server.

Error Exception

ERROR [qtp1175146719-7472] [GlobalExceptionHandler.logError():83] - java.util.LinkedHashMap cannot be cast to java.util.ArrayList
java.lang.ClassCastException: java.util.LinkedHashMap cannot be cast to java.util.ArrayList
    at org.apache.seatunnel.core.starter.utils.ConfigShadeUtils.processConfig(ConfigShadeUtils.java:143)
    at org.apache.seatunnel.core.starter.utils.ConfigShadeUtils.decryptConfig(ConfigShadeUtils.java:119)
    at org.apache.seatunnel.core.starter.utils.ConfigShadeUtils.decryptConfig(ConfigShadeUtils.java:104)
    at org.apache.seatunnel.core.starter.utils.ConfigBuilder.ofInner(ConfigBuilder.java:53)
    at org.apache.seatunnel.core.starter.utils.ConfigBuilder.lambda$of$1(ConfigBuilder.java:67)
    at java.util.Optional.orElseGet(Optional.java:267)
    at org.apache.seatunnel.core.starter.utils.ConfigBuilder.of(ConfigBuilder.java:67)
    at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.<init>(MultipleTableJobConfigParser.java:127)
    at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.getJobConfigParser(JobExecutionEnvironment.java:63)
    at org.apache.seatunnel.engine.core.job.AbstractJobEnvironment.getLogicalDag(AbstractJobEnvironment.java:109)
    at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.execute(JobExecutionEnvironment.java:73)
    at org.apache.seatunnel.app.service.impl.JobExecutorServiceImpl.executeJobBySeaTunnel(JobExecutorServiceImpl.java:107)
    at org.apache.seatunnel.app.service.impl.JobExecutorServiceImpl.jobExecute(JobExecutorServiceImpl.java:73)
    at org.apache.seatunnel.app.controller.JobExecutorController.jobExecutor(JobExecutorController.java:55)
    at sun.reflect.GeneratedMethodAccessor352.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)

Zeta or Flink or Spark Version

Zeta 2.3.3

Java or Scala Version

openjdk version "1.8.0_372"

Screenshots

image

anacondapy6 commented 1 year ago

I ran into the same problem when running seatunnel-examples / seatunnel-engine-examples from the 2.3.3 source. Hoping for a fix soon.

RenHuifeng commented 11 months ago

I hit the same problem with web 1.0.0 + seatunnel 2.3.3. Please help resolve it, thanks.

guodachao commented 11 months ago

I ran into the same problem. Hoping for a fix soon.

hailin0 commented 10 months ago

> I ran into the same problem when running seatunnel-examples / seatunnel-engine-examples from the 2.3.3 source. Hoping for a fix soon.

@anacondapy6 Please add your job config to this issue

liunaijie commented 10 months ago

@guodachao @RenHuifeng @wachoo can anyone share your job config here to help debug? thanks.

guodachao commented 9 months ago

> @guodachao @RenHuifeng @wachoo can anyone share your job config here to help debug? thanks.

@liunaijie

{
    "transform": {},
    "sink": {
        "Jdbc": {
            "batch_size": 1000,
            "max_retries": "3",
            "source_table_name": "Table11843972568673",
            "max_commit_attempts": 3,
            "auto_commit": "true",
            "url": "jdbc:mysql://host.docker.internal:3307/hpc?useSSL=false&useUnicode=true&characterEncoding=utf-8&allowMultiQueries=true&allowPublicKeyRetrieval=true&serverTimezone=Asia/Shanghai",
            "is_exactly_once": "false",
            "database": "hpc",
            "password": "xxx",
            "transaction_timeout_sec": -1,
            "driver": "com.mysql.cj.jdbc.Driver",
            "support_upsert_by_query_primary_key_exist": "false",
            "connection_check_timeout_sec": 30,
            "generate_sink_sql": true,
            "user": "root",
            "table": "hpc_xml_copy"
        }
    },
    "source": {
        "Jdbc": {
            "password": "xxx",
            "driver": "com.mysql.cj.jdbc.Driver",
            "parallelism": 1,
            "query": "SELECT `XML_ID`, `XML`, `XML_NAME`, `OPERATION_NAME`, `OPERATION_ID`, `OPERATION_TIME`, `GET_TYPE`, `STATUS` FROM `hpc`.`hpc_xml`",
            "connection_check_timeout_sec": 30,
            "result_table_name": "Table11843972568673",
            "fetch_size": "100",
            "user": "root",
            "url": "jdbc:mysql://host.docker.internal:3307/hpc?useSSL=false&useUnicode=true&characterEncoding=utf-8&allowMultiQueries=true&allowPublicKeyRetrieval=true&serverTimezone=Asia/Shanghai"
        }
    },
    "env": {
        "job.mode": "BATCH",
        "job.name": "SeaTunnel_Job",
        "checkpoint.interval": "10000"
    }
}

EricJoy2048 commented 9 months ago

Was this job config file generated by seatunnel-web?

haneeshpld commented 9 months ago

@EricJoy2048 I suspect the issue is this: in the working scenario, when the config is read from the file, it is transformed into the format below:

{
    "env": {
        "job.mode": "BATCH",
        "job.name": "SeaTunnel_Job"
    },
    "source": [
        {
            "password": "xxxx",
            "driver": "com.mysql.cj.jdbc.Driver",
            "parallelism": 1,
            "query": "SELECT `id`, `name`, `owner`, `species`, `sex`, `birth`, `death` FROM `seatunnel`.`pet`",
            "connection_check_timeout_sec": 30,
            "result_table_name": "Table12261518544768",
            "plugin_name": "Jdbc",
            "user": "user",
            "url": "xxxx"
        }
    ],
    "transform": [],
    "sink": [
        {
            "batch_size": 1000,
            "max_retries": "10",
            "source_table_name": "Table12261518544768",
            "max_commit_attempts": 3,
            "auto_commit": "true",
            "plugin_name": "Jdbc",
            "url": "xxx",
            "is_exactly_once": "false",
            "database": "seasink",
            "password": "xxx",
            "transaction_timeout_sec": -1,
            "driver": "com.mysql.cj.jdbc.Driver",
            "support_upsert_by_query_primary_key_exist": "false",
            "connection_check_timeout_sec": 30,
            "generate_sink_sql": true,
            "user": "xxxx",
            "table": "pet"
        }
    ]
}

In the non-working scenario it is transformed into the format below (which is how it is stored in the file, without any change in syntax):

{
    "transform": {},
    "sink": {
        "Jdbc": {
            "batch_size": 1000,
            "max_retries": "10",
            "source_table_name": "Table12307350135360",
            "max_commit_attempts": 3,
            "auto_commit": "true",
            "url": "xxx",
            "is_exactly_once": "false",
            "database": "seasink",
            "password": "xxx",
            "transaction_timeout_sec": -1,
            "driver": "com.mysql.cj.jdbc.Driver",
            "support_upsert_by_query_primary_key_exist": "false",
            "connection_check_timeout_sec": 30,
            "generate_sink_sql": true,
            "user": "xxxx",
            "table": "pet"
        }
    },
    "source": {
        "Jdbc": {
            "password": "xxx",
            "driver": "com.mysql.cj.jdbc.Driver",
            "parallelism": 1,
            "query": "SELECT `id`, `name`, `owner`, `species`, `sex`, `birth`, `death` FROM `seatunnel`.`pet`",
            "connection_check_timeout_sec": 30,
            "result_table_name": "Table12307350135360",
            "user": "xxx",
            "url": "xxx"
        }
    },
    "env": {
        "job.mode": "BATCH",
        "job.name": "SeaTunnel_Job"
    }
}

liunaijie commented 9 months ago

@EricJoy2048 @haneeshpld Hi team, st-web is not easy to deploy locally. I checked the code, and this part may be causing the bug.

SeaTunnelConfigUtil.java#L36 image

The full call chain:

  1. https://github.com/apache/seatunnel-web/blob/1.0.0-release/seatunnel-server/seatunnel-app/src/main/java/org/apache/seatunnel/app/service/impl/JobExecutorServiceImpl.java#L66
  2. https://github.com/apache/seatunnel-web/blob/1.0.0-release/seatunnel-server/seatunnel-app/src/main/java/org/apache/seatunnel/app/service/impl/JobInstanceServiceImpl.java#L137
  3. https://github.com/apache/seatunnel-web/blob/1.0.0-release/seatunnel-server/seatunnel-app/src/main/java/org/apache/seatunnel/app/service/impl/JobInstanceServiceImpl.java#L166
  4. https://github.com/apache/seatunnel-web/blob/1.0.0-release/seatunnel-server/seatunnel-app/src/main/java/org/apache/seatunnel/app/utils/SeaTunnelConfigUtil.java#L36

My guess is that the config file generated by st-web is in the wrong format.
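
To make that guess concrete, here is a minimal sketch of why a map-shaped `source` section would produce exactly this `ClassCastException`. It assumes, as the stack trace suggests, that `processConfig` casts each section to `ArrayList`. The class `SectionShapeDemo` and its helpers are hypothetical and not part of SeaTunnel.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SectionShapeDemo {

    /** Hypothetical stand-in for the cast that fails in the stack trace. */
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> castToList(Map<String, Object> configMap, String section) {
        // Throws ClassCastException when the section value is a Map rather than an ArrayList.
        return (ArrayList<Map<String, Object>>) configMap.get(section);
    }

    /** Returns true when the cast above fails for the "source" section. */
    static boolean failsWithClassCast(Map<String, Object> configMap) {
        try {
            castToList(configMap, "source");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // List shape: "source": [ { "plugin_name": "Jdbc", ... } ]
        Map<String, Object> plugin = new LinkedHashMap<>();
        plugin.put("plugin_name", "Jdbc");
        List<Map<String, Object>> sourceList = new ArrayList<>();
        sourceList.add(plugin);
        Map<String, Object> listShaped = new LinkedHashMap<>();
        listShaped.put("source", sourceList);

        // Map shape: "source": { "Jdbc": { ... } }
        Map<String, Object> sourceMap = new LinkedHashMap<>();
        sourceMap.put("Jdbc", new LinkedHashMap<String, Object>());
        Map<String, Object> mapShaped = new LinkedHashMap<>();
        mapShaped.put("source", sourceMap);

        System.out.println("list-shaped fails: " + failsWithClassCast(listShaped));
        System.out.println("map-shaped fails:  " + failsWithClassCast(mapShaped));
    }
}
```

If the engine sees the second shape at that point, the cast fails regardless of which side produced it, so this sketch does not by itself show whether st-web or the engine-side parsing is at fault.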

haneeshpld commented 9 months ago

@EricJoy2048

I'm not sure the issue is occurring here. In my case, I have a working scenario as well as a non-working scenario.

When containerizing the application (both web and engine), it works if I run the image on the same machine where it is built. However, when I use the image on a different machine with almost identical configurations (same OS), it shows the above error.

Upon checking both the working and non-working scenarios, the above code produces the same output. Therefore, I'm not sure the issue arises from this code.

working scenario

Jdbc { "auto_commit"="true" "batch_size"=1000 "connection_check_timeout_sec"=30 database=seasink driver="com.mysql.cj.jdbc.Driver" "generate_sink_sql"=true "is_exactly_once"="false" "max_commit_attempts"=3 "max_retries"="10" password=xxx "source_table_name"=Table12348683256512 "support_upsert_by_query_primary_key_exist"="false" table=pet "transaction_timeout_sec"=-1 url=“xxxxx” user=xxx }

non-working scenario

Jdbc { "connection_check_timeout_sec"=30 "batch_size"=1000 "is_exactly_once"="false" "max_commit_attempts"=3 "transaction_timeout_sec"=-1 "max_retries"="10" "auto_commit"="true" "support_upsert_by_query_primary_key_exist"="false" "source_table_name"=Table12340885224320 "generate_sink_sql"=true database=seasink table=pet password=xxxx driver="com.mysql.cj.jdbc.Driver" user=xxx url=“xxxx” }

There is a file written after the transformation in st-web (bin\profile\12261501203968.conf) that has an identical format in both the working and non-working scenarios. The issue appears when it is transformed again on the engine side.

haneeshpld commented 9 months ago

Another thing I noticed, not sure if it's helpful:

image

There is a file, ConfigParser.java, in which I can debug the code below in the working scenario but not in the non-working scenario, where the breakpoints show the error below:

image

This is the place where the transformation happens in the engine.

leejoker commented 9 months ago

I resolved this problem by repackaging the seatunnel-config-base module.

Just change the maven-shade-plugin configuration like this:


<plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <configuration>
                    <minimizeJar>true</minimizeJar>
                    <createSourcesJar>true</createSourcesJar>
                    <shadeSourcesContent>true</shadeSourcesContent>
                    <shadedArtifactAttached>false</shadedArtifactAttached>
                    <createDependencyReducedPom>false</createDependencyReducedPom>
                    <filters>
                        <filter>
                            <artifact>com.typesafe:config</artifact>
                            <includes>
                                <include>**</include>
                            </includes>
                            <excludes>
                                <exclude>META-INF/MANIFEST.MF</exclude>
                                <exclude>META-INF/NOTICE</exclude>
                                <exclude>com/typesafe/config/ConfigParseOptions.class</exclude>
                                <exclude>com/typesafe/config/ConfigMergeable.class</exclude>
                                <exclude>com/typesafe/config/impl/ConfigParser.class</exclude>

                               <!-- just add this line to remove the inner class -->
                                <exclude>com/typesafe/config/impl/ConfigParser$ParseContext.class</exclude>

                                <exclude>com/typesafe/config/impl/ConfigNodePath.class</exclude>
                                <exclude>com/typesafe/config/impl/PathParser.class</exclude>
                                <exclude>com/typesafe/config/impl/Path.class</exclude>
                                <exclude>com/typesafe/config/impl/SimpleConfigObject.class</exclude>
                                <exclude>com/typesafe/config/impl/PropertiesParser.class</exclude>
                            </excludes>
                        </filter>
                    </filters>
                    <relocations>
                        <relocation>
                            <pattern>com.typesafe.config</pattern>
                            <shadedPattern>${seatunnel.shade.package}.com.typesafe.config</shadedPattern>
                        </relocation>
                    </relocations>
                    <transformers>
                        <transformer implementation="org.apache.maven.plugins.shade.resource.ApacheLicenseResourceTransformer" />
                        <transformer implementation="org.apache.maven.plugins.shade.resource.ApacheNoticeResourceTransformer" />
                    </transformers>
                </configuration>
                <executions>
                    <execution>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <phase>package</phase>
                    </execution>
                </executions>
            </plugin>

When the config-base module is packaged, the package paths are relocated, but ConfigParser$ParseContext.class is still included. This inner class conflicts with the same class in the config-shade module, so the ClassLoader may not load the right one.
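
One way to check for this kind of conflict is to list every classpath location that provides the class. The sketch below is a hypothetical diagnostic helper (`DuplicateClassFinder` is not part of SeaTunnel). The default class name assumes that `${seatunnel.shade.package}` resolves to `org.apache.seatunnel.shade`; pass a different fully qualified name as the first argument if your build differs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

class DuplicateClassFinder {

    /** Converts a binary class name such as a.b.C$D into the resource path a/b/C$D.class. */
    static String toResourceName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Lists every location visible to the current class loader that contains the class file. */
    static List<URL> locations(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            return Collections.list(loader.getResources(toResourceName(className)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed default: the relocated inner class named in the comment above.
        String className = args.length > 0
                ? args[0]
                : "org.apache.seatunnel.shade.com.typesafe.config.impl.ConfigParser$ParseContext";
        List<URL> found = locations(className);
        System.out.println(className + " found in " + found.size() + " location(s):");
        for (URL url : found) {
            System.out.println("  " + url);
        }
        if (found.size() > 1) {
            System.out.println("More than one copy is visible; which one loads depends on classpath order.");
        }
    }
}
```

The result is only meaningful when the helper runs with the same classpath as the failing process, for example the jars that seatunnel-web loads at startup.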

athiathu commented 8 months ago

Hi @leejoker, I tried the above solution, but the issue is still there.

gitlifangyuan commented 8 months ago

Try it this way. Note that this is a temporary workaround, not a fundamental fix. Change the method ConfigShadeUtils.processConfig() as shown, then install the seatunnel-core-starter module:


    @SuppressWarnings("unchecked")
    private static Config processConfig(String identifier, Config config, boolean isDecrypted) {
        ConfigShade configShade = CONFIG_SHADES.getOrDefault(identifier, DEFAULT_SHADE);
        List<String> sensitiveOptions = new ArrayList<>(Arrays.asList(DEFAULT_SENSITIVE_OPTIONS));
        sensitiveOptions.addAll(Arrays.asList(configShade.sensitiveOptions()));
        BiFunction<String, Object, String> processFunction =
                (key, value) -> {
                    if (isDecrypted) {
                        return configShade.decrypt(value.toString());
                    } else {
                        return configShade.encrypt(value.toString());
                    }
                };
        String jsonString = config.root().render(ConfigRenderOptions.concise());
        ObjectNode jsonNodes = JsonUtils.parseObject(jsonString);
        Map<String, Object> configMap = JsonUtils.toMap(jsonNodes);
        List<Map<String, Object>> sources = new ArrayList<>();
        try {
            sources = (ArrayList<Map<String, Object>>) configMap.get(Constants.SOURCE);
        } catch (ClassCastException e) {
            log.info("ClassCastException : " + e.getMessage() );
            Map<String, Object> tmp = (Map<String, Object>) configMap.get(Constants.SOURCE);
            for (String key : tmp.keySet()) {
                Map<String, Object> tmp1  = (Map<String, Object>) tmp.get(key);
                tmp1.put("plugin_name",key);
                sources.add(tmp1);
            }

        }
        List<Map<String, Object>> sinks = new ArrayList<>();
        try {
            sinks = (ArrayList<Map<String, Object>>) configMap.get(Constants.SINK);
        } catch (ClassCastException e) {
            log.info("ClassCastException : " + e.getMessage() );
            Map<String, Object> tmp = (Map<String, Object>)configMap.get(Constants.SINK);
            for (String key : tmp.keySet()) {
                Map<String, Object> tmp1  = (Map<String, Object>) tmp.get(key);
                tmp1.put("plugin_name",key);
                sinks.add(tmp1);
            }
        }
        List<Map<String, Object>> transform = new ArrayList<>();
        try {
            transform = (ArrayList<Map<String, Object>>) configMap.get(Constants.TRANSFORM);
        } catch (ClassCastException e) {
            log.info("ClassCastException : " + e.getMessage() );
            Map<String, Object> tmp = (Map<String, Object>)configMap.get(Constants.TRANSFORM);
            for (String key : tmp.keySet()) {
                Map<String, Object> tmp1  = (Map<String, Object>) tmp.get(key);
                tmp1.put("plugin_name",key);
                transform.add(tmp1);
            }
        }
        Preconditions.checkArgument(
                !sources.isEmpty(), "Miss <Source> config! Please check the config file.");
        Preconditions.checkArgument(
                !sinks.isEmpty(), "Miss <Sink> config! Please check the config file.");
        sources.forEach(
                source -> {
                    for (String sensitiveOption : sensitiveOptions) {
                        source.computeIfPresent(sensitiveOption, processFunction);
                    }
                });
        sinks.forEach(
                sink -> {
                    for (String sensitiveOption : sensitiveOptions) {
                        sink.computeIfPresent(sensitiveOption, processFunction);
                    }
                });
        configMap.put(Constants.SOURCE, sources);
        configMap.put(Constants.TRANSFORM, transform);
        configMap.put(Constants.SINK, sinks);
        return ConfigFactory.parseMap(configMap);
    }
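
The same idea can be sketched without using `ClassCastException` for control flow, by checking the shape of each section first. This is a minimal, self-contained illustration only: `PluginSectionNormalizer` and `normalize` are hypothetical names, and the `plugin_name` key follows the list-shaped config shown earlier in the thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PluginSectionNormalizer {

    static final String PLUGIN_NAME = "plugin_name";

    /**
     * Accepts either shape of a source/transform/sink section:
     *   list shape: [ { "plugin_name": "Jdbc", ... } ]
     *   map shape:  { "Jdbc": { ... } }
     * and always returns the list shape. A null or empty section yields an empty list.
     */
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> normalize(Object section) {
        List<Map<String, Object>> plugins = new ArrayList<>();
        if (section instanceof List) {
            for (Object item : (List<?>) section) {
                plugins.add((Map<String, Object>) item);
            }
        } else if (section instanceof Map) {
            for (Map.Entry<String, Object> entry : ((Map<String, Object>) section).entrySet()) {
                // Copy the options so the caller's map is left untouched.
                Map<String, Object> plugin =
                        new LinkedHashMap<>((Map<String, Object>) entry.getValue());
                plugin.put(PLUGIN_NAME, entry.getKey());
                plugins.add(plugin);
            }
        }
        return plugins;
    }

    public static void main(String[] args) {
        Map<String, Object> jdbcOptions = new LinkedHashMap<>();
        jdbcOptions.put("user", "root");
        Map<String, Object> mapShaped = new LinkedHashMap<>();
        mapShaped.put("Jdbc", jdbcOptions);

        System.out.println(normalize(mapShaped));
    }
}
```

Like the workaround above, this only papers over the shape mismatch; it does not explain why the engine receives the map shape in the first place.
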

18308996929 commented 7 months ago

So what is the best solution to this problem in the end?

Etoakor commented 4 months ago

So it has been a year and this still isn't fixed?

shenzhy5 commented 4 months ago

@leejoker Thanks! You are right!

shenzhy5 commented 4 months ago

@athiathu Try installing seatunnel to your local Maven repository first, and then package seatunnel-web.

anhdbbt commented 4 months ago

> @EricJoy2048 @haneeshpld Hi team, st-web is not easy to deploy locally. I checked the code, and this part may be causing the bug.
>
> SeaTunnelConfigUtil.java#L36 image
>
> The full call chain:
>
> 1. https://github.com/apache/seatunnel-web/blob/1.0.0-release/seatunnel-server/seatunnel-app/src/main/java/org/apache/seatunnel/app/service/impl/JobExecutorServiceImpl.java#L66
> 2. https://github.com/apache/seatunnel-web/blob/1.0.0-release/seatunnel-server/seatunnel-app/src/main/java/org/apache/seatunnel/app/service/impl/JobInstanceServiceImpl.java#L137
> 3. https://github.com/apache/seatunnel-web/blob/1.0.0-release/seatunnel-server/seatunnel-app/src/main/java/org/apache/seatunnel/app/service/impl/JobInstanceServiceImpl.java#L166
> 4. https://github.com/apache/seatunnel-web/blob/1.0.0-release/seatunnel-server/seatunnel-app/src/main/java/org/apache/seatunnel/app/utils/SeaTunnelConfigUtil.java#L36
>
> My guess is that the config file generated by st-web is in the wrong format.

I agree with your guess, but when I changed these templates, another problem occurred that was also caused by the template. So I modified the code below, and it works! But obviously it's not meticulously designed.

org.apache.seatunnel.core.starter.utils.ConfigShadeUtils.java

image

liunaijie commented 6 hours ago

Hi all. If you hit the same issue, try deleting the seatunnel-config-base library under seatunnel-web/libs.