apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[Bug] merge-into for s3: org.apache.paimon.fs.FileIOLoader: Provider org.apache.paimon.s3.S3Loader not a subtype #1156

Closed www2388258980 closed 1 year ago

www2388258980 commented 1 year ago

Search before asking

Paimon version

Paimon 0.4

Compute Engine

Flink 1.16

Minimal reproduce step

I want to use 'merge-into' to produce -D (delete) records.

/home/hadoop/flink-1.16.0/bin/flink run \
    -c org.apache.paimon.flink.action.FlinkActions \
    -Dclassloader.resolve-order=parent-first \
    /home/hadoop/flink-1.16.0/lib/paimon-flink-1.16-0.4-20230427.002025-43.jar \
    merge-into \
    --warehouse s3://xxxxx/hadoop/warehouse/ \
    --database dwd_medatc_paimon \
    --table items_details \
    --target-as t \
    --source-table ods_medatc_fts.src_public_items \
    --on "t.id = src_public_items.id" \
    --merge-actions not-matched-by-source-delete \
    --not-matched-by-source-delete-condition "t.id is not null"

stack:

java.util.ServiceConfigurationError: org.apache.paimon.fs.FileIOLoader: Provider org.apache.paimon.s3.S3Loader not a subtype
    at java.util.ServiceLoader.fail(ServiceLoader.java:239)
    at java.util.ServiceLoader.access$300(ServiceLoader.java:185)
    at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:376)
    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
    at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at org.apache.paimon.fs.FileIO.discoverLoaders(FileIO.java:288)
    at org.apache.paimon.fs.FileIO.get(FileIO.java:248)
    at org.apache.paimon.catalog.CatalogFactory.createCatalog(CatalogFactory.java:99)
    at org.apache.paimon.catalog.CatalogFactory.createCatalog(CatalogFactory.java:62)
    at org.apache.paimon.flink.action.ActionBase.<init>(ActionBase.java:84)
    at org.apache.paimon.flink.action.ActionBase.<init>(ActionBase.java:78)
    at org.apache.paimon.flink.action.MergeIntoAction.<init>(MergeIntoAction.java:125)
    at org.apache.paimon.flink.action.MergeIntoAction.create(MergeIntoAction.java:231)
    at org.apache.paimon.flink.action.Action$Factory.create(Action.java:137)
    at org.apache.paimon.flink.action.FlinkActions.main(FlinkActions.java:46)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222)
    at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:98)
    at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:846)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:240)
    at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1090)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1168)

What doesn't meet your expectations?

If this is a bug, please fix it.

Anything else?

No response

Are you willing to submit a PR?

JingsongLi commented 1 year ago

The root cause is that when Flink submitted the job, the child classloader held one copy of the code while the parent classloader held another copy of the same code, which caused the S3 classes to conflict.
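The "not a subtype" message can be reproduced in isolation. The following is a minimal sketch, not Paimon code: `Loader`, `S3LikeLoader`, and `IsolatedLoader` are hypothetical stand-ins for a service interface, its provider, and a second classloader that holds its own copy of the same classes. It assumes Java 9 or later and that the compiled class files are readable as resources.

```java
import java.io.IOException;
import java.io.InputStream;

public class ClassLoaderConflictDemo {

    /** Hypothetical stand-in for a service interface such as FileIOLoader. */
    public interface Loader {}

    /** Hypothetical stand-in for a provider such as S3Loader. */
    public static class S3LikeLoader implements Loader {}

    /** Defines its own copy of each class from the bytes another loader can see. */
    static final class IsolatedLoader extends ClassLoader {
        private final ClassLoader source;

        IsolatedLoader(ClassLoader source) {
            super(null); // no parent: only JDK bootstrap classes are shared
            this.source = source;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = source.getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] bytes = in.readAllBytes();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads a second copy of S3LikeLoader (and, transitively, of Loader). */
    public static Class<?> loadSecondCopy() throws ClassNotFoundException {
        ClassLoader appLoader = S3LikeLoader.class.getClassLoader();
        return new IsolatedLoader(appLoader).loadClass(S3LikeLoader.class.getName());
    }

    public static void main(String[] args) throws Exception {
        Class<?> copy = loadSecondCopy();
        // Same fully qualified name, but a different Class object.
        System.out.println("same name:  " + copy.getName().equals(S3LikeLoader.class.getName()));
        System.out.println("same class: " + (copy == S3LikeLoader.class));
        // The copy implements the isolated loader's Loader, not this loader's Loader.
        System.out.println("is subtype: " + Loader.class.isAssignableFrom(copy));
    }
}
```

The two copies share a name but are unrelated types to the JVM, so the subtype check fails. `ServiceLoader` performs the same kind of check on each provider class it loads, which is why a duplicated provider surfaces as `ServiceConfigurationError`.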

For now, we can fix it as follows: create a paimon-flink-action module under paimon-flink and migrate FlinkActions to it, so that running an Action depends only on that jar. This fundamentally solves the classloader problem.
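To check whether a classpath exposes more than one copy of a service registration, one option is to list every `META-INF/services` entry for the interface that a given classloader can see. This is a diagnostic sketch, not part of the proposed fix: the class `ServiceRegistrationScan` and its method are hypothetical, and only the default service name is taken from the stack trace above.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class ServiceRegistrationScan {

    /** Returns every META-INF/services file for the service visible to the loader. */
    public static List<URL> findRegistrations(ClassLoader loader, String serviceName)
            throws IOException {
        return Collections.list(loader.getResources("META-INF/services/" + serviceName));
    }

    public static void main(String[] args) throws IOException {
        // Default taken from the error message; pass another name as the first argument.
        String service = args.length > 0 ? args[0] : "org.apache.paimon.fs.FileIOLoader";

        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }

        List<URL> registrations = findRegistrations(loader, service);
        System.out.println(registrations.size() + " registration(s) for " + service);
        for (URL url : registrations) {
            // Each URL names the jar or directory that contributes a registration.
            System.out.println("  " + url);
        }
    }
}
```

More than one URL for the same service suggests that several jars register providers for it. That alone is not an error, but it shows where to look for duplicated copies of the same code.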

liugddx commented 1 year ago

Assign to me.

JingsongLi commented 1 year ago

Assign to me.

Hi, if you want to take this, you should share some details with the committers first. I will take this.

See https://paimon.apache.org/docs/master/project/contributing/#code-contribution-guide

liugddx commented 1 year ago

Assign to me.

Hi, if you want to take this, you should share some details with the committers first. I will take this.

See https://paimon.apache.org/docs/master/project/contributing/#code-contribution-guide

Got it.