Closed luohaifang closed 10 months ago
It seems to be an Iceberg problem.
@luohaifang Thanks a lot for trying Amoro and bringing this feedback.
Amoro 0.4.1 is a really old version for now. In version 0.4.1, there are still many limitations in the implementation of Amoro. For example, the Mixed Iceberg Format can only be built based on the Iceberg Hadoop Catalog, which is not suitable for use on S3 storage because rename operations cannot be used on S3 to ensure atomic commits. Based on the information you provided, I can't be entirely certain that this is due to using the Iceberg Hadoop Catalog on S3, but I suspect it is highly related.
I suggest you upgrade Amoro to the latest version: 0.6.0 or master, and test this scenario again. In version 0.6.0, Amoro now supports building Mixed Iceberg Format on all Iceberg Catalogs.
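For reference, registering a Mixed Iceberg (Arctic) catalog from Flink SQL usually looks like the sketch below. The AMS host, port, and catalog name are placeholders, and the exact property names should be verified against the docs for your version:

```sql
-- Hypothetical names; check the Amoro/Arctic docs for your version.
CREATE CATALOG arctic_catalog WITH (
  'type' = 'arctic',
  'metastore.url' = 'thrift://<ams-host>:1260/<catalog-name>'
);
```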
Why do I get "Not support Metastore [ams], Storage Type [S3], Table Format [MIXED_ICEBERG]"? Does "Internal Catalog" mean "Amoro Metastore"? There seems to be no difference from version 0.4.1.
If the catalog type is "Internal Catalog" and the storage type is Hadoop, I suspect this problem will still occur. My core-site.xml is:
<configuration>
  <property>
    <name>fs.s3a.connection.ssl.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>fs.s3a.endpoint</name>
    <value>http://xxx:9000</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>admin</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>admin</value>
  </property>
</configuration>
If you plan to use the Internal Catalog to employ Mixed Format on S3, I'm afraid you can only take the code from the master branch and build it manually, as this feature was added by #2157 on the master branch but is not included in the 0.6.x releases.
BTW, you may need to use the s3 protocol directly if you choose S3 storage.
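To illustrate the difference, the same warehouse location under the two URI schemes (the bucket name and path here are made up):

```
s3a://demo-bucket/warehouse   <- Hadoop S3A connector (hadoop-aws)
s3://demo-bucket/warehouse    <- native S3 scheme
```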
After I upgraded to version 0.6.0, this problem no longer occurs. The catalog type used is "Internal Catalog" and the table format is "iceberg".
What happened?
hi~ Environmental configuration: Flink 1.15.4, store: MinIO, AMS 0.4.1, table format: Mixed Iceberg, metastore: Arctic metastore
My operations: read an Oracle table and write it to an Arctic lake table. The Oracle table has about 140,000,000 rows; after filtering by time, about 10,000,000 rows are written to Arctic.
When about 3/4 of the data has been written to Arctic, I get this error:
Same here: our Oracle table has about 400,000,000 rows, and the problem also occurs after about 300,000,000 rows have been written.
The following are my execution statements. Spark SQL, to create the Arctic table:
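The reporter's actual DDL was not included in the thread. A minimal sketch of what creating a Mixed Iceberg (Arctic) table via Spark SQL might look like, with all table and column names hypothetical:

```sql
-- Hypothetical names; not the reporter's actual statement.
CREATE TABLE arctic_catalog.db.target_table (
  id BIGINT,
  name STRING,
  update_time TIMESTAMP,
  PRIMARY KEY (id)
) USING arctic;
```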
Flink SQL, to write to Arctic:
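The write statement was likewise not included; a hedged sketch of what such a Flink SQL write might look like, assuming a JDBC-backed source table named oracle_source and a time filter (all names and the timestamp are made up):

```sql
-- Hypothetical names and filter; not the reporter's actual statement.
INSERT INTO arctic_catalog.db.target_table
SELECT id, name, update_time
FROM oracle_source
WHERE update_time >= TIMESTAMP '2023-01-01 00:00:00';
```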
Excuse me, what is causing this problem?
Affects Versions
0.4.1
What engines are you seeing the problem on?
No response
How to reproduce
No response
Relevant log output
No response
Anything else
No response
Are you willing to submit a PR?
Code of Conduct