
FileNotFoundException can occur in some scenarios. (data file & HADOOP CATALOG) #9327

Open BsoBird opened 9 months ago

BsoBird commented 9 months ago

Apache Iceberg version

1.4.2 (latest release)

Query engine

Spark

Please describe the bug 🐞

SPARK 3.4.1.

Caused by: java.io.FileNotFoundException: File does not exist: /iceberg-catalog/warehouse/dwd/b_std_category/data/00012-4526569-b66acfb2-bea0-46af-a6c8-01d9d1731b35-00001.orc

We found that, in some cases, an Iceberg table using a Hadoop catalog can hit a FileNotFoundException. The sequence of events is as follows:

  1. Every day at 1:00 a.m. we run a MERGE INTO operation against 8 tables. After the operation completes, the following three procedures are executed:

    CALL xxx.system.rewrite_manifests('dwd.b_std_category', false);
    CALL xxx.system.remove_orphan_files(table => 'dwd.b_std_category');
    CALL xxx.system.expire_snapshots(table => 'dwd.b_std_category', retain_last => 10);
  2. Today, when executing the MERGE operation, an OOM occurred, causing the container to be killed.

    Job aborted due to stage failure: Authorized committer (attemptNumber=0, stage=415011, partition=1392) failed; but task commit success, data duplication may happen. reason=ExecutorLostFailure(3759,true,Some(Container killed by YARN for exceeding physical memory limits. 40.1 GB of 40 GB physical memory used. Consider boosting spark.executor.memoryOverhead.))
  3. At the time of the OOM, the dwd.b_std_category table was executing this command.

    CALL xxx.system.expire_snapshots(table => 'dwd.b_std_category', retain_last => 10);
  4. When we resumed the Spark job, we found that the dwd.b_std_category table could no longer be read (see the diagnostic sketch below).
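
A quick way to confirm whether the missing file is still referenced by the current table metadata is to query the Iceberg files metadata table. This is only a diagnostic sketch: the catalog name xxx is taken from the CALL examples above, and the file name is the one from the stack trace.

    // Diagnostic sketch (Scala / spark-shell). If the missing file shows up
    // here, the current snapshot still references it and reads will fail.
    spark.sql(
      """SELECT file_path, record_count
        |FROM xxx.dwd.b_std_category.files
        |WHERE file_path LIKE '%00012-4526569-b66acfb2-bea0-46af-a6c8-01d9d1731b35-00001.orc'""".stripMargin
    ).show(false)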

BsoBird commented 9 months ago

@RussellSpitzer @nastra can you help me?

BsoBird commented 9 months ago

I also ran into another situation: a table that has never had any CALL procedure executed on it. All I did was run a MERGE INTO once a day, but after an OOM (write interrupt) the same thing happened to that table. It was long enough ago that I did not keep a log of that scenario.

BsoBird commented 9 months ago

@chennurchaitanya Please check whether this issue is similar to yours. We can discuss it together here.

BsoBird commented 9 months ago

Slack link: https://apache-iceberg.slack.com/archives/C025PH0G1D4/p1700291064787019

RussellSpitzer commented 9 months ago

When I see this sort of thing, it's usually one of two issues:

  1. The user has accidentally run some command that deletes files without going through Iceberg, so a snapshot refers to a file that no longer exists. Since the deletion was accidental, the reference to the missing file must be removed via a metadata delete to get the table back to a healthy state (see the sketch after this list).

    Common reasons for this

    • The user has multiple tables homed in the same directory, so remove orphan files for one deletes files belonging to the other. This is generally unrecoverable, and you'll notice that many files which should exist according to Iceberg metadata do not.
    • User error, manually deleting a file.
    • Third party ttl system
  2. The query is reading a snapshot that has been expired but not yet removed from the cache, or the query was started and then the snapshot was expired. In this case you just refresh the cache and the query will work.
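
For the first case, here is a minimal sketch of the kind of metadata delete described in point 1, using the Iceberg API from a Scala shell. It assumes a Hadoop-catalog table located at the warehouse path visible in the stack trace; adjust the table location and file path to your own setup before running anything like this.

    import org.apache.hadoop.conf.Configuration
    import org.apache.iceberg.hadoop.HadoopTables

    // Load the Hadoop-catalog table directly from its warehouse location
    // (location assumed from the path in the stack trace above).
    val table = new HadoopTables(new Configuration())
      .load("/iceberg-catalog/warehouse/dwd/b_std_category")

    // Commit a new snapshot that no longer references the missing data file.
    // This does not touch any files on disk; it only rewrites metadata.
    table.newDelete()
      .deleteFile("/iceberg-catalog/warehouse/dwd/b_std_category/data/" +
        "00012-4526569-b66acfb2-bea0-46af-a6c8-01d9d1731b35-00001.orc")
      .commit()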

BsoBird commented 9 months ago

@RussellSpitzer

--User has multiple tables homed in the same directory, remove orphan files for one deletes files for the other. No, our catalogs and tables map one-to-one.

--User error, manually deleting a file. No. No one but me can touch Iceberg's data files, and I only run the CALL commands every day.

--Third party ttl system. No, we don't have one.

--The query is running a snapshot which has been expired but not yet removed from cache, or the query was started and then the snapshot was expired. In this case you just refresh the cache and the query will work. How do I flush the cache? I've already restarted the Spark job, so an in-memory cache shouldn't be the problem, and I do still have the snapshot file.
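
For reference, a sketch of how the Spark-side cache for the table can be refreshed (the catalog name xxx is taken from the CALL examples above):

    // Drop any cached metadata/plan for the table and force a re-read of the
    // latest table metadata on the next query.
    spark.sql("REFRESH TABLE xxx.dwd.b_std_category")

    // Note: the Iceberg SparkCatalog also keeps its own table cache; it is
    // controlled by the catalog options cache-enabled and
    // cache.expiration-interval-ms where the catalog is configured.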

BsoBird commented 9 months ago

I will submit a PR to fix this.

Zhangg7723 commented 9 months ago

Was the MERGE operation committed successfully? If it was, and the committed files were then removed by the remove_orphan_files job, the latest snapshot would refer to lost files.
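
If that overlap between a write and remove_orphan_files is the concern, a common mitigation (a sketch, not something done in this thread) is to only remove orphans older than a safety window, so files from an in-flight or just-committed write are never candidates; the timestamp below is purely illustrative.

    // Only treat files last modified before the given timestamp as orphan
    // candidates, keeping anything written by recent or in-flight commits.
    spark.sql(
      """CALL xxx.system.remove_orphan_files(
        |  table => 'dwd.b_std_category',
        |  older_than => TIMESTAMP '2023-12-01 00:00:00')""".stripMargin)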

BsoBird commented 9 months ago

@Zhangg7723 Our data table does not have concurrent operations. I am very certain of this.

Zhangg7723 commented 9 months ago

Step 3 says that at the time of the OOM, the dwd.b_std_category table was executing this command; that is itself a concurrent operation on the table.

BsoBird commented 9 months ago

Does that cause any problems?