chennurchaitanya opened this issue 1 year ago
Could you please share the full stack trace from when the error happens and the action that you're running when it happens? Also, an overview of the xyz.files output would be helpful.
Is there a particular reason that you're using Hadoop's S3AFileSystem? You could switch to using S3FileIO when using MinIO. Is it possible that s3a://XXXXXXXXXX/data/event_ts_day=1970-01-01/00007-167930-e397f970-eb12-41ad-9e10-e855c8fd6e53-00001.parquet got deleted in the meantime by another job? Also, can you share the output of xyz.files and your catalog configuration, just in case?
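Switching to S3FileIO is done through the catalog configuration. A minimal sketch follows, assuming a catalog named "demo", a warehouse bucket, and a MinIO endpoint — all of these names and addresses are placeholders to adapt to your deployment:

```shell
# Hypothetical catalog name "demo", bucket, and MinIO endpoint; adjust to your setup.
spark-shell \
  --conf spark.sql.catalog.demo=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.demo.type=hadoop \
  --conf spark.sql.catalog.demo.warehouse=s3://my-bucket/warehouse \
  --conf spark.sql.catalog.demo.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
  --conf spark.sql.catalog.demo.s3.endpoint=http://minio:9000 \
  --conf spark.sql.catalog.demo.s3.path-style-access=true
```

Path-style access is typically required for MinIO, since it does not use virtual-hosted bucket addressing by default.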
Sorry it's a bit difficult to tell exactly what's going on without having full access to the logs and knowing what actions were happening at that time
I am facing something similar. One of the files got deleted (still trying to figure out how). Is there a way to fix this? I tried rewriting manifest files and data files, but nothing seems to work.
Apache Iceberg version
1.1.0
Query engine
Spark
Please describe the bug 🐞
My job had been running fine for a long time, and today we got this exception: "Caused by: java.io.FileNotFoundException: No such file or directory", while accessing data from an Iceberg table using the code snippet below.
val df = spark.read.format("iceberg").option("start-snapshot-id", start_snapshot).option("end-snapshot-id", end_snapshot).load("mytablename")
We are using MinIO as our storage backend. We have a file_path entry in mytable.files, but the physical file is not present. As per my understanding, Iceberg has strong write consistency: a snapshot will not be created unless all files have been written to the storage backend.
We tried running orphan file removal, but it doesn't help.
Can one of the Iceberg SMEs let us know why this would happen and how to resolve it?
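Whatever the root cause, a first diagnostic step is the cross-check suggested above: compare the file_path values reported by the table's files metadata against what the object store actually contains. The sketch below assumes you have already exported both lists (e.g. the mytable.files output and a bucket listing); the function and the sample paths are illustrative, not part of the Iceberg API:

```python
def find_missing_files(metadata_paths, stored_paths):
    """Return file_path entries recorded in table metadata but absent
    from the object store listing."""
    def strip_scheme(p):
        # Normalize scheme prefixes (s3a:// vs s3://) before comparing,
        # since Spark and MinIO tooling may report different schemes.
        return p.split("://", 1)[-1]

    stored = {strip_scheme(p) for p in stored_paths}
    return sorted(p for p in metadata_paths if strip_scheme(p) not in stored)

# Example: one metadata entry has no matching object in the store.
metadata = ["s3a://bucket/data/a.parquet", "s3a://bucket/data/b.parquet"]
stored = ["s3://bucket/data/a.parquet"]
print(find_missing_files(metadata, stored))  # ['s3a://bucket/data/b.parquet']
```

Any paths this reports are data files that a snapshot still references but that no longer exist, which matches the FileNotFoundException above; that narrows the question to what deleted them (another job, an aggressive cleanup, or a misconfigured lifecycle policy).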