awslabs / mountpoint-s3

A simple, high-throughput file client for mounting an Amazon S3 bucket as a local file system.
Apache License 2.0
4.45k stars 151 forks source link

File read occasionally fails with 'Input/output error (os error 5)' #974

Open andlr opened 1 month ago

andlr commented 1 month ago

Mountpoint for Amazon S3 version

1.7.2

AWS Region

eu-central-1/us-east-2

Describe the running environment

Running in ECS container using Container credential provider

Mountpoint options

mount-s3 <bucket> <mountpoint> --read-only --allow-root --cache=/mount-s3/cache/ --max-cache-size 1024 --log-directory=/mount-s3/logs/ --debug --metadata-ttl=3600 --log-metrics

What happened?

Sometimes file read operations fail with Input/output error (os error 5). In mountpoint-s3 there are logs with these messages:

Async error received from S3 and not recoverable from retry.
The provided token has expired.

It seems to often happen around 15 minutes after a mount, but it's not always the case, sometimes it fails after many hours. After some time, requests stop failing and file reads succeed again (without any umount/mount).

Important note - apart from mountpoint-s3 there's another service which works in the same container, uses the same credential provider and reads the exact same objects directly from S3 using AWS SDK for Rust. And there are no such failures in that service.

Relevant log output

2024-08-02 16:28:08.590 DEBUG awscrt::S3MetaRequest: id=0x7f3bd40433a0: Meta Request Default created request 0x7f3bd41f5550    
2024-08-02 16:28:08.590 DEBUG awscrt::task-scheduler: id=0x7f3bd419b5a0: Scheduling s3_meta_request_prepare_request_task task for immediate execution    
2024-08-02 16:28:08.590 DEBUG awscrt::task-scheduler: id=0x7f3bd419b5a0: Running s3_meta_request_prepare_request_task task with <Running> status    
2024-08-02 16:28:08.590 DEBUG awscrt::S3MetaRequest: id=0x7f3bd40433a0: Meta Request prepared request 0x7f3bd41f5550    
2024-08-02 16:28:08.590 DEBUG awscrt::AuthCredentialsProvider: (id=0x564571e15930) Credentials provider chain get credentials dispatch    
2024-08-02 16:28:08.590 DEBUG awscrt::AuthCredentialsProvider: (id=0x56457205f1f0) Cached credentials provider successfully sourced from cache    
2024-08-02 16:28:08.590 INFO awscrt::AuthCredentialsProvider: (id=0x564571e15930) Default chain credentials provider successfully sourced credentials    
2024-08-02 16:28:08.590 INFO awscrt::AuthSigning: (id=0x7f3bd804b3a0) Signing successfully built canonical request for algorithm SigV4, with contents 
2024-08-02 16:28:08.590 GET
2024-08-02 16:28:08.590 a-01gqsej1mgz3xeer6dhke0002d
2024-08-02 16:28:08.590 
2024-08-02 16:28:08.590 accept:*/*
2024-08-02 16:28:08.590 content-length:0
2024-08-02 16:28:08.590 host:<bucket name>.s3.eu-central-1.amazonaws.com
2024-08-02 16:28:08.590 if-match:"4f6ed553eac444a138c411ce7e9eb7ef"
2024-08-02 16:28:08.590 range:bytes=0-1175
2024-08-02 16:28:08.590 x-amz-content-sha256:UNSIGNED-PAYLOAD
2024-08-02 16:28:08.590 x-amz-date:20240802T132808Z
2024-08-02 16:28:08.590 x-amz-security-token: <...>
2024-08-02 16:28:08.590 
2024-08-02 16:28:08.590 accept;content-length;host;if-match;range;x-amz-content-sha256;x-amz-date;x-amz-security-token
2024-08-02 16:28:08.590 UNSIGNED-PAYLOAD
2024-08-02 16:28:08.590     
2024-08-02 16:28:08.590 INFO awscrt::AuthSigning: (id=0x7f3bd804b3a0) Signing successfully built string-to-sign via algorithm SigV4, with contents 
2024-08-02 16:28:08.590 AWS4-HMAC-SHA256
2024-08-02 16:28:08.590 20240802T132808Z
2024-08-02 16:28:08.590 20240802/eu-central-1/s3/aws4_request
2024-08-02 16:28:08.590 18299144967c909aeec8d6c5816680e3bf5c2ed70bb815a91b53d649f8dde4ad
2024-08-02 16:28:08.590     
2024-08-02 16:28:08.590 INFO awscrt::AuthSigning: (id=0x7f3bd804b3a0) Http request successfully built final authorization value via algorithm SigV4, with contents 
2024-08-02 16:28:08.590 Credential=ASIA5UTHRWQPO4UAS7MA/20240802/eu-central-1/s3/aws4_request, SignedHeaders=accept;content-length;host;if-match;range;x-amz-content-sha256;x-amz-date;x-amz-security-token, Signature=bb3895d27afbb0154908ef580bbe1f3aceaf34368ffe9802fffd4caa798ef133
2024-08-02 16:28:08.590     
2024-08-02 16:28:08.590 DEBUG fuser::request: FUSE(492358) ino 0x000000000000206c READ fh FileHandle(62288), offset 516096, size 20480    
2024-08-02 16:28:08.590 DEBUG awscrt::S3Client: id=0x564571f4d3a0 Updating connections, assigning requests where possible.    
2024-08-02 16:28:08.590 DEBUG awscrt::standard-retry-strategy: id=0x564571f542d0: attempting to acquire retry token for partition_id <bucket name>.s3.eu-central-1.amazonaws.com    
2024-08-02 16:28:08.590 DEBUG awscrt::standard-retry-strategy: id=0x564571f542d0: bucket 0x7f3bd40035a0 for partition_id <bucket name>.s3.eu-central-1.amazonaws.com found    
2024-08-02 16:28:08.590 DEBUG fuser::request: FUSE(492360) ino 0x000000000000206c READ fh FileHandle(62288), offset 540672, size 69632    
2024-08-02 16:28:08.590 DEBUG awscrt::exp-backoff-strategy: id=0x564571f54770: Initializing retry token 0x7f3bd40d2820    
2024-08-02 16:28:08.590 DEBUG awscrt::task-scheduler: id=0x7f3bd40d2890: Scheduling aws_exponential_backoff_retry_task task for immediate execution    
2024-08-02 16:28:08.590 INFO awscrt::S3ClientStats: id=0x564571f4d3a0 Requests-in-flight(approx/exact):3/3  Requests-preparing:1  Requests-queued:0  Requests-network(get/put/default/total):0/0/2/2  Requests-streaming-waiting:0  Requests-streaming-response:0  Endpoints(in-table/allocated):1/1    
2024-08-02 16:28:08.591 DEBUG awscrt::task-scheduler: id=0x7f3bd42216f0: Running s2n_delayed_shutdown task with <Running> status    
2024-08-02 16:28:08.591 DEBUG awscrt::tls-handler: id=0x7f3bd4221520: Delayed shut down in write direction    
2024-08-02 16:28:08.591 DEBUG awscrt::task-scheduler: id=0x7f3bd411b890: Scheduling socket_written_task task for immediate execution    
2024-08-02 16:28:08.591 DEBUG awscrt::channel: id=0x7f3bd4076ef0: handler 0x7f3bd4221520 shutdown in write dir completed.    
2024-08-02 16:28:08.591 DEBUG awscrt::socket: id=0x7f3bd414e580 fd=0: closing    
2024-08-02 16:28:08.591 DEBUG awscrt::task-scheduler: id=0x7f3bd405ab90: Scheduling epoll_event_loop_unsubscribe_cleanup task for immediate execution    
2024-08-02 16:28:08.591 DEBUG awscrt::task-scheduler: id=0x7f3bd411b890: Running socket_written_task task with <Canceled> status    
2024-08-02 16:28:08.591 DEBUG awscrt::task-scheduler: id=0x7f3bd4169990: Scheduling socket_handler_close task for immediate execution    
2024-08-02 16:28:08.591 DEBUG awscrt::task-scheduler: id=0x7f3bd4170d00: Running s2n_delayed_shutdown task with <Running> status    
2024-08-02 16:28:08.591 DEBUG awscrt::tls-handler: id=0x7f3bd4170b30: Delayed shut down in write direction    
2024-08-02 16:28:08.591 DEBUG awscrt::task-scheduler: id=0x7f3bd416fc40: Scheduling socket_written_task task for immediate execution    
2024-08-02 16:28:08.591 DEBUG awscrt::channel: id=0x7f3bd40b43e0: handler 0x7f3bd4170b30 shutdown in write dir completed.    
2024-08-02 16:28:08.591 DEBUG awscrt::socket: id=0x7f3bd4126430 fd=2: closing    
2024-08-02 16:28:08.591 DEBUG awscrt::task-scheduler: id=0x7f3bd40b4870: Scheduling epoll_event_loop_unsubscribe_cleanup task for immediate execution    
2024-08-02 16:28:08.591 DEBUG awscrt::task-scheduler: id=0x7f3bd416fc40: Running socket_written_task task with <Canceled> status    
2024-08-02 16:28:08.591 DEBUG awscrt::task-scheduler: id=0x7f3bd00cd4e0: Scheduling socket_handler_close task for immediate execution    
2024-08-02 16:28:08.591 DEBUG awscrt::task-scheduler: id=0x564571f4d580: Scheduling s3_client_process_work_task task for immediate execution    
2024-08-02 16:28:08.591 DEBUG awscrt::task-scheduler: id=0x7f3b84010fd0: Scheduling FutureTaskWaker_wake_by_ref task for immediate execution    
2024-08-02 16:28:08.591 DEBUG awscrt::task-scheduler: id=0x7f3bd40d2890: Running aws_exponential_backoff_retry_task task with <Running> status    
2024-08-02 16:28:08.591 DEBUG awscrt::exp-backoff-strategy: id=0x564571f54770: Vending retry_token 0x7f3bd40d2820    
2024-08-02 16:28:08.591 DEBUG awscrt::standard-retry-strategy: id=0x564571f542d0: token acquired callback invoked with error Success. with token 0x7f3bd40d8b00 and nested token 0x7f3bd40d2820    
2024-08-02 16:28:08.591 DEBUG awscrt::connection-manager: id=0x7f3bd4259640: Acquire connection    
2024-08-02 16:28:08.591 DEBUG awscrt::connection-manager: id=0x7f3bd4259640: Grabbing pooled connection (0x7f3bd419b030)    
2024-08-02 16:28:08.591 DEBUG awscrt::connection-manager: id=0x7f3bd4259640: snapshot - state=1, idle_connection_count=1, pending_acquire_count=0, pending_settings_count=0, pending_connect_count=0, vended_connection_count=2, open_connection_count=5, ref_count=1    
2024-08-02 16:28:08.591 DEBUG awscrt::connection-manager: id=0x7f3bd4259640: Successfully completed connection acquisition with connection id=0x7f3bd419b030    
2024-08-02 16:28:08.592 DEBUG awscrt::http-stream: id=0x7f3bd40eb3e0: Created client request on connection=0x7f3bd419b030: GET a-01gqsc8zrw6g505jgae8qp4w4g HTTP/1.1    
2024-08-02 16:28:08.592 DEBUG awscrt::task-scheduler: id=0x7f3bd419b128: Scheduling http1_connection_cross_thread_work task for immediate execution    
2024-08-02 16:28:08.592 DEBUG awscrt::task-scheduler: id=0x7f3bd405ab90: Running epoll_event_loop_unsubscribe_cleanup task with <Running> status    
2024-08-02 16:28:08.592 DEBUG awscrt::task-scheduler: id=0x7f3bd4169990: Running socket_handler_close task with <Running> status    
2024-08-02 16:28:08.592 DEBUG awscrt::channel: id=0x7f3bd4076ef0: handler 0x7f3bd41698f0 shutdown in write dir completed.    
2024-08-02 16:28:08.592 DEBUG awscrt::task-scheduler: id=0x7f3bd4076f18: Scheduling (null) task for immediate execution    
2024-08-02 16:28:08.592 DEBUG awscrt::task-scheduler: id=0x7f3bd40b4870: Running epoll_event_loop_unsubscribe_cleanup task with <Running> status    
2024-08-02 16:28:08.592 DEBUG awscrt::task-scheduler: id=0x7f3bd00cd4e0: Running socket_handler_close task with <Running> status    
2024-08-02 16:28:08.592 DEBUG awscrt::channel: id=0x7f3bd40b43e0: handler 0x7f3bd00cd440 shutdown in write dir completed.    
2024-08-02 16:28:08.592 DEBUG awscrt::task-scheduler: id=0x7f3bd40b4408: Scheduling (null) task for immediate execution    
2024-08-02 16:28:08.592 DEBUG awscrt::task-scheduler: id=0x564571f4d580: Running s3_client_process_work_task task with <Running> status    
2024-08-02 16:28:08.592 DEBUG awscrt::S3Client: id=0x564571f4d3a0 s_s3_client_process_work_default - Moving relevant synced_data into threaded_data.    
2024-08-02 16:28:08.592 DEBUG awscrt::S3Client: id=0x564571f4d3a0 s_s3_client_process_work_default - Processing any new meta requests.    
2024-08-02 16:28:08.592 DEBUG awscrt::S3Client: id=0x564571f4d3a0 Updating meta requests.    
2024-08-02 16:28:08.592 DEBUG awscrt::S3Client: id=0x564571f4d3a0 Updating connections, assigning requests where possible.    
2024-08-02 16:28:08.592 DEBUG awscrt::standard-retry-strategy: id=0x564571f542d0: attempting to acquire retry token for partition_id <bucket name>.s3.eu-central-1.amazonaws.com    
2024-08-02 16:28:08.592 DEBUG awscrt::standard-retry-strategy: id=0x564571f542d0: bucket 0x7f3bd40035a0 for partition_id <bucket name>.s3.eu-central-1.amazonaws.com found    
2024-08-02 16:28:08.592 DEBUG awscrt::exp-backoff-strategy: id=0x564571f54770: Initializing retry token 0x7f3bd426d640    
2024-08-02 16:28:08.592 DEBUG awscrt::task-scheduler: id=0x7f3bd426d6b0: Scheduling aws_exponential_backoff_retry_task task for immediate execution    
2024-08-02 16:28:08.592 DEBUG awscrt::task-scheduler: id=0x7f3bd426d6b0: Running aws_exponential_backoff_retry_task task with <Running> status    
2024-08-02 16:28:08.592 DEBUG awscrt::exp-backoff-strategy: id=0x564571f54770: Vending retry_token 0x7f3bd426d640    
2024-08-02 16:28:08.592 DEBUG awscrt::standard-retry-strategy: id=0x564571f542d0: token acquired callback invoked with error Success. with token 0x7f3bd40d1a90 and nested token 0x7f3bd426d640    
2024-08-02 16:28:08.593 DEBUG awscrt::connection-manager: id=0x7f3bd4259640: Acquire connection    
2024-08-02 16:28:08.593 DEBUG awscrt::connection-manager: id=0x7f3bd4259640: Grabbing pooled connection (0x7f3bd41ed6f0)    
2024-08-02 16:28:08.593 DEBUG awscrt::connection-manager: id=0x7f3bd4259640: snapshot - state=1, idle_connection_count=0, pending_acquire_count=0, pending_settings_count=0, pending_connect_count=0, vended_connection_count=3, open_connection_count=5, ref_count=1    
2024-08-02 16:28:08.593 DEBUG awscrt::task-scheduler: id=0x7f3bcc0023b0: Running (null) task with <Running> status    
2024-08-02 16:28:08.593 INFO awscrt::S3ClientStats: id=0x564571f4d3a0 Requests-in-flight(approx/exact):3/3  Requests-preparing:0  Requests-queued:0  Requests-network(get/put/default/total):0/0/3/3  Requests-streaming-waiting:0  Requests-streaming-response:0  Endpoints(in-table/allocated):1/1    
2024-08-02 16:28:08.593 DEBUG awscrt::task-scheduler: id=0x7f3b84010fd0: Running FutureTaskWaker_wake_by_ref task with <Running> status    
2024-08-02 16:28:08.594 DEBUG awscrt::task-scheduler: id=0x7f3bd412c420: Running (null) task with <Running> status    
2024-08-02 16:28:08.594 DEBUG awscrt::task-scheduler: id=0x7f3bd40f9fd0: Scheduling schedule_cross_thread_tasks task for immediate execution    
2024-08-02 16:28:08.594 DEBUG awscrt::task-scheduler: id=0x7f3bd419b128: Running http1_connection_cross_thread_work task with <Running> status    
2024-08-02 16:28:08.594 DEBUG awscrt::task-scheduler: id=0x7f3bd4062cd0: Scheduling socket_written_task task for immediate execution    
2024-08-02 16:28:08.594 DEBUG awscrt::task-scheduler: id=0x7f3bd4076f18: Running (null) task with <Running> status    
2024-08-02 16:28:08.594 DEBUG awscrt::channel: id=0x7f3bd4076ef0: during shutdown, canceling task 0x7f3bd4221550    
2024-08-02 16:28:08.594 DEBUG awscrt::task-scheduler: id=0x7f3bd4221550: Running tls_timeout task with <Canceled> status    
2024-08-02 16:28:08.594 DEBUG awscrt::channel-bootstrap: id=0x564571f4a290: channel 0x7f3bd4076ef0 shutdown with error 0.    
2024-08-02 16:28:08.594 INFO awscrt::http-connection: 0x7f3bd41e4f30: Client shutdown completed with error 0 (AWS_ERROR_SUCCESS).    
2024-08-02 16:28:08.594 DEBUG awscrt::connection-manager: id=0x7f3bd4259640: shutdown received for connection (id=0x7f3bd41e4f30)    
2024-08-02 16:28:08.594 DEBUG awscrt::connection-manager: id=0x7f3bd4259640: snapshot - state=1, idle_connection_count=0, pending_acquire_count=0, pending_settings_count=0, pending_connect_count=0, vended_connection_count=3, open_connection_count=4, ref_count=1    
2024-08-02 16:28:08.594 DEBUG awscrt::channel: id=0x7f3bd4076ef0: destroying channel.    
2024-08-02 16:28:08.594 DEBUG awscrt::task-scheduler: id=0x7f3bd4076fc8: Running gather_statistics task with <Canceled> status    
2024-08-02 16:28:08.594 DEBUG awscrt::channel-bootstrap: id=0x564571f4a290: releasing bootstrap reference    
2024-08-02 16:28:08.594 DEBUG awscrt::task-scheduler: id=0x7f3bd40b4408: Running (null) task with <Running> status    
2024-08-02 16:28:08.594 DEBUG awscrt::channel: id=0x7f3bd40b43e0: during shutdown, canceling task 0x7f3bd4170b60    
2024-08-02 16:28:08.594 DEBUG awscrt::task-scheduler: id=0x7f3bd4170b60: Running tls_timeout task with <Canceled> status    
2024-08-02 16:28:08.594 DEBUG awscrt::channel-bootstrap: id=0x564571f4a290: channel 0x7f3bd40b43e0 shutdown with error 0.    
2024-08-02 16:28:08.594 INFO awscrt::http-connection: 0x7f3bd40d72f0: Client shutdown completed with error 0 (AWS_ERROR_SUCCESS).    
2024-08-02 16:28:08.594 DEBUG awscrt::connection-manager: id=0x7f3bd4259640: shutdown received for connection (id=0x7f3bd40d72f0)    
2024-08-02 16:28:08.594 DEBUG awscrt::connection-manager: id=0x7f3bd4259640: snapshot - state=1, idle_connection_count=0, pending_acquire_count=0, pending_settings_count=0, pending_connect_count=0, vended_connection_count=3, open_connection_count=3, ref_count=1    
2024-08-02 16:28:08.594 DEBUG awscrt::channel: id=0x7f3bd40b43e0: destroying channel.    
2024-08-02 16:28:08.594 DEBUG awscrt::task-scheduler: id=0x7f3bd40b44b8: Running gather_statistics task with <Canceled> status    
2024-08-02 16:28:08.594 DEBUG awscrt::channel-bootstrap: id=0x564571f4a290: releasing bootstrap reference    
2024-08-02 16:28:08.594 DEBUG awscrt::task-scheduler: id=0x7f3bd40f9fd0: Running schedule_cross_thread_tasks task with <Running> status    
2024-08-02 16:28:08.594 DEBUG awscrt::connection-manager: id=0x7f3bd4259640: Successfully completed connection acquisition with connection id=0x7f3bd41ed6f0    
2024-08-02 16:28:08.595 DEBUG awscrt::http-stream: id=0x7f3bd40e90d0: Created client request on connection=0x7f3bd41ed6f0: GET a-01gqsej1mgz3xeer6dhke0002d HTTP/1.1    
2024-08-02 16:28:08.595 DEBUG awscrt::task-scheduler: id=0x7f3bd41ed7e8: Scheduling http1_connection_cross_thread_work task for immediate execution    
2024-08-02 16:28:08.595 DEBUG awscrt::task-scheduler: id=0x7f3bd4062cd0: Running socket_written_task task with <Running> status    
2024-08-02 16:28:08.595 DEBUG awscrt::task-scheduler: id=0x7f3bd419b0c0: Scheduling http1_connection_outgoing_stream task for immediate execution    
2024-08-02 16:28:08.595 DEBUG awscrt::task-scheduler: id=0x7f3bd41ed7e8: Running http1_connection_cross_thread_work task with <Running> status    
2024-08-02 16:28:08.595 DEBUG awscrt::task-scheduler: id=0x7f3bd40d8e30: Scheduling socket_written_task task for immediate execution    
2024-08-02 16:28:08.595 DEBUG awscrt::task-scheduler: id=0x7f3bd419b0c0: Running http1_connection_outgoing_stream task with <Running> status    
2024-08-02 16:28:08.595 DEBUG awscrt::task-scheduler: id=0x7f3bd40d8e30: Running socket_written_task task with <Running> status    
2024-08-02 16:28:08.595 DEBUG awscrt::task-scheduler: id=0x7f3bd41ed780: Scheduling http1_connection_outgoing_stream task for immediate execution    
2024-08-02 16:28:08.595 DEBUG awscrt::task-scheduler: id=0x7f3bd41ed780: Running http1_connection_outgoing_stream task with <Running> status    
2024-08-02 16:28:08.597 DEBUG fuser::request: FUSE(492362) ino 0x000000000000206c READ fh FileHandle(62288), offset 610304, size 36864    
2024-08-02 16:28:08.611 DEBUG fuser::request: FUSE(492364) ino 0x000000000000206c READ fh FileHandle(62288), offset 647168, size 131072    
2024-08-02 16:28:08.613 DEBUG awscrt::http-stream: id=0x7f3bd40e90d0: Client request complete, response status: 206 (Partial Content).    
2024-08-02 16:28:08.613 DEBUG awscrt::S3MetaRequest: id=0x7f3bd40433a0: Request 0x7f3bd41f5550 finished with error code 14357 (aws-c-s3: AWS_ERROR_S3_NON_RECOVERABLE_ASYNC_ERROR, Async error received from S3 and not recoverable from retry.) and response status 206    
2024-08-02 16:28:08.613 ERROR awscrt::S3MetaRequest: id=0x7f3bd40433a0 Meta request cannot recover from error 14357 (Async error received from S3 and not recoverable from retry.). (request=0x7f3bd41f5550, response status=206)
unexge commented 1 month ago

Thanks for reporting the issue, @andlr! We'll investigate it. Meanwhile, could you please add --debug-crt and share logs with us? This will give us more clue on what's going on, see Troubleshooting > Credential Errors for more details.

andlr commented 1 month ago

The logs I've provided above were already with --debug-crt Here's a larger portion of logs, if that will help mount_s3_logs.txt

unexge commented 1 month ago

Thanks for the detailed logs @andlr! We'll be looking into the issue