kamal-rahimi opened this issue 10 months ago
Hi @kamal-rahimi, thanks for reaching out. Could you provide debug logs of this behavior? You can get debug logs by adding --debug to your command; please redact any sensitive information. Thanks!
Hi @RyanFitzSimmonsAK, here is part of the output when using --debug:
[DEBUG] [2023-12-26T22:11:56Z] [00007f96067fc700] [task-scheduler] - id=0x7f96ac000d80: Running epoll_event_loop_unsubscribe_cleanup task with <Canceled> status
[INFO] [2023-12-26T22:11:56Z] [00007f96067fc700] [event-loop] - id=0x2b0d230: Destroying event_loop
[INFO] [2023-12-26T22:11:56Z] [00007f96067fc700] [event-loop] - id=0x2b0d230: Stopping event-loop thread.
[DEBUG] [2023-12-26T22:11:56Z] [00007f96c2ffd700] [task-scheduler] - id=0x2b0e2b8: Scheduling epoll_event_loop_stop task for immediate execution
[DEBUG] [2023-12-26T22:11:56Z] [00007f96c2ffd700] [task-scheduler] - id=0x2b0e2b8: Running epoll_event_loop_stop task with <Running> status
[DEBUG] [2023-12-26T22:11:56Z] [00007f96c2ffd700] [event-loop] - id=0x2b0d230: exiting main loop
[DEBUG] [2023-12-26T22:11:56Z] [00007f96c2ffd700] [task-scheduler] - id=0x7f96b8000d80: Scheduling epoll_event_loop_unsubscribe_cleanup task for immediate execution
[DEBUG] [2023-12-26T22:11:56Z] [00007f96067fc700] [task-scheduler] - id=0x7f96b8000d80: Running epoll_event_loop_unsubscribe_cleanup task with <Canceled> status
[INFO] [2023-12-26T22:11:56Z] [00007f96067fc700] [event-loop] - id=0x2b0ca80: Destroying event_loop
[INFO] [2023-12-26T22:11:56Z] [00007f96067fc700] [event-loop] - id=0x2b0ca80: Stopping event-loop thread.
[DEBUG] [2023-12-26T22:11:56Z] [00007f96c37fe700] [task-scheduler] - id=0x2b0dda8: Scheduling epoll_event_loop_stop task for immediate execution
[DEBUG] [2023-12-26T22:11:56Z] [00007f96c37fe700] [task-scheduler] - id=0x2b0dda8: Running epoll_event_loop_stop task with <Running> status
[DEBUG] [2023-12-26T22:11:56Z] [00007f96c37fe700] [event-loop] - id=0x2b0ca80: exiting main loop
[DEBUG] [2023-12-26T22:11:56Z] [00007f96c37fe700] [task-scheduler] - id=0x7f96b4000f50: Scheduling epoll_event_loop_unsubscribe_cleanup task for immediate execution
[DEBUG] [2023-12-26T22:11:56Z] [00007f96067fc700] [task-scheduler] - id=0x7f96b4000f50: Running epoll_event_loop_unsubscribe_cleanup task with <Canceled> status
[INFO] [2023-12-26T22:11:56Z] [00007f96067fc700] [event-loop] - id=0x2b0c8f0: Destroying event_loop
[INFO] [2023-12-26T22:11:56Z] [00007f96067fc700] [event-loop] - id=0x2b0c8f0: Stopping event-loop thread.
[DEBUG] [2023-12-26T22:11:56Z] [00007f96c3fff700] [task-scheduler] - id=0x2b0d898: Scheduling epoll_event_loop_stop task for immediate execution
[DEBUG] [2023-12-26T22:11:56Z] [00007f96c3fff700] [task-scheduler] - id=0x2b0d898: Running epoll_event_loop_stop task with <Running> status
[DEBUG] [2023-12-26T22:11:56Z] [00007f96c3fff700] [event-loop] - id=0x2b0c8f0: exiting main loop
[DEBUG] [2023-12-26T22:11:56Z] [00007f96c3fff700] [task-scheduler] - id=0x7f96bc000f50: Scheduling epoll_event_loop_unsubscribe_cleanup task for immediate execution
[DEBUG] [2023-12-26T22:11:56Z] [00007f96067fc700] [task-scheduler] - id=0x7f96bc000f50: Running epoll_event_loop_unsubscribe_cleanup task with <Canceled> status
[INFO] [2023-12-26T22:11:56Z] [00007f96067fc700] [event-loop] - id=0x2b0c1c0: Destroying event_loop
[INFO] [2023-12-26T22:11:56Z] [00007f96067fc700] [event-loop] - id=0x2b0c1c0: Stopping event-loop thread.
[DEBUG] [2023-12-26T22:11:56Z] [00007f96e0ff9700] [task-scheduler] - id=0x2b0cff8: Scheduling epoll_event_loop_stop task for immediate execution
[DEBUG] [2023-12-26T22:11:56Z] [00007f96e0ff9700] [task-scheduler] - id=0x2b0cff8: Running epoll_event_loop_stop task with <Running> status
[DEBUG] [2023-12-26T22:11:56Z] [00007f96e0ff9700] [event-loop] - id=0x2b0c1c0: exiting main loop
[DEBUG] [2023-12-26T22:11:56Z] [00007f96e0ff9700] [task-scheduler] - id=0x7f96c8000e60: Scheduling epoll_event_loop_unsubscribe_cleanup task for immediate execution
[DEBUG] [2023-12-26T22:11:56Z] [00007f96067fc700] [task-scheduler] - id=0x7f96c8000e60: Running epoll_event_loop_unsubscribe_cleanup task with <Canceled> status
[INFO] [2023-12-26T22:11:56Z] [00007f96067fc700] [event-loop] - id=0x2b0c030: Destroying event_loop
[INFO] [2023-12-26T22:11:56Z] [00007f96067fc700] [event-loop] - id=0x2b0c030: Stopping event-loop thread.
[DEBUG] [2023-12-26T22:11:56Z] [00007f96e17fa700] [task-scheduler] - id=0x2b0c758: Scheduling epoll_event_loop_stop task for immediate execution
[DEBUG] [2023-12-26T22:11:56Z] [00007f96e17fa700] [task-scheduler] - id=0x2b0c758: Running epoll_event_loop_stop task with <Running> status
[DEBUG] [2023-12-26T22:11:56Z] [00007f96e17fa700] [event-loop] - id=0x2b0c030: exiting main loop
[DEBUG] [2023-12-26T22:11:56Z] [00007f96e17fa700] [task-scheduler] - id=0x7f96c4000f50: Scheduling epoll_event_loop_unsubscribe_cleanup task for immediate execution
[DEBUG] [2023-12-26T22:11:56Z] [00007f96067fc700] [task-scheduler] - id=0x7f96c4000f50: Running epoll_event_loop_unsubscribe_cleanup task with <Canceled> status
[INFO] [2023-12-26T22:11:56Z] [00007f96067fc700] [event-loop] - id=0x2b0b8a0: Destroying event_loop
[INFO] [2023-12-26T22:11:56Z] [00007f96067fc700] [event-loop] - id=0x2b0b8a0: Stopping event-loop thread.
[DEBUG] [2023-12-26T22:11:56Z] [00007f96e1ffb700] [task-scheduler] - id=0x2b0be98: Scheduling epoll_event_loop_stop task for immediate execution
[DEBUG] [2023-12-26T22:11:56Z] [00007f96e1ffb700] [task-scheduler] - id=0x2b0be98: Running epoll_event_loop_stop task with <Running> status
[DEBUG] [2023-12-26T22:11:56Z] [00007f96e1ffb700] [event-loop] - id=0x2b0b8a0: exiting main loop
[DEBUG] [2023-12-26T22:11:56Z] [00007f96e1ffb700] [task-scheduler] - id=0x7f96d0000f50: Scheduling epoll_event_loop_unsubscribe_cleanup task for immediate execution
[DEBUG] [2023-12-26T22:11:56Z] [00007f96067fc700] [task-scheduler] - id=0x7f96d0000f50: Running epoll_event_loop_unsubscribe_cleanup task with <Canceled> status
[DEBUG] [2023-12-26T22:11:56Z] [00007f96067fc700] [S3Client] - id=0x2b08f80 Client body streaming ELG shutdown.
[DEBUG] [2023-12-26T22:11:56Z] [00007f9781ffb700] [task-scheduler] - id=0x2b09160: Scheduling s3_client_process_work_task task for immediate execution
[DEBUG] [2023-12-26T22:11:56Z] [00007f9781ffb700] [task-scheduler] - id=0x2b09160: Running s3_client_process_work_task task with <Running> status
[DEBUG] [2023-12-26T22:11:56Z] [00007f9781ffb700] [S3Client] - id=0x2b08f80 s_s3_client_process_work_default - Moving relevant synced_data into threaded_data.
[DEBUG] [2023-12-26T22:11:56Z] [00007f9781ffb700] [S3Client] - id=0x2b08f80 s_s3_client_process_work_default - Processing any new meta requests.
[DEBUG] [2023-12-26T22:11:56Z] [00007f9781ffb700] [S3Client] - id=0x2b08f80 Updating meta requests.
[DEBUG] [2023-12-26T22:11:56Z] [00007f9781ffb700] [S3Client] - id=0x2b08f80 Updating connections, assigning requests where possible.
[INFO] [2023-12-26T22:11:56Z] [00007f9781ffb700] [S3ClientStats] - id=0x2b08f80 Requests-in-flight(approx/exact):0/0 Requests-preparing:0 Requests-queued:0 Requests-network(get/put/default/total):0/0/0/0 Requests-streaming-waiting:0 Requests-streaming-response:0 Endpoints(in-table/allocated):0/0
[DEBUG] [2023-12-26T22:11:56Z] [00007f9781ffb700] [S3Client] - id=0x2b08f80 Client shutdown progress: starting_destroy_executing=0 body_streaming_elg_allocated=0 process_work_task_scheduled=0 process_work_task_in_progress=0 num_endpoints_allocated=0 s3express_provider_active=0 finish_destroy=1
[DEBUG] [2023-12-26T22:11:56Z] [00007f9781ffb700] [S3Client] - id=0x2b08f80 Client finishing destruction.
[DEBUG] [2023-12-26T22:11:56Z] [00007f9781ffb700] [channel-bootstrap] - id=0x2956810: releasing bootstrap reference
The full log includes tokens and other sensitive information, so I cannot share it in full.
Is the S3 bucket you're using a directory bucket?
Yes, the S3 path is s3://bucket_name/dir_name
The issue is that you're using the wrong region. Due to a regression in the CRT that the team is aware of, certain instance types require that you use the correct region for the bucket you're attempting to access. If you change the region you're making the request from to match the bucket, it should work. Please let me know how that goes for you.
@RyanFitzSimmonsAK : Yes, I understand that I am downloading from a bucket in a different region, but this is quite a reasonable use case, and all previous versions of the aws-cli work fine on the instances where we see this issue with the latest version. Our current workaround is to pin an old version of the aws-cli.
While we circumvented this by moving the S3 bucket to the correct region, when using container credentials we still encountered issues where, for instance, ls commands work while cp commands throw errors:
richard@ip-10-0-133-32:~$ aws --version
aws-cli/2.15.55 Python/3.11.8 Linux/5.15.0-1058-aws exe/x86_64.ubuntu.20
richard@ip-10-0-133-32:~$ aws s3 cp s3://s-harmonai-west/datasets/songs_raw/songs_md_2/train/songs-md-005281.tar .
30 (AWS_ERROR_PRIORITY_QUEUE_EMPTY): Attempt to pop an item from an empty queue.
richard@ip-10-0-133-32:~$ aws s3 cp s3://s-harmonai-west/datasets/songs_raw/songs_md_2/train/songs-md-005281.tar . --region us-west-2
30 (AWS_ERROR_PRIORITY_QUEUE_EMPTY): Attempt to pop an item from an empty queue.
richard@ip-10-0-133-32:~$ aws s3 ls s3://s-harmonai-west
PRE /
PRE checkpoints/
PRE checpoints/
PRE datasets/
PRE flavio/
PRE million_song_dataset/
PRE shawley/
PRE unprocessed/
PRE zqevans/
I am having the same issue here on a P5.48xlarge. I am able to run aws s3 ls on the bucket, but I get the error AWS_ERROR_S3_INVALID_RESPONSE_STATUS: Invalid response status from request when I try to copy a file into the bucket. The EC2 instance is in eu-north-1, while the bucket is in eu-central-1.
Unfortunately, this isn't something the CLI team is able to address. The recommended workaround at this time is to disable the CRT-based transfer client by setting preferred_transfer_client to classic. Please let me know whether this workaround does or does not work for you.
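For anyone else hitting this, a minimal sketch of that workaround: the preferred_transfer_client setting lives under the nested s3 section of the shared config file (shown here for the default profile; adjust the profile name to match your setup):

```ini
; ~/.aws/config -- force the classic (non-CRT) S3 transfer client
[default]
s3 =
  preferred_transfer_client = classic
```

The same setting can be written from the command line with `aws configure set default.s3.preferred_transfer_client classic`; removing the line restores the default client selection.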
Describe the bug
When installing AWS CLI version 2.14 and above, the
aws s3 sync s3://bucket_name local_path
command fails on Linux on P5.48xlarge instances with this error. Switching to version 2.13.39 or lower resolved the issue.
Expected Behavior
No failure in s3 sync
Current Behavior
The
aws s3 sync s3://bucket_name local_path
command fails on Linux on P5.48xlarge instances with this error.
Reproduction Steps
docker run -it rayproject/ray:2.9.0-py310-cu118 /bin/bash
aws s3 sync s3://bucket /tmp/data
Possible Solution
No response
Additional Information/Context
No response
CLI version used
2.14 and above
Environment details (OS name and version, etc.)
Linux