awslabs / mountpoint-s3

A simple, high-throughput file client for mounting an Amazon S3 bucket as a local file system.
Apache License 2.0

Error with aria2c: Failed to write into the segment file #1037

Open brianloyal opened 4 days ago

brianloyal commented 4 days ago

Mountpoint for Amazon S3 version

mount-s3 1.9.1

AWS Region

No response

Describe the running environment

Running on EC2 Ubuntu 22 (AMI ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20240701)

Mountpoint options

mount-s3 <BUCKET_NAME> $HOME/s3 --allow-delete --allow-overwrite

What happened?

I tried downloading a 5.2 GiB file directly to S3 through the Mountpoint mount using aria2c:

aria2c https://storage.googleapis.com/alphafold/alphafold_params_2022-12-06.tar -d $HOME/s3

This generates the following error:

09/27 19:40:17 [NOTICE] Downloading 1 item(s)

09/27 19:40:17 [NOTICE] Allocating disk space. Use --file-allocation=none to disable it. See --file-allocation option in man page for more details.
[#0ab2ec 0B/5.2GiB(0%) CN:1 DL:0B] [FileAlloc:#0ab2ec 4.4GiB/5.2GiB(86%)]                                                                          
09/27 19:40:25 [ERROR] Exception caught
Exception: [DefaultBtProgressInfoFile.cc:220] errorCode=1 Failed to write into the segment file s3/alphafold_params_2022-12-06.tar.aria2
[#0ab2ec 0B/5.2GiB(0%) CN:1 DL:0B]                                                                                                                 
09/27 19:40:25 [ERROR] Error when trying to flush write cache
Exception: [AbstractDiskWriter.cc:459] errNum=5 errorCode=17 Failed to write into the file s3/alphafold_params_2022-12-06.tar, cause: Input/output error

09/27 19:40:25 [ERROR] CUID#7 - Download aborted. URI=https://storage.googleapis.com/alphafold/alphafold_params_2022-12-06.tar
Exception: [AbstractCommand.cc:403] errorCode=17 URI=https://storage.googleapis.com/alphafold/alphafold_params_2022-12-06.tar
  -> [DownloadCommand.cc:124] errorCode=17 Write disk cache flush failure index=0

09/27 19:40:25 [NOTICE] Download GID#0ab2ece0458de8f7 not complete: s3/alphafold_params_2022-12-06.tar

Download Results:
gid   |stat|avg speed  |path/URI
======+====+===========+=======================================================
0ab2ec|ERR |   415KiB/s|s3/alphafold_params_2022-12-06.tar

Status Legend:
(ERR):error occurred.

aria2 will resume download if the transfer is restarted.
If there are any errors, then see the log file. See '-l' option in help/man page for details.

NOTE: The following two commands both complete successfully:

  1. wget https://storage.googleapis.com/alphafold/alphafold_params_2022-12-06.tar -P s3
  2. aria2c https://storage.googleapis.com/alphafold/alphafold_params_2022-12-06.tar

In other words, I can use wget to download to the mounted folder, and aria2 to download to the EC2 instance's local disk, but NOT aria2 to download to the mounted folder.

Relevant log output

No response

dannycjones commented 1 day ago

Hi @brianloyal,

The Mountpoint logs should tell you why the write failed. I'd start there: run the same aria2 command again and then review the Mountpoint logs. See https://github.com/awslabs/mountpoint-s3/blob/main/doc/LOGGING.md

Ultimately, I suspect that aria2 may be writing at arbitrary offsets rather than sequentially, which is not supported by Mountpoint as this does not map well to object APIs.
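
To make the distinction between sequential and arbitrary-offset writes concrete, here is a toy Python sketch. The `SequentialOnlyFile` class and `try_writes` helper are hypothetical names invented for illustration. This is not how Mountpoint is implemented, and returning EIO is only an assumption chosen to mirror the "Input/output error" in the aria2 output above.

```python
import errno


class SequentialOnlyFile:
    """Toy model of a file that only accepts writes at its current end.

    Illustration only; this is NOT Mountpoint's implementation.
    """

    def __init__(self) -> None:
        self.data = bytearray()

    def write_at(self, offset: int, chunk: bytes) -> int:
        # Accept the write only if it starts exactly where the last one ended.
        if offset != len(self.data):
            # Assumption: model the failure as EIO, as seen in the aria2 log.
            raise OSError(errno.EIO, "non-sequential write rejected")
        self.data.extend(chunk)
        return len(chunk)


def try_writes(writes):
    """Apply (offset, chunk) pairs in order.

    Returns None if every write succeeds, otherwise the errno of the
    first write that fails.
    """
    f = SequentialOnlyFile()
    for offset, chunk in writes:
        try:
            f.write_at(offset, chunk)
        except OSError as exc:
            return exc.errno
    return None


# Chunks written back to back, as a plain single-stream download would do.
sequential = [(0, b"aaaa"), (4, b"bbbb"), (8, b"cccc")]
# A later region is written first, as a segmented downloader might do.
out_of_order = [(8, b"cccc"), (0, b"aaaa"), (4, b"bbbb")]

print(try_writes(sequential))                  # None
print(try_writes(out_of_order) == errno.EIO)   # True
```

In this model the first write pattern succeeds and the second fails on its first write, because offset 8 is not the current end of the file. Whether aria2 actually writes in this way here is exactly what the Mountpoint logs should confirm.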

I don't think this is the root cause, but I also see this message in your aria2 output: "Allocating disk space. Use --file-allocation=none to disable it." Mountpoint won't be able to support that allocation step, so I'd recommend using --file-allocation=none. Give it a try in case it changes how aria2 writes the file.