fsspec / s3fs

S3 Filesystem
http://s3fs.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Writing metadata with underscores fails silently #878

Open jbsilva opened 4 months ago

jbsilva commented 4 months ago

Metadata keys containing _ are silently ignored on write: no warning or error is raised.

More appropriate options would be:

  1. Allow underscores, since they are valid in AWS S3 (.getxattr() currently can't retrieve them, because it performs a .replace("_", "-") on the requested name)
  2. Warn or raise an exception
  3. Convert underscores to -

To reproduce:

from s3fs import S3FileSystem

file_bytes = b'My file...'
metadata = {"deviceId": 'LM1', 'device_serial': 'A1234', 'device-name': 'My device'}

fs = S3FileSystem()
fs.pipe_file("bucket/key", file_bytes, Metadata=metadata)

fs.metadata("bucket/key")
>> {'deviceid': 'LM1', 'device-name': 'My device'}

As you can see, the device_serial metadata was not written.

I can confirm this by looking in the S3 bucket or with boto3:

import boto3
s3_client = boto3.client('s3')
s3_client.head_object(Bucket="bucket", Key="key")["Metadata"]
>> {'deviceid': 'LM1', 'device-name': 'My device'}
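
Until the behavior changes, options 2 and 3 above could be applied on the caller's side before writing. The following is only a sketch under stated assumptions: normalize_metadata_keys is a hypothetical helper (not part of s3fs or boto3), and its on_underscore parameter is made up for illustration. It either rejects keys containing _ or converts them to -, so nothing is dropped without the caller noticing.

```python
import warnings


def normalize_metadata_keys(metadata, on_underscore="raise"):
    """Return a copy of `metadata` whose keys contain no underscores.

    Hypothetical helper, not part of s3fs.
    on_underscore:
        "raise"   -> raise ValueError on the first key containing "_"
        "warn"    -> emit a warning, then convert "_" to "-"
        "convert" -> convert "_" to "-" without a warning
    """
    if on_underscore not in ("raise", "warn", "convert"):
        raise ValueError(f"unknown on_underscore mode: {on_underscore!r}")

    result = {}
    for key, value in metadata.items():
        if "_" in key:
            if on_underscore == "raise":
                raise ValueError(f"metadata key {key!r} contains an underscore")
            if on_underscore == "warn":
                warnings.warn(
                    f"metadata key {key!r} contains an underscore; converting to '-'",
                    stacklevel=2,
                )
            key = key.replace("_", "-")
        # Guard against two input keys mapping to the same converted key.
        if key in result:
            raise ValueError(f"metadata key {key!r} collides after conversion")
        result[key] = value
    return result


metadata = {"deviceId": "LM1", "device_serial": "A1234", "device-name": "My device"}
safe = normalize_metadata_keys(metadata, on_underscore="convert")
# `safe` now has "device-serial" in place of "device_serial".
```

The result would then be passed in place of the original dict, e.g. fs.pipe_file("bucket/key", file_bytes, Metadata=safe).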

martindurant commented 4 months ago

Can you make a PR to allow these keys, if you know how? I assume getxattr would need to be changed.
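
One possible shape for such a change on the lookup side, sketched over a plain dict rather than the actual s3fs internals: try the attribute name exactly as given, and only fall back to the hyphenated form if it is absent. lookup_attr is a hypothetical stand-in, not the real getxattr, whose signature and implementation are not reproduced here.

```python
def lookup_attr(stored_metadata, attr_name):
    """Look up `attr_name` in a metadata dict, tolerating underscores.

    Hypothetical stand-in for a getxattr-style lookup; not s3fs code.
    """
    # Prefer an exact match so keys stored with underscores are reachable.
    if attr_name in stored_metadata:
        return stored_metadata[attr_name]
    # Fall back to the "_" -> "-" conversion described in the issue,
    # so callers relying on that behavior keep working.
    return stored_metadata.get(attr_name.replace("_", "-"))


# Example metadata as it might be stored if underscores were allowed.
stored = {"deviceid": "LM1", "device_serial": "A1234", "device-name": "My device"}

exact = lookup_attr(stored, "device_serial")    # exact match on the underscore key
fallback = lookup_attr(stored, "device_name")   # resolved via the hyphenated form
missing = lookup_attr(stored, "firmware")       # key not present in any form
```

Keeping the fallback is a design choice made here for backward compatibility; a real PR might instead drop the conversion entirely.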