getmoto / moto

A library that allows you to easily mock out tests based on AWS infrastructure.
http://docs.getmoto.org/en/latest/
Apache License 2.0

S3 "select_object_content" Bytes data returns hardcoded values #8291

Open kbattle-autotec opened 2 weeks ago

kbattle-autotec commented 2 weeks ago

I am currently using moto 5.0.18 and boto3 1.35.54 (I believe this has not changed as of moto 5.0.20), and I am running into inaccurate byte counts while unit testing.

I know that the select_object_content operation is experimental, but the BytesScanned, BytesProcessed and BytesReturned fields in the response's Stats event always return the same values, regardless of input.

It would be nice to receive accurate byte counts based on the data provided. That way, a test that expects S3 to process a certain amount of data can verify that all of it was actually processed. It would also help in test cases that check that no data is processed for an empty file.

After looking through the package, it appears that these values are set in _create_stats_message in moto/s3/select_object_content.py, line 37.

Current Output: {'Stats': {'Details': {'BytesScanned': 24, 'BytesProcessed': 24, 'BytesReturned': 22}}}

Expected Output: {'Stats': {'Details': {'BytesScanned': <scanned bytes>, 'BytesProcessed': <processed bytes>, 'BytesReturned': <returned bytes>}}}
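For anyone reproducing this, here is a minimal sketch of how the Stats details can be pulled out of the event stream in a test. The helper name read_select_stats and the sample events are made up for illustration. In a real test the events would come from iterating response["Payload"] returned by boto3's select_object_content.

```python
from typing import Any, Dict, Iterable, Tuple


def read_select_stats(events: Iterable[Dict[str, Any]]) -> Tuple[bytes, Dict[str, int]]:
    """Collect record bytes and the Stats details from a select_object_content event stream.

    Hypothetical helper for illustration; not part of moto or boto3.
    """
    records = b""
    details: Dict[str, int] = {}
    for event in events:
        if "Records" in event:
            # Record data can arrive split across several Records events.
            records += event["Records"]["Payload"]
        elif "Stats" in event:
            details = event["Stats"]["Details"]
    return records, details


# Stand-in for response["Payload"]. The record payload is invented sample data;
# the Stats values are the hardcoded ones reported in this issue.
sample_events = [
    {"Records": {"Payload": b'{"a": 1}\n'}},
    {"Stats": {"Details": {"BytesScanned": 24, "BytesProcessed": 24, "BytesReturned": 22}}},
    {"End": {}},
]

records, details = read_select_stats(sample_events)
```

With accurate stats, a test could then compare details["BytesReturned"] against len(records), which is not possible while the values are fixed.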

bblommers commented 1 week ago

Hi @kbattle-autotec, welcome to Moto!

The values that we currently return were hardcoded out of laziness, but I agree that it can be useful to see the actual data. Marking it as an enhancement.
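As a rough sketch of what the enhancement could compute, the details might be derived from the object body and the record payloads that are sent back. This is not moto's actual implementation: compute_stats_details is a hypothetical name, and the sketch assumes an uncompressed object, so that scanned and processed bytes are the same.

```python
from typing import Dict, Iterable


def compute_stats_details(object_bytes: bytes, record_payloads: Iterable[bytes]) -> Dict[str, int]:
    """Derive Stats details from real data instead of fixed numbers.

    Hypothetical sketch, assuming an uncompressed object.
    """
    scanned = len(object_bytes)
    # Assumption: without compression, processed bytes equal scanned bytes.
    processed = scanned
    # Total size of the record payloads returned to the caller.
    returned = sum(len(payload) for payload in record_payloads)
    return {
        "BytesScanned": scanned,
        "BytesProcessed": processed,
        "BytesReturned": returned,
    }


# Empty object: nothing scanned, processed or returned.
empty_details = compute_stats_details(b"", [])

# Two invented JSON lines in the object, one of them returned by the query.
body = b'{"a": 1}\n{"a": 2}\n'
filtered_details = compute_stats_details(body, [b'{"a": 1}\n'])
```

This would cover both cases raised in the issue: an empty file yields zeros, and a non-empty file yields counts that a test can check against the uploaded data.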