datafusion-contrib / datafusion-objectstore-s3

S3 as an ObjectStore for DataFusion
Apache License 2.0

Improve Testing #24

Closed matthewmturner closed 2 years ago

matthewmturner commented 2 years ago

Add testing for the below cases.

Bad Data

DataFusion Integration

matthewmturner commented 2 years ago

@seddonm1 I was thinking of adding tests for the above. Do you think this makes sense? I don't have enough experience implementing object store readers to know whether there could be file-type- or partition-specific issues that we need to look out for.

Either way, I think we should test for bad data.

matthewmturner commented 2 years ago

@houqp also curious if you have any thoughts on ways we could improve testing.

seddonm1 commented 2 years ago

Yes, testing for bad data makes sense. The call to the API is pretty simple and mainly in the AWS SDK so I wouldn't expect too many interesting things.

I would just advise against adding too much testing for specific file types, as their behaviour may change upstream (DataFusion/Arrow), which means a lot of maintenance work as the APIs become more stable.

matthewmturner commented 2 years ago

@seddonm1 I've trimmed down the list. Let me know if you think anything is missing.