Open kaituo opened 8 months ago
The file metadata path is an existing Spark SQL feature, so no extra grammar change is required. For example:
SELECT _metadata.file_name, count(*)
FROM alb_logs
WHERE _metadata.file_path like '%2023/11/09%'
GROUP BY _metadata.file_name
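The same hidden `_metadata` column could probably cover the exclusion case as well, by negating the path filter. A minimal sketch, assuming the unwanted logs live under a prefix such as `/unstructured/` (a hypothetical path, not one taken from this issue) and that `alb_logs` is backed by a file-based source that exposes `_metadata`:

```sql
-- Count rows per file, skipping any file under the hypothetical /unstructured/ prefix.
SELECT _metadata.file_name, count(*)
FROM alb_logs
WHERE _metadata.file_path NOT LIKE '%/unstructured/%'  -- assumed prefix to exclude
GROUP BY _metadata.file_name
```

Whether this filter prunes files before they are read, or only drops rows afterwards, depends on the Spark version and data source, so it is worth checking the query plan.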
Is your feature request related to a problem? A user may have a messy S3 file system and would like to exclude certain unstructured log types which are in their S3 bucket. Glue offers a way to exclude, but that is Hive functionality. I suspect we will need to upgrade our SQL grammar to support advanced filtering.
What solution would you like? Possible approaches: