awslabs / project-lakechain

:zap: Cloud-native, AI-powered, document processing pipelines on AWS.
https://awslabs.github.io/project-lakechain/
Apache License 2.0

Bug: Transcribe processor does not enrich metadata #14

Closed HQarroum closed 2 months ago

HQarroum commented 5 months ago

Expected Behaviour

The transcribe audio processor should enrich the output document metadata with the detected language of the text.

Current Behaviour

No metadata is specified on the transcribed output document; in particular, the detected language is missing.
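
To make the expected behaviour concrete, here is a minimal sketch of what enriching the metadata with the detected language could look like. The `DocumentMetadata` and `TranscriptionJobSummary` shapes and the `enrichWithLanguage` helper are hypothetical names used for illustration only; they are not taken from the Lakechain code base. The sketch assumes the transcription job reports its language as a code such as `en-US`.

```typescript
// Hypothetical shapes for illustration; not Lakechain's actual types.
interface DocumentMetadata {
  language?: string;
  [key: string]: unknown;
}

interface TranscriptionJobSummary {
  // Assumed to hold a language code such as "en-US" when one was detected.
  languageCode?: string;
}

// Returns a copy of the metadata, adding the detected language when present.
function enrichWithLanguage(
  metadata: DocumentMetadata,
  job: TranscriptionJobSummary
): DocumentMetadata {
  if (!job.languageCode) {
    // Nothing detected: leave the metadata unchanged.
    return { ...metadata };
  }
  // Keep only the primary language subtag ("en-US" becomes "en").
  const language = job.languageCode.toLowerCase().replace(/-.*$/, "");
  return { ...metadata, language };
}
```

With this sketch, a job reporting `en-US` would yield output metadata containing `language: "en"`, whereas the current behaviour corresponds to the metadata being left empty.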

Code snippet

No response

Steps to Reproduce

No response

Possible Solution

No response

Project Lakechain version

latest

Execution logs

No response