PublicDataWorks / verdad-frontend

MIT License

[Backend] Monitor Audio Processing Pipeline and Generate Summary Report with LLM Cost Tracking #127

Open nhphong opened 2 weeks ago

nhphong commented 2 weeks ago

As a developer, I want to monitor the audio processing pipeline and generate a detailed summary report of processing statistics, including error analysis and LLM cost tracking, so that we can identify issues, improve efficiency, and manage costs effectively.

Acceptance Criteria:

  1. Pipeline Monitoring:
    • Track the flow and status of audio files through each stage of the audio processing pipeline.
    • Capture detailed statistics, including the number of files processed, errors encountered, and success metrics at each stage.
  2. Error Categorization:
    • Categorize errors by type and frequency for each stage in the pipeline.
    • Identify which errors are retryable and which require further investigation.
  3. Disinformation Detection Summary:
    • At each stage, report the number of audio files flagged as containing disinformation versus the number not flagged.
    • Provide a breakdown of snippets extracted and analyzed in subsequent stages.
  4. LLM Cost Tracking:
    • Track the costs of the LLM and speech-to-text services (Gemini Flash/Pro and OpenAI Whisper) used throughout the audio processing pipeline.
    • Include cost metrics in the summary report to provide insights into resource usage and financial impact.
  5. Summary Report Generation:
    • Generate a comprehensive summary report that includes:
      • Total audio files processed
      • Number and types of errors encountered
      • Disinformation detection results
      • Snippet extraction and analysis outcomes
      • LLM usage costs
    • Ensure the report is clear, concise, and suitable for presentation to stakeholders.
  6. Automation and Scheduling:
    • Automate the generation of the summary report on a regular basis (e.g., daily).
    • Ensure the report is easily accessible to relevant team members and stakeholders.
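
To make the criteria above more concrete, here is a minimal sketch of how per-stage statistics, error categorization, disinformation counts, and cost totals could be aggregated into a plain-text report. Everything in it is an assumption for illustration: the event shape (`stage`, `error`, `model`, `units`, `disinformation`), the stage names, the set of retryable error types, and the prices in `PRICE_PER_UNIT` are hypothetical placeholders, not the pipeline's actual schema or real provider rates.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical error types treated as retryable; the real taxonomy may differ.
RETRYABLE_ERRORS = {"timeout", "rate_limit"}

# Placeholder prices in USD per usage unit. These are NOT real provider rates.
PRICE_PER_UNIT = {"gemini-flash": 0.10, "gemini-pro": 1.00, "whisper": 0.50}


@dataclass
class StageStats:
    processed: int = 0
    succeeded: int = 0
    flagged: int = 0  # files marked as containing disinformation
    errors: Counter = field(default_factory=Counter)  # error type -> count
    cost_usd: float = 0.0


def summarize(events):
    """Aggregate per-stage statistics from a list of event dicts."""
    stages = {}
    for event in events:
        stats = stages.setdefault(event["stage"], StageStats())
        stats.processed += 1
        error = event.get("error")
        if error is None:
            stats.succeeded += 1
        else:
            stats.errors[error] += 1
        if event.get("disinformation"):
            stats.flagged += 1
        # Unknown or missing models contribute zero cost.
        price = PRICE_PER_UNIT.get(event.get("model"), 0.0)
        stats.cost_usd += price * event.get("units", 0.0)
    return stages


def format_report(stages):
    """Render one summary line per stage."""
    lines = []
    for name, s in stages.items():
        total_errors = sum(s.errors.values())
        retryable = sum(n for err, n in s.errors.items() if err in RETRYABLE_ERRORS)
        lines.append(
            f"{name}: processed={s.processed} ok={s.succeeded} "
            f"flagged={s.flagged} errors={total_errors} "
            f"retryable={retryable} cost=${s.cost_usd:.2f}"
        )
    return "\n".join(lines)
```

A scheduled job (for example, a daily cron task) could call `summarize` on the events recorded since the previous run and publish the output of `format_report`. How events are recorded and where the report is delivered are left open here.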

Tasks:

Notes:

linear[bot] commented 2 weeks ago

VER-165 [Backend] Monitor Audio Processing Pipeline and Generate Summary Report with LLM Cost Tracking