ChatGPTNextWeb / ChatGPT-Next-Web

A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT/Gemini app in one click.
https://app.nextchat.dev/
MIT License
75.43k stars 58.87k forks

[Feature Request] Support for PDF, TXT, audio files, and URL links for data analysis and transcription #4734

Closed (rdbox closed 4 months ago)

rdbox commented 4 months ago

Problem Description

Currently, there is no way to upload and analyze PDF files, TXT files, audio files, or URL links, and there is no support for microphone voice input for transcription. This limits the system's usefulness for users who need to analyze data from multiple sources and formats.

Solution Description

Add support for uploading and analyzing PDF files, TXT files, audio files, and URL links, and implement microphone voice input for real-time transcription. The feature should include:

  1. The ability to upload PDF files and convert them to text for analysis.
  2. Analysis of text from TXT files.
  3. Extraction and analysis of text from web pages provided via URL links.
  4. Upload and analysis of audio files with automatic transcription.
  5. Real-time transcription of voice input via microphone.

This functionality will greatly expand the system's capabilities and improve the user experience.
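As a rough illustration of how the five input types above might be told apart before extraction, here is a minimal TypeScript sketch. Everything in it is hypothetical: `InputKind`, `classifyInput`, and the list of audio extensions are assumptions made for this example and are not part of the project's code. It only decides which extractor an input would need; the actual PDF parsing, page fetching, and transcription are left out.

```typescript
// Hypothetical sketch: none of these names come from the NextChat codebase.
type InputKind = "pdf" | "text" | "audio" | "url" | "unsupported";

// Assumed set of audio extensions; a real implementation would likely
// inspect the MIME type of the uploaded file as well.
const AUDIO_EXTENSIONS: string[] = ["mp3", "wav", "m4a", "ogg", "flac"];

// Decide which kind of extractor a user-supplied file name or link needs.
function classifyInput(input: string): InputKind {
  const trimmed = input.trim();

  // Anything starting with http:// or https:// is treated as a web page link,
  // even if the path happens to end in a file extension.
  if (/^https?:\/\//i.test(trimmed)) {
    return "url";
  }

  // No extension, or a trailing dot, means we cannot classify the file.
  const dot = trimmed.lastIndexOf(".");
  if (dot < 0 || dot === trimmed.length - 1) {
    return "unsupported";
  }

  const ext = trimmed.slice(dot + 1).toLowerCase();
  if (ext === "pdf") return "pdf";
  if (ext === "txt") return "text";
  if (AUDIO_EXTENSIONS.indexOf(ext) >= 0) return "audio";
  return "unsupported";
}
```

Each `InputKind` would then map to its own extraction step (PDF-to-text, plain read, page fetch, or transcription). Live microphone input is not covered here, since it arrives as a stream rather than as a file name or link.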

Alternatives Considered

  1. Using third-party tools to convert PDFs to text before uploading to the system.
  2. Manually copying text from PDFs and pasting it into the system for analysis.
  3. Using external parsers to extract text from web pages.
  4. Using separate transcription services for audio files and voice input.

However, all of these alternatives require extra effort from users and make the system less convenient to use.

Additional Context

Adding this functionality will allow users to more effectively use the system for analyzing data from various sources and formats. This is especially important for those who work with documents, web pages, and audio recordings that require automated analysis and transcription.

Use case example: A user uploads a report in PDF format or a text file with data, provides a URL link to a web page with information, or uploads an audio file for transcription. The system automatically extracts the text and analyzes it, then returns the results to the user. Users can also speak into a microphone to have their speech transcribed in real time.

This feature could also include support for various languages and document formats, making the system more versatile.

When is it planned to add the ability to work with PDF, TXT, audio files, URL links, and voice input via microphone? This would significantly enhance the functionality of the system.

Dean-YZG commented 4 months ago

Thank you for the great idea. Since these features require data storage capacity, we plan to build out the underlying capabilities gradually in order to support features such as file uploading.