yashasvini121 / predictive-calc

An interactive web application developed with Streamlit, designed for making predictions using various machine learning models. The app dynamically generates forms and pages from JSON configuration files. ⭐ If you found this helpful, consider starring the repo!
https://predictive-calc.streamlit.app/
MIT License
25 stars 72 forks

News Articles Category Prediction #108

Open pratikwayal01 opened 1 month ago

pratikwayal01 commented 1 month ago

Feature Request: News Articles Category Prediction

Summary: A feature to predict the category of a news article (such as Sports, Politics, Entertainment, Technology, etc.) based on its content. This will enhance the platform's ability to automatically categorize news content and make it easier for users to find relevant articles.

Details:

  1. Objective: Implement a model that can predict the category of a news article using its text. The model should be able to handle various article topics and assign categories like:

    • Politics
    • Sports
    • Entertainment
    • Technology
    • Science
    • Health
    • Business
    • World
  2. Requirements:

    • A dataset of categorized news articles for training the model.
    • A pre-trained Natural Language Processing (NLP) model for text classification (such as BERT, GPT, or a similar model).
    • Ability to integrate the model with the existing backend, ensuring real-time predictions when new articles are uploaded or written.
  3. Proposed Approach:

    • Preprocess the articles by removing unnecessary metadata, cleaning the text, and tokenizing it.
    • Train an NLP-based classifier (e.g., Logistic Regression, Random Forest, or a deep learning model like a Transformer) on the labeled dataset of news articles.
    • Evaluate the model using standard classification metrics such as precision, recall, and F1-score.
    • Integrate the trained model into the system, enabling automatic categorization of articles at the time of submission.
    • Provide an API for querying the model to get category predictions.
  4. Benefits:

    • Improves user experience by categorizing articles accurately and automatically.
    • Enhances the discoverability of news articles based on categories.
    • Reduces manual effort in tagging and organizing articles.
  5. Possible Challenges:

    • Data availability: Large and diverse datasets are required for accurate prediction across multiple categories.
    • Handling ambiguous content that may fit into multiple categories.
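
One possible way to handle ambiguous articles is to report every category whose predicted probability clears a cutoff, rather than only the single best one. The sketch below assumes the classifier exposes per-category probabilities; the function name `top_categories`, the 0.3 threshold, and the two-label cap are illustrative assumptions, not part of the project.

```python
from typing import Dict, List, Tuple


def top_categories(
    probabilities: Dict[str, float],
    threshold: float = 0.3,   # assumed cutoff, tune on validation data
    max_labels: int = 2,      # assumed cap on how many categories to report
) -> List[Tuple[str, float]]:
    """Return up to `max_labels` categories whose probability is >= threshold.

    Falls back to the single most likely category when none clears the
    threshold, so the caller always gets at least one label.
    """
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    selected = [(label, p) for label, p in ranked[:max_labels] if p >= threshold]
    return selected or ranked[:1]


# Hypothetical scores for an article about a tech company's earnings.
ambiguous = {"Business": 0.46, "Technology": 0.41, "Politics": 0.13}
clear_cut = {"Sports": 0.90, "Health": 0.06, "World": 0.04}

print(top_categories(ambiguous))
print(top_categories(clear_cut))
```

With this design an article that sits between two topics can be tagged with both, while a clear-cut article still gets a single category.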

Priority: Medium
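
As a rough illustration of the proposed approach (preprocess, train, evaluate), here is a minimal sketch using scikit-learn with TF-IDF features and Logistic Regression, one of the classifiers mentioned above. The training and test sentences are made-up toy data, and `clean_text` is a hypothetical helper; a real model would need a large labeled news dataset and a proper train/test split.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_fscore_support
from sklearn.pipeline import make_pipeline


def clean_text(text: str) -> str:
    """Lowercase, drop URLs and non-letter characters, collapse whitespace."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


# Toy, made-up examples (assumption): stand-ins for a real labeled dataset.
TRAIN = [
    ("The striker scored twice as the team won the football match", "Sports"),
    ("The tennis champion won the tournament final", "Sports"),
    ("Parliament passed the election reform bill after a long debate", "Politics"),
    ("The senator announced her campaign for the presidential election", "Politics"),
    ("The new smartphone ships with a faster processor and better software", "Technology"),
    ("Engineers released a software update for the laptop processor", "Technology"),
    ("The hospital reported fewer flu patients after the vaccine rollout", "Health"),
    ("Doctors recommend the vaccine for patients with chronic disease", "Health"),
]
TEST = [
    ("The football team won the match", "Sports"),
    ("The election debate continued in parliament", "Politics"),
    ("A software update improves the smartphone processor", "Technology"),
    ("Patients at the hospital received the vaccine", "Health"),
]

train_texts, train_labels = zip(*TRAIN)
test_texts, test_labels = zip(*TEST)

# TF-IDF tokenizes and weights the cleaned text; Logistic Regression classifies it.
model = make_pipeline(
    TfidfVectorizer(preprocessor=clean_text, stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(train_texts, train_labels)

predicted = model.predict(test_texts)

# Macro-averaged metrics weight every category equally.
precision, recall, f1, _ = precision_recall_fscore_support(
    test_labels, predicted, average="macro", zero_division=0
)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

The fitted `model` accepts raw article text directly, so a page or API endpoint could call `model.predict([article_text])` and display the returned category.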


Teja-m9 commented 1 month ago

Hey @yashasvini121, please assign this to me.

yashasvini121 commented 1 month ago

What will the final form include? Will the user be able to upload the article text, and will the model categorize it?

pratikwayal01 commented 1 month ago

I was just creating a model, but if @yashasvini121 wants, I can deploy it on Streamlit.

yashasvini121 commented 1 month ago

Yes, you need to create a page for it as well. Ensure that your work aligns with the project structure.