Multiverse-of-Projects / NewsAI

A dynamic NewsAI dashboard that uses NLP to analyze news articles, visualize sentiment trends, and extract insights through interactive data visualizations.
https://news-ai-dashboard.streamlit.app/
GNU General Public License v3.0

Generate Sitemap.xml and Robots.txt to improve SEO and Site Crawling #153

Open amiya-cyber opened 2 weeks ago

amiya-cyber commented 2 weeks ago

Adding sitemap.xml and robots.txt files helps optimize a website for search engines.

Sitemap.xml provides a list of important URLs, helping search engines discover, crawl, and index new and updated content faster, improving SEO.
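As a rough illustration of what such a file contains, here is a minimal sketch that builds a sitemap with Python's standard library. The helper name `build_sitemap` is hypothetical, and the only URL listed is the dashboard's public address from this repository's description; a real sitemap would list whichever pages the maintainers want indexed.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap.xml document (as a string) listing the given URLs.

    Hypothetical helper for illustration; not part of the NewsAI code base.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Assumption: the dashboard's landing page is the only page worth listing.
sitemap_xml = build_sitemap(["https://news-ai-dashboard.streamlit.app/"])
print(sitemap_xml)
```

The resulting string could be written to a `sitemap.xml` file. Whether the hosting platform can serve that file from the site root is a separate question that this sketch does not address.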

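Robots.txt tells search engine crawlers which parts of the site they may access, keeping them away from sensitive or unimportant pages, conserving crawl budget, and reducing server load. Note that it controls crawling rather than indexing: a disallowed page can still appear in search results if other sites link to it.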
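To make this concrete, here is a small sketch that generates a robots.txt and checks its rules with the standard library's `urllib.robotparser`. The helper `build_robots_txt`, the `/admin/` path, and the sitemap location are all assumptions chosen for illustration; nothing in this issue says the dashboard has such a path.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(sitemap_url, disallowed_paths=()):
    """Return robots.txt content that applies to every crawler.

    Hypothetical helper for illustration; not part of the NewsAI code base.
    """
    lines = ["User-agent: *"]
    if disallowed_paths:
        lines.extend(f"Disallow: {path}" for path in disallowed_paths)
    else:
        # An empty Disallow value means nothing is blocked.
        lines.append("Disallow:")
    lines.append("")
    lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"


# Assumptions: "/admin/" is a made-up private path, and the sitemap is
# served from the site root.
robots_txt = build_robots_txt(
    "https://news-ai-dashboard.streamlit.app/sitemap.xml",
    disallowed_paths=["/admin/"],
)

# Parse the generated rules to confirm they behave as intended.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
home_allowed = parser.can_fetch("*", "https://news-ai-dashboard.streamlit.app/")
admin_allowed = parser.can_fetch("*", "https://news-ai-dashboard.streamlit.app/admin/page")
```

Checking the generated rules with a parser before publishing them is a cheap way to avoid accidentally blocking the whole site.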

Together, they make the site easier for search engines to discover and crawl, which can improve its visibility in search results.

Please assign with level 2 or level 3.

Devasy23 commented 2 weeks ago

@amiya-cyber If you have looked at the code in depth, you will know that we don't crawl the web. We only take the URLs provided by the news aggregator API and scrape those. If you are looking to improve the scraping, you're most welcome.
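For readers unfamiliar with that flow, the sketch below shows the general idea of collecting article URLs from an aggregator-style response before scraping them. It is not the project's actual code: the response shape (`articles` entries with a `url` field), the function name `extract_article_urls`, and the sample data are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def extract_article_urls(api_response):
    """Collect unique http(s) article URLs, preserving their original order.

    Assumes a hypothetical response shaped like {"articles": [{"url": ...}]}.
    """
    seen = set()
    urls = []
    for article in api_response.get("articles", []):
        url = article.get("url")
        # Skip entries with no link or with a non-web scheme.
        if not url or urlparse(url).scheme not in ("http", "https"):
            continue
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Made-up sample data standing in for a real API response.
sample_response = {
    "articles": [
        {"title": "Example A", "url": "https://example.com/a"},
        {"title": "Example A (duplicate)", "url": "https://example.com/a"},
        {"title": "No link"},
        {"title": "Example B", "url": "https://example.org/b"},
    ]
}
article_urls = extract_article_urls(sample_response)
```

Because the URLs come from the API rather than from following links, no crawl frontier is needed; each collected URL would simply be handed to the scraper.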