kagisearch / smallweb

Kagi Small Web
https://kagi.com/smallweb
MIT License

Kagi Small Web

Kagi Small Web is an initiative by Kagi.

Kagi's mission is to humanize the web, and this project is built to help surface recent results from the small web: people and stories that typically zip by in legacy search engines. Read more about it in the announcement blog post.

A few things to note:

Criteria for posts to show on the website

If a blog is included in the small web feed list (which means its content is in English, it is informational or educational in nature, and it is not trying to sell anything), we check these two things before showing it on the site:

Guidelines for adding a site or channel to the list

Add a new personal blog RSS feed to the list. Rules:

Add website RSS feed

Hint: To extract the RSS link from a YouTube channel, you can use this tool.

Add YouTube channel RSS feed
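For readers who already know a channel's ID, the feed URL can also be built by hand. The sketch below assumes YouTube's public per-channel feed endpoint (`feeds/videos.xml?channel_id=...`); the function name `yt_feed_url` and the channel ID shown are hypothetical placeholders, not part of this repository.

```shell
# Build the RSS feed URL for a YouTube channel from its channel ID.
# Assumption: the channel exposes the standard feeds/videos.xml endpoint.
yt_feed_url() {
  printf 'https://www.youtube.com/feeds/videos.xml?channel_id=%s\n' "$1"
}

# The ID below is a made-up placeholder, not a real channel.
yt_feed_url "UCxxxxxxxxxxxxxxxxxxxxxx"
```

The tool mentioned in the hint is still the easier route when only the channel's handle or page URL is known.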

Remove a site or a channel

Remove a website if:

Clicking "Remove website" opens the small web list for editing in a new tab, where you can locate and remove the website feed in question. Make sure to state the reason for removal in the comments.

Remove website

Remove channel

Small web is beautiful

What is the Small Web exactly? Recommended reading:

Info

smallweb.txt - Contains the feeds of indexed blogs

smallyt.txt - Contains the feeds of indexed YouTube channels

yt_rejected.txt - Contains the list of YouTube channels that were reviewed (in an automated way) and rejected

app/ - App powering the Kagi Small Web website
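Before proposing a new feed, it can help to check whether it is already in one of the list files. The following is a minimal sketch that assumes each list holds one feed URL per line; the helper name `feed_listed` and the example URL are hypothetical.

```shell
# Succeed (exit status 0) if the exact feed URL is already a line in the list file.
# -F: treat the URL as a fixed string, -x: match the whole line, -q: print nothing.
feed_listed() {
  grep -Fxq -- "$2" "$1"
}

# Usage (example.com is a placeholder):
#   feed_listed smallweb.txt "https://example.com/feed.xml" && echo "already listed"
```

An exact-line match misses the same site listed under a different feed path, so a domain-level check is a useful complement.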

Sources

Small web

The original list of small web blogs has been assembled from various sources including:

YouTube channels

The seed list for YouTube channels has been assembled from these HN discussions.

Useful commands

Show duplicate domains:

awk -F/ '{print $3}' smallweb.txt | sort | uniq -d | while read -r domain; do echo "$domain"; grep -F "$domain" smallweb.txt; echo ""; done
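The one-liner above splits each URL on `/`, takes the third field (the host), and lists the feed lines for every host that occurs more than once. As a variation, the sketch below does the counting inside awk and prints only a per-host summary; it assumes one URL per line in the form `scheme://host/path`, and the function name `dup_domain_counts` is hypothetical.

```shell
# Print "<count> <host>" for every host that appears on more than one line.
# Lines without a host field (for example, blank lines) are skipped.
dup_domain_counts() {
  awk -F/ 'NF >= 3 && $3 != "" { count[$3]++ }
           END { for (d in count) if (count[d] > 1) print count[d], d }' "$1" | sort -k2
}

# Usage:
#   dup_domain_counts smallweb.txt
```

The summary form is convenient for a quick overview; the original command remains the better choice when you need to see the full feed lines to decide which one to remove.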