kagisearch / smallweb

Kagi Small Web
https://kagi.com/smallweb
MIT License
519 stars 255 forks

Update smallweb.txt #285

Closed (RonanCJ closed 2 months ago)

RonanCJ commented 2 months ago

Added link to a personal blog about computer-aided design, which also covers topics as varied as European clothes dryers and living car-free (or not) in Germany.

vprelovac commented 2 months ago

The blog has ads, which are not allowed per our guidelines.

RonanCJ commented 2 months ago

@vprelovac I was under the impression that those guidelines no longer applied, since a quick sampling of 5 random blogs on Kagi's Small Web site showed that most had ads (as detected by uBlock Origin). For example:

https://kagi.com/smallweb/?url=https://atomicjunkshop.com/dc-gets-hip/

https://kagi.com/smallweb/?url=https://lessthan1000followers.com/2024/08/23/vhexpertines-altpop-infused-industrial-landscapes-of-dark-new-single-laura-palmer/&

https://kagi.com/smallweb/?url=https://javahippie.net/java/spring/2024/08/15/testing-localdatetime-now-in-java.html&

https://kagi.com/smallweb/?url=https://www.joshbeckman.org/notes/760353675&

Can you clarify if the "no ads" guideline applies universally, and if so, why many sites in this database seem to have ads?

vprelovac commented 2 months ago

Most have been added in bulk from the sources listed in the readme. I did not have time to check them all.

People submit sites for removal when they find ads.

Btw, I do not see any ads here, for example: https://kagi.com/smallweb/?url=https://www.joshbeckman.org/notes/760353675&

RonanCJ commented 2 months ago

I see, I was not aware that most sites were added from sources curated by other internet users.

uBlock Origin detected a tracking/analytics link in the Josh Beckman blog, which is probably why it said "ads" were detected and blocked. To an ad blocker user, a website looks the same whether it is trackers or ads that are being blocked.

https://i.imgur.com/QfU4kMq.png

I believe Kagi's Small Web is much more valuable to paying customers and API users by having high-quality personal blogs that happen to have some blockable ads, since doing so directly supports Kagi's core purpose "to inform and educate, empowering users with knowledge and understanding in their digital journey." I also note that "blogs must be ad-free" is not a criterion in the Small Web announcement post https://blog.kagi.com/small-web, which makes sense. From personal experience, hosting a blog is not free, and the more traffic your personal blog receives, the more expensive it is to host and the more owners are pushed to look at ads and donations.

However, if the no-ads rule trumps all other considerations, this database could be made compliant by removing hundreds or thousands of websites from the Small Web database. It would be unfortunate to make the small web less accessible and push Kagi search results closer to those of mainstream engines, but I understand that you need to make tradeoffs between competing values.

vprelovac commented 2 months ago

What uBlock detects is broader than ads, and we are concerned only that the website has no ads. Content surrounded by ads is almost never worth reading.

I would assume that not many websites with ads have slipped through into the database. If you find any, you can submit a PR to remove them. Also, Small Web has only about 15,000 sites total because of the high bar we have for inclusion.

RonanCJ commented 2 months ago

After spending more time going through posts, you are most likely correct about actual ads in the database. While I assumed that the uBlock detections were mostly ads, I was clearly mistaken: out of 33 randomly presented posts in the Small Web portal, only 3 had ads show up with uBlock turned off.

The website I submitted would possibly annoy people who are used to the usually ad-free browsing experience in Kagi Small Web. Thus, I understand the decision to not include it and not add to the minority of sites with ads in the Small Web database (even though it is probably an exception to the principle about content surrounded by ads).

scottedwards2000 commented 2 months ago

Nice to see such a civil discussion :-) My 2c: if smallweb is meant to bring back the "glory" days of the internet (which I was around for), then I personally would be ok with limited ads. I remember seeing a lot of sites with AdSense in a frame taking up the far right side of the screen (much different from so many "big web" sites today, littered with animated and video ads).

vprelovac commented 2 months ago

"with limited ads."

Define? 1, 2, ... 5?