openzim / zimit

Make a ZIM file from any Web site and surf offline!
GNU General Public License v3.0

Are search bars standardized code (and thus automatically removable)? #340

Open Popolechien opened 1 month ago

Popolechien commented 1 month ago

I'm looking at https://dev.library.kiwix.org/viewer#devhints.io_en_all, which seems to have been nicely scraped by zimit but has a massive search bar in the middle of the page. Obviously it does not work. Since this is a feature we're bound to find on many websites, I am wondering if we could have a zimit feature that automatically identifies and removes anything looking like a search bar (or at least works on enough different sites to be worth the effort).

benoit74 commented 1 month ago

There is no such thing as a standard search bar AFAIK. We could try to detect them automatically with a bit of might and magic, but I'm concerned that this would probably do more harm than good.
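To make the concern concrete, here is a minimal sketch of what automatic detection could look like, using only Python's standard library. The class `SearchBarFinder`, the function `find_search_candidates`, and the matching rules are all hypothetical; nothing here is part of zimit. Note how the last rule flags a `research-papers` block that has nothing to do with search.

```python
from html.parser import HTMLParser

# Hypothetical heuristic: the keyword and the attributes checked are
# assumptions, not anything defined by zimit or by a web standard.
SEARCH_HINT = "search"


class SearchBarFinder(HTMLParser):
    """Collect (tag, reason) pairs for elements that look like search UI."""

    def __init__(self):
        super().__init__()
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        # Valueless attributes arrive as None; normalise them to "".
        a = {name: (value or "") for name, value in attrs}
        if tag == "input" and a.get("type", "").lower() == "search":
            self.candidates.append((tag, "type=search"))
        elif a.get("role", "").lower() == "search":
            self.candidates.append((tag, "role=search"))
        elif tag == "form" and SEARCH_HINT in a.get("action", "").lower():
            self.candidates.append((tag, "form action"))
        elif SEARCH_HINT in (a.get("class", "") + " " + a.get("id", "")).lower():
            # Weakest rule: plain substring match on class/id names.
            self.candidates.append((tag, "class/id name"))


def find_search_candidates(html):
    """Return every element in `html` that one of the rules above matches."""
    finder = SearchBarFinder()
    finder.feed(html)
    return finder.candidates


sample = (
    '<form role="search"><input type="search" name="q"></form>'
    '<div class="research-papers">Not a search bar</div>'
)
for tag, reason in find_search_candidates(sample):
    print(tag, reason)
```

The first two matches are real search elements, but the `div` is a false positive because "research" contains "search". Removing it would silently delete content, which is the kind of harm a generic rule risks.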

Developing a custom CSS is usually easy and fast for a developer (and when it is hard, it is because the CSS itself is hard to write, in which case automated logic would have failed as well). We could even train the openZIM content team to test some prepared CSS that might remove search boxes in some cases, so that you do not depend on us for every website.
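As a rough illustration of the "prepared CSS" idea, the sketch below turns a list of selectors into a single rule that hides them. The helper `build_hide_css` and the example selectors are hypothetical; real selectors have to be found by inspecting each site, and how the resulting stylesheet is handed to zimit is not covered here.

```python
def build_hide_css(selectors):
    """Return one CSS rule that hides every selector in `selectors`."""
    # Drop empty entries and stray whitespace so the rule stays valid.
    cleaned = [s.strip() for s in selectors if s.strip()]
    if not cleaned:
        return ""
    return ",\n".join(cleaned) + " {\n  display: none !important;\n}\n"


# Hypothetical candidates to try on a site; none of these is guaranteed to match.
candidate_selectors = [
    "form[role='search']",
    "input[type='search']",
]
print(build_hide_css(candidate_selectors))
```

Keeping the selectors in a plain list means someone on the content team could try a few prepared candidates per site without writing the CSS by hand.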

I'm keeping this issue open since the point is still relevant and highly demanded, and I might be missing something.