openzim / zimfarm

Farm operated by bots to grow and harvest new zim files
https://farm.openzim.org
GNU General Public License v3.0

Create documentation for content editors about zimit offliner #860

Open benoit74 opened 1 year ago

benoit74 commented 1 year ago

Just like we have done recently for the YouTube offliner, we need to create documentation for the Zimit offliner.

The current documentation about the Zimit Ticket Lifecycle is mostly useless since it only reproduces generic documentation, so from my PoV it should be removed and centralized in a "Generic Recipe Lifecycle" (which could also be part of the "Overall content edition process").

We discussed this a bit on Friday and agreed that a lot of content has already been produced in explanations spread across various tickets, so I need help gathering it.

I also realized yesterday that the documentation about scope, include, and exclude in the Browsertrix Crawler README is not that bad. Even if it is not really easy to follow, at least the important information is detailed there.

Would you please help me gather all the existing documentation, or explain/write what still needs to be written?

rgaudin commented 1 year ago

See https://github.com/openzim/zimit/issues/138

rgaudin commented 12 months ago

And https://github.com/webrecorder/browsertrix-crawler/blob/main/README.md#crawl-scope----configuring-pages-included-or-excluded-from-a-crawl