openzim / devdocs

devdocs.io to ZIM scraper
GNU General Public License v3.0

Devdocs scraper

This scraper downloads devdocs.io documentation databases and packages them as ZIM files, a clean and user-friendly format for storing content for offline use.


Installation

There are three main ways to install and use devdocs2zim, listed from most recommended to least:

1. Install using a pre-built container

   1. Download the image using `docker`:

      ```sh
      docker pull ghcr.io/openzim/devdocs
      ```

2. Build your own container

   1. Clone the repository locally:

      ```sh
      git clone https://github.com/openzim/devdocs.git && cd devdocs
      ```

   2. Build the image:

      ```sh
      docker build -t ghcr.io/openzim/devdocs .
      ```

3. Run the software locally using Hatch

   1. Clone the repository locally:

      ```sh
      git clone https://github.com/openzim/devdocs.git && cd devdocs
      ```

   2. Install [Hatch](https://hatch.pypa.io/):

      ```sh
      pip3 install hatch
      ```

   3. Start a Hatch shell to install the software and its dependencies in an isolated virtual environment:

      ```sh
      hatch shell
      ```

   4. Run the `devdocs2zim` command:

      ```sh
      devdocs2zim --help
      ```

Usage

> [!WARNING]
> This project is still a work in progress and isn't ready for use yet. The commands below are examples only.

# Usage
docker run -v my_dir:/output ghcr.io/openzim/devdocs devdocs2zim [--all|--slug=SLUG|--first=N]

# Fetch all documents
docker run -v my_dir:/output ghcr.io/openzim/devdocs devdocs2zim --all

# Fetch all documents except Ansible
docker run -v my_dir:/output ghcr.io/openzim/devdocs devdocs2zim --all --skip-slug-regex "^ansible.*"

# Fetch Vue related documents
docker run -v my_dir:/output ghcr.io/openzim/devdocs devdocs2zim --slug vue~3 --slug vue_router~4

# Fetch the docs for the two most recent versions of each software
docker run -v my_dir:/output ghcr.io/openzim/devdocs devdocs2zim --first=2
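To make the selection flags above more concrete, here is a minimal Python sketch of how `--all`, `--slug`, `--first` and `--skip-slug-regex` could narrow a list of slugs. It is not the scraper's actual code. The function name `select_slugs`, the `name~version` split, the newest-first ordering of the input, the use of `re.search`, and the sample Ansible and Vue 2 slugs are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def select_slugs(available, *, all_docs=False, slugs=(), first=None,
                 skip_slug_regex=None):
    """Return the slugs a run would process (hypothetical helper).

    `available` is assumed to be ordered with the newest version of
    each software first.
    """
    if all_docs:
        chosen = list(available)
    elif slugs:
        wanted = set(slugs)
        chosen = [s for s in available if s in wanted]
    elif first is not None:
        seen = defaultdict(int)
        chosen = []
        for s in available:
            # Assumption: a slug looks like "name~version".
            software = s.split("~", 1)[0]
            if seen[software] < first:
                seen[software] += 1
                chosen.append(s)
    else:
        raise ValueError("one of all_docs, slugs or first is required")

    if skip_slug_regex:
        # Assumption: the pattern is searched anywhere in the slug.
        pattern = re.compile(skip_slug_regex)
        chosen = [s for s in chosen if not pattern.search(s)]
    return chosen


# Made-up sample data; only vue~3 and vue_router~4 appear in the examples above.
available = ["ansible~2.11", "ansible~2.10", "ansible~2.9",
             "vue~3", "vue~2", "vue_router~4"]

print(select_slugs(available, all_docs=True, skip_slug_regex="^ansible.*"))
print(select_slugs(available, slugs=["vue~3", "vue_router~4"]))
print(select_slugs(available, first=2))
```

Applying the skip pattern after the main selection keeps the three selection modes independent of the filter. Whether the real tool combines them this way is not stated here.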

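One of the following flags is required:

* `--all`: Fetch all documents.
* `--slug SLUG`: Fetch the document with the given slug. Repeat the flag to fetch several documents.
* `--first N`: Fetch the docs for the N most recent versions of each software.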
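As a rough illustration of the "at least one selection flag" rule, the sketch below uses Python's standard `argparse` to accept `--all`, `--slug` (repeatable) and `--first`, and rejects a command line that supplies none of them. The function name `parse_args` and the error message are hypothetical, and the real `devdocs2zim` parser may be organised differently.

```python
import argparse


def parse_args(argv):
    """Parse a devdocs2zim-style command line (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="devdocs2zim")
    parser.add_argument("--all", action="store_true",
                        help="fetch all documents")
    # action="append" lets --slug be given several times.
    parser.add_argument("--slug", action="append", default=[],
                        metavar="SLUG", help="fetch the given slug")
    parser.add_argument("--first", type=int, metavar="N",
                        help="fetch the N most recent versions of each software")
    parser.add_argument("--skip-slug-regex", metavar="REGEX",
                        help="skip slugs matching REGEX")
    args = parser.parse_args(argv)

    # Enforce that at least one selection flag was supplied.
    if not (args.all or args.slug or args.first is not None):
        parser.error("one of --all, --slug or --first is required")
    return args


args = parse_args(["--slug", "vue~3", "--slug", "vue_router~4"])
print(args.slug)
```

Calling `parse_args([])` exits with a usage error, which is the standard behaviour of `parser.error`.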

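Optional Flags:

* `--skip-slug-regex REGEX`: Skip documents whose slug matches the regular expression, as in the Ansible example above.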

Formatting Placeholders

The following formatting placeholders are supported:

Developing

Use the commands below to set up the project once:

# Install hatch if it isn't installed already.
❯ pip install hatch

# Local install (in default env) / re-sync packages
❯ hatch run pip list

# Set-up pre-commit
❯ pre-commit install

The following commands can be used to build and test the scraper:

# Show scripts
❯ hatch env show

# linting, testing, coverage, checking
❯ hatch run lint:all
❯ hatch run lint:fixall

# run tests on all matrixed envs
❯ hatch run test:run

# run tests in a single matrixed env
❯ hatch env run -e test -i py=3.12 coverage

# run static type checks
❯ hatch env run check:all

# building packages
❯ hatch build

Contributing

This project adheres to openZIM's Contribution Guidelines.

This project has implemented openZIM's Python bootstrap, conventions and policies v1.0.3.