apify / crawlee-python

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
https://crawlee.dev/python/
Apache License 2.0
4.65k stars 319 forks

Add an option for JSON-compatible logs #700

Open vdusek opened 1 week ago

vdusek commented 1 week ago

Description

Currently, Crawlee's "statistics" logs are formatted as tables. These are human-readable, but they are problematic when the rest of the log output is JSON.

Solution

Introduce a crawler flag that switches the logs to a JSON-compatible format. This would allow users to toggle between "table" logs and JSON-compatible logs.
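
A rough sketch of what such a toggle could look like, using only the standard library. The function name `format_statistics`, the `json_logs` parameter, and the statistics keys are all hypothetical and are not part of Crawlee's API; the table layout is a simplified stand-in for the current output.

```python
import json


def format_statistics(stats: dict, json_logs: bool = False) -> str:
    """Render a statistics mapping as one JSON line or as a plain-text table.

    Hypothetical helper for illustration only; not a Crawlee function.
    """
    if json_logs:
        # One compact JSON object per log message; default=str is a fallback
        # for values json cannot serialize natively (e.g. timedelta).
        return json.dumps(stats, default=str)
    # Simplified table: pad keys to a common width, one row per statistic.
    width = max((len(key) for key in stats), default=0)
    return "\n".join(f"{key.ljust(width)} | {value}" for key, value in stats.items())


# Hypothetical statistics; the key names are assumptions.
example_stats = {"requests_finished": 12, "requests_failed": 1}
print(format_statistics(example_stats))
print(format_statistics(example_stats, json_logs=True))
```

With `json_logs=True` the whole statistics message stays on a single line, so a log collector that expects one JSON object per line would not have to deal with multi-line table rows.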

janbuchar commented 1 week ago

I remember successfully using https://github.com/madzak/python-json-logger before for the exact same purpose.
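
For comparison, the same idea can be sketched with only the standard `logging` module. This is not python-json-logger's API, just a minimal hand-written formatter; the class `JsonFormatter`, the helper `build_json_logger`, the logger name, and the `stats` extra field are all assumptions made for illustration.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object (illustrative only)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical structured field passed via extra={"stats": {...}}.
        stats = getattr(record, "stats", None)
        if stats is not None:
            payload["stats"] = stats
        return json.dumps(payload, default=str)


def build_json_logger(name: str, stream) -> logging.Logger:
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


buffer = io.StringIO()
log = build_json_logger("demo.statistics", buffer)
log.info("Final request statistics", extra={"stats": {"requests_finished": 12}})
print(buffer.getvalue(), end="")
```

Passing the statistics as a structured `extra` field, rather than as a pre-rendered table in the message, is what lets the formatter decide how to present them.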