Closed richterdavid closed 1 month ago
This is probably fine but I haven't run local tests yet. The instructions I found for testing are problematic. https://github.com/openzim/python-scraperlib/issues/153
See #155 ; I don't think you've implemented this in the right place, but let's wait a bit for discussions to settle
won't work. this custom logger is not configured to display debug why creating a logger from scratch?
With respect to accessing and configuring the logger, should everything in scraperlib be using this? https://github.com/openzim/python-scraperlib/blob/7d498319baadba715316c15cf9857ff2f6974a00/src/zimscraperlib/__init__.py#L11
It's evident that folks want unvalidated metadata logged, and presumably referenced by validation failures later. Consider: "Validation of Scrape URL failed" vs "Validation of something failed". Even better f"Validation of Scrape URL failed: {value}" so you don't need to look back to the earlier logging of the relevant metadata.
how will this be toggled?
Ideas?
Do we want this key-value log without context?
Context makes sense in the new proposal to log all metadata early in start()
. Given how metadata is added incrementally in this class it didn't make much sense here.
Do we want to print validated metadata only?
Current proposals say no.
Okay, I'm not going to work on this branch further; once there's a plan I'll send a new PR for issue #155
Add logging for validated metadata per discussion on https://github.com/openzim/warc2zim/issues/123 and https://github.com/openzim/warc2zim/pull/233