counterdata-network / story-processor

Story discovery engine for the Counterdata Network. Grabs relevant stories from various APIs, runs them against bespoke classifier models, and posts results to a central server.
Apache License 2.0

check logging configuration #38

Closed rahulbot closed 1 year ago

rahulbot commented 1 year ago

Our logging levels aren't being applied correctly. In processor/__init__.py we set the default level to INFO, and there's a lot of code trying to silence individual loggers, but it seems a bit haphazard. We should revisit this and clean it up.

For instance, we get DEBUG output from mcmetadata:
14:13:35.389 | DEBUG | mcmetadata.languages - Language mismatch - indicated en but guessed pt

And there is a lot of noisy output from trafilatura:
14:13:11.834 | WARNING | trafilatura.metadata - error in sitename extraction: string index out of range
2023-10-16 14:13:11 [trafilatura.metadata] WARNING: error in sitename extraction: string index out of range
14:13:11.949 | WARNING | trafilatura.metadata - error in sitename extraction: string index out of range
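
One way to replace the scattered silencing code is a single table of per-library levels applied in one place. The sketch below uses only the standard `logging` module. The names `LOGGER_LEVEL_OVERRIDES` and `configure_logging` are hypothetical and not taken from the repository, and the levels chosen for mcmetadata and trafilatura are assumptions about what counts as noise here.

```python
import logging

# Hypothetical override table: logger name -> minimum level to let through.
# Child loggers such as "mcmetadata.languages" or "trafilatura.metadata"
# inherit the level of their parent when they have none set themselves.
LOGGER_LEVEL_OVERRIDES = {
    "mcmetadata": logging.INFO,    # hides the DEBUG "Language mismatch" lines
    "trafilatura": logging.ERROR,  # hides the WARNING sitename-extraction lines
}


def configure_logging(default_level=logging.INFO, overrides=None):
    """Set the root level, then apply per-library overrides in one place."""
    if overrides is None:
        overrides = LOGGER_LEVEL_OVERRIDES
    # basicConfig is a no-op if the root logger already has handlers,
    # so the root level is also set explicitly afterwards.
    logging.basicConfig(
        level=default_level,
        format="%(asctime)s | %(levelname)s | %(name)s - %(message)s",
    )
    logging.getLogger().setLevel(default_level)
    for name, level in overrides.items():
        logging.getLogger(name).setLevel(level)


configure_logging()
```

If another framework installs its own handlers or resets levels later in startup, a function like this would likely need to run after that point, since the last call to `setLevel` on a logger wins.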

rahulbot commented 1 year ago

Maybe scrapy config parameter LOG_ENABLED is part of this? https://docs.scrapy.org/en/latest/topics/logging.html
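
The second trafilatura line above resembles Scrapy's default log format, so Scrapy's own root handler may be duplicating messages. As a rough sketch of what adjusting this could look like, the settings below use the documented Scrapy options `LOG_ENABLED` and `LOG_LEVEL`. The dict name, the helper function, and the way the settings are handed to Scrapy are assumptions, since the issue does not show how the project launches its crawler.

```python
# Hypothetical overrides for Scrapy's logging-related settings.
# LOG_ENABLED defaults to True and LOG_LEVEL defaults to "DEBUG" in Scrapy.
SCRAPY_LOGGING_SETTINGS = {
    "LOG_ENABLED": False,  # stop Scrapy from emitting through its own handler
    "LOG_LEVEL": "INFO",   # applies only if LOG_ENABLED is turned back on
}


def build_scrapy_settings(base=None):
    """Merge the logging overrides into an existing settings dict."""
    merged = dict(base or {})
    merged.update(SCRAPY_LOGGING_SETTINGS)
    return merged


# Assumed usage, if the project starts Scrapy through CrawlerProcess:
#
#   from scrapy.crawler import CrawlerProcess
#   process = CrawlerProcess(settings=build_scrapy_settings())
```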