Norconex / collector-core

Collector-related code shared between different collector implementations
http://www.norconex.com/collectors/collector-core/
Apache License 2.0

Enrich configuration loader validation messages with context #26

Open mariuspruski opened 5 years ago

mariuspruski commented 5 years ago

Context: The mechanism that loads crawler configurations runs a validator over the provided XML and warns the developer about syntax errors. Unfortunately, these warnings carry no context or location information, so in a large project it is very tedious to track down the exact cause of a warning.

Ideal solution: It would be great if the warning messages were qualified by a line number, a file name, or even just the ID of the crawler config in which they occur.
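A minimal sketch of what such qualified messages could look like, using only the JDK's standard SAX `ErrorHandler` and `SAXParseException` (which expose the system ID, line, and column). This is not how collector-core validates its configuration; the class name `ContextualXmlWarnings`, the `check` and `describe` helpers, the message format, and the sample XML are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical illustration only; not part of collector-core.
public class ContextualXmlWarnings {

    /** Formats a parser problem with crawler ID, file name, line, and column. */
    static String describe(String crawlerId, SAXParseException e) {
        return String.format("[crawler=%s] %s:%d:%d: %s",
                crawlerId, e.getSystemId(), e.getLineNumber(),
                e.getColumnNumber(), e.getMessage());
    }

    /** Parses the XML and returns every reported problem, qualified with context. */
    static List<String> check(String crawlerId, String fileName, String xml)
            throws Exception {
        List<String> problems = new ArrayList<>();
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) {
                problems.add(describe(crawlerId, e));
            }
            @Override public void error(SAXParseException e) {
                problems.add(describe(crawlerId, e));
            }
            @Override public void fatalError(SAXParseException e) {
                problems.add(describe(crawlerId, e));
            }
        });
        InputSource source = new InputSource(new StringReader(xml));
        source.setSystemId(fileName); // lets the parser report the file name
        try {
            builder.parse(source);
        } catch (SAXException e) {
            // The parser stops after a fatal error; it was already recorded above.
        }
        return problems;
    }

    public static void main(String[] args) throws Exception {
        // Mismatched closing tag on line 3 (assumed sample input).
        String xml = "<crawler id=\"my-crawler\">\n  <maxDepth>2</maxDepth>\n</crawlr>";
        check("my-crawler", "crawler-config.xml", xml).forEach(System.out::println);
    }
}
```

Each reported problem then names the crawler, the file, and the position, so it can be located without raising the overall log level. The parser may expand a relative file name into a full URI.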

Suggestion: My workaround so far is to raise the Norconex log level to DEBUG so that the "Crawler configuration loaded: x" message appears after the warnings, which helps me locate the issue. However, the DEBUG level prints far too much noise.

(1). My first suggestion is to log the "Crawler configuration loaded: x" message (CrawlerConfigLoader.java, line 83) at the INFO level, since I find this information much more important than the other DEBUG-level messages.

(2). Furthermore, I suggest moving the following log messages of AbstractCrawlerConfig and the collector-http module to the DEBUG level, as they appear less important than the loading of an entire crawler configuration:

If we agree on a solution, I can submit a PR containing the relevant changes.
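The effect of suggestions (1) and (2) can be sketched with `java.util.logging`, which stands in here for whatever logging framework collector-core actually uses (JUL's `FINE` plays the role of DEBUG). The class `LogLevelSketch`, the logger name, and the "Reading option" message are hypothetical; only the "Crawler configuration loaded" text comes from the issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical illustration only; not part of collector-core.
public class LogLevelSketch {

    /** Simulates loading one crawler config and logging at the proposed levels. */
    static void loadCrawler(Logger log, String crawlerId) {
        // Per-option detail: proposed to stay at (or move to) DEBUG, i.e. FINE in JUL.
        log.fine("Reading option 'x' for " + crawlerId);
        // Per-crawler summary: proposed to be promoted to INFO.
        log.info("Crawler configuration loaded: " + crawlerId);
    }

    /** Returns the messages that are visible at the given logger threshold. */
    static List<String> visibleMessages(Level threshold, String crawlerId) {
        List<String> seen = new ArrayList<>();
        Logger log = Logger.getLogger("sketch.CrawlerConfigLoader");
        log.setUseParentHandlers(false); // keep the console quiet in this sketch
        log.setLevel(threshold);
        Handler capture = new Handler() {
            @Override public void publish(LogRecord r) { seen.add(r.getMessage()); }
            @Override public void flush() { }
            @Override public void close() { }
        };
        log.addHandler(capture);
        try {
            loadCrawler(log, crawlerId);
        } finally {
            log.removeHandler(capture);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(visibleMessages(Level.INFO, "my-crawler"));
        System.out.println(visibleMessages(Level.FINE, "my-crawler"));
    }
}
```

With the threshold at INFO, only the per-crawler summary is captured, which is the behaviour the two suggestions aim for.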

essiembre commented 5 years ago

Feel free to submit a PR. If you are just adjusting what gets logged a bit, I do not foresee any issues.