elastic / crawler

Add timestamps to system logger #69

Open navarone-feekery opened 2 months ago

navarone-feekery commented 2 months ago

Problem Description

The system logger doesn't include timestamps in its output. We should add them.

Proposed Solution

The current format is [crawl_id] [crawl_stage] <log message>, and it is set here in the codebase.

It outputs logs like this:

[crawl:66a8f9bf1016c9206ee456a3] [primary] Crawling site...

The timestamp should be appended after crawl_stage, so something like:

[crawl:66a8f9bf1016c9206ee456a3] [primary] [2024-07-30T12:00:00Z000] Crawling site...
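
A minimal sketch of the proposed format, using only Ruby's standard-library Logger. It is not taken from the crawler's codebase: the choice of Ruby, the helper name build_system_logger, and its keyword arguments are assumptions made for illustration. It also assumes an ISO 8601 UTC timestamp with the milliseconds placed before the Z, which differs slightly from the example line above.

```ruby
require 'logger'
require 'stringio'
require 'time' # provides Time#iso8601

# Hypothetical helper (not part of the crawler): builds a Logger whose lines
# look like "[crawl:<id>] [<stage>] [<timestamp>] <message>".
def build_system_logger(io, crawl_id:, crawl_stage:)
  logger = Logger.new(io)
  # Logger#formatter= accepts a callable that receives
  # (severity, time, progname, msg) and returns the line to write.
  logger.formatter = proc do |_severity, time, _progname, msg|
    # Assumed timestamp style: UTC with millisecond precision.
    timestamp = time.getutc.iso8601(3)
    "[crawl:#{crawl_id}] [#{crawl_stage}] [#{timestamp}] #{msg}\n"
  end
  logger
end

# Write to an in-memory buffer so the result can be inspected.
io = StringIO.new
logger = build_system_logger(io, crawl_id: '66a8f9bf1016c9206ee456a3', crawl_stage: 'primary')
logger.info('Crawling site...')
puts io.string
```

Keeping the timestamp in the formatter means every call site picks it up without changes, which is one reason to prefer this over prepending the time to each message by hand.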

yashathwani commented 1 month ago

Can I work on this issue? Could you please assign it to me?

navarone-feekery commented 1 month ago

Hi @yashathwani, sorry for the late response. Sure, you're welcome to work on this issue 🙂 Let me know if you're still interested.

yashathwani commented 1 month ago

Hi @navarone-feekery, thank you for getting back to me! I'm definitely still interested in working on this issue.

navarone-feekery commented 1 month ago

@yashathwani great! Apologies if you're already aware, but you will need to run this locally to develop on it. There are instructions for setting it up in the main README.md: https://github.com/elastic/crawler/blob/main/README.md#running-open-crawler-from-source

If you don't have an Elasticsearch instance to test against, you can set the output_sink to console. The system logger works the same regardless of the output sink.

If you have any questions or run into trouble getting the project running, let me know.