algolia / docsearch-scraper

DocSearch - Scraper
https://docsearch.algolia.com/

Output CI-friendly progress messages #422

Open mojavelinux opened 5 years ago

mojavelinux commented 5 years ago

Currently, the scraper assumes it's writing progress messages to an ANSI-compatible terminal. As a result, the progress messages look like this in a CI environment:

[94m> DocSearch: [0mhttps://docs.couchbase.com/server/6.0/introduction/intro.html ([93m51 records[0m)
[94m> DocSearch: [0mhttps://docs.couchbase.com/home/contribute/includes.html ([93m23 records[0m)
[94m> DocSearch: [0mhttps://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/index.html ([93m28 records[0m)

Either add an option to output plain messages or automatically detect if ANSI color codes are not supported.
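As a rough illustration of the auto-detection idea, here is a minimal Python sketch. It is not the scraper's actual code: the names `supports_color` and `format_progress` are hypothetical, and the checks (a TTY test, the informal `NO_COLOR` environment variable, `TERM=dumb`) are just one plausible heuristic. The color codes match the ones visible in the output above.

```python
import os
import sys

# ANSI SGR codes matching the ones visible in the CI output above.
BLUE = "\033[94m"
YELLOW = "\033[93m"
RESET = "\033[0m"


def supports_color(stream=sys.stdout, env=os.environ):
    """Guess whether ANSI color codes are safe to write to `stream`.

    Hypothetical helper: this is a heuristic, not a guarantee.
    """
    if env.get("NO_COLOR"):  # informal convention: non-empty value disables color
        return False
    if env.get("TERM") == "dumb":
        return False
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


def format_progress(url, record_count, color):
    """Build one progress line, with or without ANSI color codes."""
    if color:
        return (
            f"{BLUE}> DocSearch: {RESET}{url} "
            f"({YELLOW}{record_count} records{RESET})"
        )
    return f"> DocSearch: {url} ({record_count} records)"


if __name__ == "__main__":
    use_color = supports_color()
    print(format_progress("https://example.com/intro.html", 51, use_color))
```

A TTY check alone would turn colors off on CI systems that can render them, so an explicit option to force colors on or off would probably still be useful alongside the detection.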

The easiest way to accomplish this might be to route the messages through a logger which can be configured separately. I'd also be interested in silencing the messages completely, which a logger would also help with.
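To make the logger suggestion concrete, here is a minimal sketch using Python's standard `logging` module. The logger name `docsearch.scraper` and the functions `configure_progress_logging` and `report_page` are assumptions made for illustration; the scraper does not necessarily expose anything like them.

```python
import logging
import sys

# Hypothetical logger name; the scraper may not define one.
logger = logging.getLogger("docsearch.scraper")


def configure_progress_logging(level=logging.INFO, stream=None):
    """Send progress messages to `stream` as plain text (no ANSI codes)."""
    target = stream if stream is not None else sys.stdout
    handler = logging.StreamHandler(target)
    handler.setFormatter(logging.Formatter("> DocSearch: %(message)s"))
    logger.handlers[:] = [handler]  # replace any previously attached handlers
    logger.setLevel(level)
    logger.propagate = False  # avoid duplicate output via the root logger
    return logger


def report_page(url, record_count):
    """Emit one progress message for a scraped page."""
    logger.info("%s (%d records)", url, record_count)


if __name__ == "__main__":
    configure_progress_logging()
    report_page("https://example.com/intro.html", 51)

    # Raising the level silences the progress messages completely.
    configure_progress_logging(level=logging.WARNING)
    report_page("https://example.com/hidden.html", 23)
```

With this shape, silencing the output is just a matter of raising the level, and a CI job could pick the level or formatter without touching the scraping code.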

mojavelinux commented 5 years ago

I should note that not all CI environments have this problem. For instance, GitLab CI is capable of showing ANSI color codes. Jenkins, on the other hand, is not.

s-pace commented 5 years ago

Having a proper logger is one of our objectives. There is no ETA so far; we will solve this while moving our codebase to a proper Python 3/Scrapy integration.

mojavelinux commented 5 years ago

:+1:

If you need help, don't hesitate to ask. I'll be using docsearch for the foreseeable future, so I'll be around.

s-pace commented 5 years ago

Thanks! Send us an email at docsearch@algolia.com; we have a small gift for you :)