taspinar / twitterscraper

Scrape Twitter for Tweets
MIT License
2.39k stars 579 forks source link

Rework logging management #261

Closed LinqLover closed 4 years ago

LinqLover commented 4 years ago

See #260. Please let me know if further formalities are required!

Changelog

LinqLover commented 4 years ago

@taspinar Would you mind to merge or review? :-)

simonm3 commented 4 years ago

Best practice is that the application should not set the logging level but just use its own logger or loggers. Then the user can configure their own logging including loglevel for each logger. This keeps logging independent of applications and enables you to route messages by application.

For example set at the top of each file:

log=logging.getLogger(__file__)

Note also the user may be calling other functions apart from this one. I doubt you want to add the log level as a parameter to each function.

LinqLover commented 4 years ago

Fair point. Can you explain your proposal more in detail? Do we have to change the twitterscraper for this at all?

simonm3 commented 4 years ago
LinqLover commented 4 years ago

Thanks for the clarification.

  • change any log = getLogger() statements to: log = getLogger(__file__)

Are you sure about this? Wouldn't it be better to have one logger for the whole twitterscraper module? If I try this, I cannot specify the log level for twitterscraper from the outside:

logging.getLogger('twitterscraper').level = logging.DEBUG
logging.getLogger().level = logging.INFO
list_of_tweets = query_tweets("Trump OR Clinton", 10)

Then pdb reveals that logger.level in query.py is set to 0.

PS: Please look at the code, I pushed a working commit.

simonm3 commented 4 years ago

AHHHH. Sorry I meant name

You could choose to use just "scraper". The advantage of name is that you have slightly more control e.g. could turn on logging for scraper or scraper.search.

On Fri, 20 Mar 2020 at 11:58, Christoph Thiede notifications@github.com wrote:

Thanks for the clarification.

  • change any log = getLogger() statements to: log = getLogger(file)

Are you sure about this? Wouldn't it be better to have one logger for the whole twitterscraper module? If I try this, I cannot specify the log level for twitterscraper from the outside:

logging.getLogger('twitterscraper').level = logging.DEBUG logging.getLogger().level = logging.INFO list_of_tweets = query_tweets("Trump OR Clinton", 10)

Then pdb reveals that logger.level in query.py is set to 0.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/taspinar/twitterscraper/pull/261#issuecomment-601663880, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJE32PL62Y2R4FO525R2ZDRINK77ANCNFSM4KQ5IK5Q .

LinqLover commented 4 years ago

Ah, I see! But this would still require every client to set the log level for each single twitterscraper file? Personally, I think this in an overhead for a project of this size.

simonm3 commented 4 years ago

You can set it at module or submodule level. So if there were ts.seach and ts.users then you can getLogger("ts").setLevel(logging.WARNING) to silence both of them.

However using just the one logger is also fine. Any user can configure the log message to include filename if they want it.

On Fri, 20 Mar 2020 at 17:02, Christoph Thiede notifications@github.com wrote:

Ah, I see! But this would still require every client to set the log level for each single twitterscraper file? Personally, I think this in an overhead for a project of this size.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/taspinar/twitterscraper/pull/261#issuecomment-601809314, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJE32IDDHWV2YMXWZJAA63RIOOT7ANCNFSM4KQ5IK5Q .

LinqLover commented 4 years ago

Well, what we could do would be to declare logger in each file as logger = logging.getLogger(f'twitterscraper.{__name__}'). That sounds interesting ...

simonm3 commented 4 years ago

name already does that so you don't have to.

On Fri, 20 Mar 2020, 20:41 Christoph Thiede, notifications@github.com wrote:

Well, what we could do would be to declare logger in each file as logger = logging.getLogger(f'twitterscraper.{name}'). That sounds interesting ...

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/taspinar/twitterscraper/pull/261#issuecomment-601901660, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJE32PLF7H3E4JX3IY6E3LRIPIIJANCNFSM4KQ5IK5Q .

LinqLover commented 4 years ago

Hm, I came to the conclusion that nested logger names don't scale for this project. We have basically two files that use loggers at all, main and query, of which main only sends a single message to the logger. For this reason, I think this PR does not need further modification. Would you agree? :)

simonm3 commented 4 years ago

sounds good to me.

On Sat, 21 Mar 2020 at 14:29, Christoph Thiede notifications@github.com wrote:

Hm, I came to the conclusion that nested logger names don't scale for this project. We have basically two files that use loggers at all, main and query, of which main only sends a single message to the logger. For this reason, I think this PR does not need further modification. Would you agree? :)

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/taspinar/twitterscraper/pull/261#issuecomment-602051125, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJE32NYGE6QQQBZHC3T3I3RITFMBANCNFSM4KQ5IK5Q .

taspinar commented 4 years ago

Looks like an very good addition twitterscraper. Thanks for this MR.

LinqLover commented 4 years ago

Thanks a lot for merging!