euagendas / m3inference

A deep learning system for demographic inference (gender, age, and individual/person) that was trained on a massive Twitter dataset using profile images, screen names, names, and biographies
http://www.euagendas.org
GNU Affero General Public License v3.0

Optional Debugging #17

Closed. Jaggler3 closed this 3 years ago

Jaggler3 commented 3 years ago

Added a parameter to M3Twitter and M3Inference constructors to specify if logging should be shown.

For example: m3twitter = M3Twitter(cache_dir="twitter_cache", debug=False)

Setting debug=False disables the logger. The debug value is also passed down to imported functions that have no access to the M3Inference object.

These functions are listed as follows: In m3inference/utils.py:

In m3inference/preprocess.py:

zijwang commented 3 years ago

Hi @Jaggler3, thanks for the PR. To better understand your use case, could you let us know why you need the ability to disable the logger?

Jaggler3 commented 3 years ago

Hi, thanks for taking a look. I needed to disable logging to run this package as a Python-based Apache Storm bolt. The official Python bolt code communicates with Storm over stdio, so the printing and the progress bar started to cause issues with that communication.
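
To illustrate the failure mode described above, here is a minimal sketch. It assumes a hypothetical line-delimited JSON protocol (`parse_messages` and the message shape are illustrative only, not Storm's actual wire format) and shows how a single stray progress line on the same stream breaks parsing.

```python
import json


def parse_messages(raw_output):
    """Parse one JSON message per non-empty line (hypothetical protocol)."""
    return [json.loads(line) for line in raw_output.splitlines() if line.strip()]


# A well-formed protocol message, and the same message preceded by stray output.
clean = '{"command": "emit", "tuple": ["a"]}\n'
polluted = "Predicting...: 100%\n" + clean

ok = parse_messages(clean)

try:
    parse_messages(polluted)
    corrupted = False
except json.JSONDecodeError:
    # The progress line is not valid JSON, so the reader fails on it.
    corrupted = True
```

Anything that shares the protocol stream has to stay silent, which is why both the logger and the progress bar matter here.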

computermacgyver commented 3 years ago

Thanks, @Jaggler3, for making these comments. I agree in general that it is good to have a way to disable debugging output. @zijwang, the main question I have is simply whether we set debug=False or debug=True as the default when the option is not passed in. Choosing debug=True would ensure no change in behaviour from the current version, but False also feels like a good default in general. I'm happy for you to do whichever you feel is best, @zijwang.

zijwang commented 3 years ago

Thanks, @Jaggler3 and @computermacgyver !

@Jaggler3 : would adding the following two lines of code just work if you want to disable the logging message?

import logging
logging.basicConfig(level=logging.WARN)

Full log:

>>> import logging
>>> logging.basicConfig(level=logging.WARN)
>>> from m3inference import M3Inference
>>> m3 = M3Inference()
141007KB [01:47, 1310.88KB/s]
>>> pred = m3.infer('./m3inference/test/data_resized.jsonl')
Predicting...: 100%|████████████████████████████| 1/1 [00:00<00:00,  3.38it/s]

Jaggler3 commented 3 years ago

@zijwang That could simplify things for sure, though the progress bar would still need to be disabled somehow as well. I check for the debug flag and use tqdm or not based on that.

Maybe a check of the logging.root.level could work instead for that?

zijwang commented 3 years ago

@Jaggler3 Yes exactly. Checking the logging level and setting it to WARN (or higher in case WARN didn't work for edge cases) when debug=False would be a good solution.

For tqdm, could you see whether the disable option works?
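
The level-based approach suggested above could be sketched roughly as follows. The helper name `apply_debug_flag` and the logger name `m3_demo` are hypothetical and are not part of the M3 code base; this only shows the idea of mapping a boolean flag onto a logging level.

```python
import logging


def apply_debug_flag(debug, logger_name="m3_demo"):
    """Hypothetical helper: map a boolean debug flag onto a logger level."""
    logger = logging.getLogger(logger_name)
    # With debug on, INFO and above pass the level check;
    # with debug off, only WARNING and above do.
    logger.setLevel(logging.INFO if debug else logging.WARNING)
    return logger


quiet = apply_debug_flag(False)
quiet_allows_info = quiet.isEnabledFor(logging.INFO)

verbose = apply_debug_flag(True)
verbose_allows_info = verbose.isEnabledFor(logging.INFO)
```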

computermacgyver commented 3 years ago

> For tqdm, could you see whether the disable option works?

Nice. I've tested redirecting standard error to a file. Without disable=None the progress bar appears in the file, but with disable=None (which tqdm treats as "disable on non-TTY") the file is empty. I guess this would work on Storm as well, but I haven't checked.

$ python mwe.py 2>err
$ cat err
$
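
The behaviour reported above matches a terminal check on the output stream. The following is a stdlib-only sketch of that kind of check, not tqdm's actual implementation; `stream_is_interactive` is a hypothetical name, and an in-memory buffer stands in for stderr redirected to a file.

```python
import io
import sys


def stream_is_interactive(stream=None):
    """Return True only if the stream reports being attached to a terminal."""
    stream = sys.stderr if stream is None else stream
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


# An in-memory buffer plays the role of stderr redirected with 2>err.
redirected = io.StringIO()
show_bar = stream_is_interactive(redirected)
```

Here `show_bar` is false for the redirected stream, so a progress bar guarded by this check would write nothing to the file.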

Jaggler3 commented 3 years ago

Just updated the PR. It removes all the checks I added and disables the progress bars if logging.root.level >= logging.WARN. Calling logging.basicConfig(level=logging.WARN) before from m3inference import M3Twitter removes all output.
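
A minimal sketch of the check described above, assuming a hypothetical helper named `progress_disabled` (the name and the `threshold` parameter are illustrative; the PR may structure this differently):

```python
import logging


def progress_disabled(threshold=logging.WARNING):
    """Return True when the root logger level is at or above the threshold."""
    # logging.getLogger() with no name returns the root logger,
    # the same object as logging.root.
    return logging.getLogger().level >= threshold
```

With tqdm, the result could be passed through the `disable` argument, for example `tqdm(iterable, disable=progress_disabled())`. Note that the root logger defaults to WARNING when nothing has configured it, so this check only shows progress bars after something has lowered the root level.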

zijwang commented 3 years ago

Thanks for updating the PR, @Jaggler3 ! The current update seems to only change the tqdm bar but not the logger output. Will this work for your use case (i.e., you will have to change the default logging level outside M3 to suppress logger output)?

Jaggler3 commented 3 years ago

@zijwang Right, if the tqdm bar responds to the logging level I can stop logging for the whole package with that one logging.basicConfig call. This works for my use case.

One note: logging.basicConfig had to be called before m3inference was imported, because basicConfig only takes effect the first time it configures the root logger (later calls are ignored once a handler is installed), and the package calls it near the beginning of the m3inference.py and m3twitter.py files.
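
The ordering constraint can be reproduced with the standard library alone. In this sketch the second `basicConfig` call stands in for the one a library makes at import time; `demo_first_call_wins` is a hypothetical name, and the handler reset at the top exists only to make the demonstration repeatable.

```python
import logging


def demo_first_call_wins():
    """Show that a second basicConfig call does not change the root level."""
    root = logging.getLogger()
    # Reset the root logger so the demonstration starts from a clean state.
    for handler in list(root.handlers):
        root.removeHandler(handler)

    logging.basicConfig(level=logging.WARNING)  # the application's call, made first
    logging.basicConfig(level=logging.INFO)     # stand-in for a library's later call
    return root.level


level_after = demo_first_call_wins()
```

After both calls the root level is still `logging.WARNING`. On Python 3.8 and later, passing `force=True` to `basicConfig` replaces the existing handlers instead, which is an alternative when the import order cannot be controlled.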

zijwang commented 3 years ago

Hey @Jaggler3 !

I have created a PR (#18) to support the functionality of disabling logging information and tqdm bars based on your code. Could you check whether the PR works in your use case (especially if you are using M3Twitter -- I currently do not have Twitter auth and was not able to test that component)?

zijwang commented 3 years ago

@Jaggler3 also a reminder that M3 is licensed under AGPL 3.0 (see https://github.com/euagendas/m3inference/blob/master/LICENSE). If you want to use M3 commercially, you will need to open source your complete source code. You will also need to contact us directly as requested in the readme. Feel free to let us know if you have any questions on this matter.