NatLibFi / Annif

Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums.
https://annif.org

Suppress TensorFlow info messages to debug level #721

Closed · juhoinkinen closed this 11 months ago

juhoinkinen commented 12 months ago

Makes the TensorFlow log level follow the Annif log level (set with the `--verbosity`/`-v` option of CLI commands), using the `TF_CPP_MIN_LOG_LEVEL` environment variable.

TensorFlow's INFO messages are shown only when Annif runs at DEBUG level, which hides the distracting TensorFlow INFO messages by default. Closes #720. Based on this SO answer.

The env variable is set in `annif/__init__.py` so that the log level change also applies when TF is imported by the spaCy analyzer (which was not apparent at first), in addition to TF usage by the NN ensemble backend.
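The approach described above can be sketched roughly like this (the function name and exact mapping are illustrative, not the PR's actual code; the key point is that `TF_CPP_MIN_LOG_LEVEL` must be set before TensorFlow is first imported, directly or via spaCy):

```python
import logging
import os


def set_tf_logging_level(log_level: int) -> None:
    """Map an Annif (Python logging) level to TF_CPP_MIN_LOG_LEVEL.

    TF_CPP_MIN_LOG_LEVEL semantics: 0 = show all messages,
    1 = filter out INFO, 2 = filter out INFO and WARNING,
    3 = filter out everything except FATAL.
    """
    # Show TF INFO messages only when Annif itself runs at DEBUG level
    tf_level = "0" if log_level <= logging.DEBUG else "1"
    # setdefault keeps a value exported by the user on the command line
    os.environ.setdefault("TF_CPP_MIN_LOG_LEVEL", tf_level)
```

Using `setdefault` would let a user still override the level externally, e.g. `TF_CPP_MIN_LOG_LEVEL=0 annif eval ...`.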

I am not seeing the second message mentioned in the issue:

2023-07-25 11:31:44.032688: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.

I assumed this message also arises during TF import, so this PR should fix that part of the issue too (Edit: no, it did not, but the current approach should take care of it). Could @osma verify this?

codecov[bot] commented 12 months ago

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (02233ab) 99.66% compared to head (80c0fe2) 99.67%.

Additional details and impacted files

```diff
@@           Coverage Diff            @@
##             main     #721    +/-   ##
========================================
  Coverage   99.66%   99.67%
========================================
  Files          89       89
  Lines        6356     6380     +24
========================================
+ Hits         6335     6359     +24
  Misses         21       21
```

| [Files Changed](https://app.codecov.io/gh/NatLibFi/Annif/pull/721?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=NatLibFi) | Coverage Δ |
|---|---|
| [annif/\_\_init\_\_.py](https://app.codecov.io/gh/NatLibFi/Annif/pull/721?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=NatLibFi#diff-YW5uaWYvX19pbml0X18ucHk=) | `100.00% <100.00%> (ø)` |
| [tests/test\_cli.py](https://app.codecov.io/gh/NatLibFi/Annif/pull/721?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=NatLibFi#diff-dGVzdHMvdGVzdF9jbGkucHk=) | `100.00% <100.00%> (ø)` |


osma commented 12 months ago

I tested this, but it didn't seem to do anything. After merging this PR branch to my local main branch, I'm seeing this:

(annif-py3.10) ~/git/Annif$ annif eval fasttext-spacy tests/corpora/archaeology/fulltext/
2023-07-31 14:53:46.253944: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-07-31 14:53:46.327325: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.

This is exactly the same messages I'm seeing without this PR applied.

juhoinkinen commented 12 months ago

> I tested this, but it didn't seem to do anything. After merging this PR branch to my local main branch, I'm seeing this:
>
> ```
> (annif-py3.10) ~/git/Annif$ annif eval fasttext-spacy tests/corpora/archaeology/fulltext/
> 2023-07-31 14:53:46.253944: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
> To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
> 2023-07-31 14:53:46.327325: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
> ```
>
> This is exactly the same messages I'm seeing without this PR applied.

Hmh, I had tested with just the `annif list-projects` command, which only imports TF. But the messages also arise when TF is actually used, triggered by `annif suggest`, `annif eval` etc. commands.

Now the TF log level is kept at 1 also after imports. However, the level can still be controlled by setting the env variable on the command line:


TF_CPP_MIN_LOG_LEVEL=0 annif eval nn-ensemble-fi tests/corpora/archaeology/fulltext/
2023-07-31 16:05:09.258908: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
osma commented 12 months ago

This didn't help either. I'm still getting the same messages.

I added a bit of debugging and it seems to me that the code you've inserted is not being run at all when I run the annif eval command above. Note that the command is not attempting to use an NN ensemble project, and I don't even have any such projects configured.

I also tried commenting out all the tensorflow/keras imports in annif/backend/nn_ensemble.py - even that didn't help. It seems that tensorflow is being indirectly imported by something outside the annif codebase.

A-ha! It seems that importing spaCy also imports TensorFlow if it has been installed:

$ python
Python 3.10.6 (main, May 29 2023, 11:10:38) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import spacy
2023-07-31 17:26:16.049062: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-07-31 17:26:16.129118: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
juhoinkinen commented 12 months ago

Well, this should now work even when TensorFlow is used by spaCy.

osma commented 12 months ago

One question though: would it make sense to set the TF logging level based on the Annif (logging module) log level?

juhoinkinen commented 12 months ago

> One question though: would it make sense to set the TF logging level based on the Annif (logging module) log level?

It would be nice, but then the question is how the levels should be aligned: TF's INFO messages are the ones we want to hide, while at the same time showing Annif's own INFO messages(?).

One way to do this would be to make the TF log level "one more" than the Annif log level (while Annif shows INFO+WARN+ERROR messages, TF shows only WARN+ERROR; while Annif shows DEBUG+INFO+WARN+ERROR, TF shows INFO+WARN+ERROR).
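The "one more" idea could be sketched as a simple lookup (names and the exact mapping are illustrative; `TF_CPP_MIN_LOG_LEVEL` semantics: 0 shows all messages, 1 filters INFO, 2 filters INFO and WARNING, 3 filters everything except FATAL):

```python
import logging

# Hypothetical mapping: keep TensorFlow one level "quieter" than Annif
ANNIF_TO_TF_LEVEL = {
    logging.DEBUG: "0",    # Annif at DEBUG -> TF shows INFO and up
    logging.INFO: "1",     # Annif at INFO  -> TF shows WARNING and up
    logging.WARNING: "2",  # Annif at WARN  -> TF shows ERROR and up
    logging.ERROR: "3",    # Annif at ERROR -> TF shows only FATAL
}


def tf_level_for(annif_level: int) -> str:
    """Return the TF_CPP_MIN_LOG_LEVEL value for a given Annif log level."""
    # Default to "1" (hide TF INFO) for any unmapped level
    return ANNIF_TO_TF_LEVEL.get(annif_level, "1")
```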

osma commented 12 months ago

The "one more" idea sounds very good! I think it's reasonable to "tone down" log messages from TF by one level. They may sometimes be relevant for debugging problems, so making it possible (but not the default) to see messages from TF would be a nice feature.

juhoinkinen commented 12 months ago

Now the feature works for regular commands: TensorFlow INFO messages are shown only when Annif is run at DEBUG log level.

annif list-projects -v DEBUG
debug: creating flask app with configuration annif.default_config.Config
debug: Reading configuration file projects.cfg in CFG format
2023-08-02 10:50:42.339259: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

But I could not make it work in API server mode, i.e. with `annif run` or when run via gunicorn. I think it should have worked, according to a SO answer, so there might be something out of tune in the overall logging functionality.

juhoinkinen commented 12 months ago

The test for this needs to be fixed... I'll try something with the @mock.patch.dict(os.environ, ...) decorator.
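A test along those lines might look like this sketch (`configure_tf_logging` is a hypothetical stand-in for the actual helper; `mock.patch.dict` snapshots `os.environ` and restores it after the test, so the env mutation does not leak into other tests):

```python
import os
from unittest import mock


def configure_tf_logging(verbosity: str) -> None:
    """Hypothetical helper: set TF_CPP_MIN_LOG_LEVEL from a verbosity name."""
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = "0" if verbosity == "DEBUG" else "1"


@mock.patch.dict(os.environ, {}, clear=False)
def test_tf_log_level_debug():
    configure_tf_logging("DEBUG")
    assert os.environ["TF_CPP_MIN_LOG_LEVEL"] == "0"


@mock.patch.dict(os.environ, {}, clear=False)
def test_tf_log_level_default():
    configure_tf_logging("INFO")
    assert os.environ["TF_CPP_MIN_LOG_LEVEL"] == "1"
```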

sonarcloud[bot] commented 11 months ago

Kudos, SonarCloud Quality Gate passed!

- Bug: A (0 Bugs)
- Vulnerability: A (0 Vulnerabilities)
- Security Hotspot: A (0 Security Hotspots)
- Code Smell: A (0 Code Smells)

No Coverage information
0.0% Duplication