Closed sjiang95 closed 2 months ago
Thanks for the suggestion of improvement, @sjiang95 ! If you would like to add this feature and a test in a draft PR, it would be awesome!
A few notes in case you would like to send a PR:
Add encoding: Optional[str] = None as the last argument:
def setup_logger(
    name: Optional[str] = "ignite",
    level: int = logging.INFO,
    stream: Optional[TextIO] = None,
    format: str = "%(asctime)s %(name)s %(levelname)s: %(message)s",
    filepath: Optional[str] = None,
    distributed_rank: Optional[int] = None,
    reset: bool = False,
    encoding: Optional[str] = None,  # add this optional arg
) -> logging.Logger:
Should we set encoding="utf-8" by default, so that users do not have to think about it and get correct logging out of the box?
Hello @vfdev-5,
thanks for your response.
I would also like to set encoding="utf-8" by default.
@sjiang95 by the way, what is your OS? Windows?
By default, logging's FileHandler uses the same default encoding as open: https://docs.python.org/3/library/logging.handlers.html#filehandler
which is
In text mode, if encoding is not specified the encoding used is platform-dependent: locale.getencoding() is called to get the current locale encoding.
Source: https://docs.python.org/3/library/functions.html#open
and
On Android and VxWorks, return "utf-8". On Unix, return the encoding of the current LC_CTYPE locale. Return "utf-8" if nl_langinfo(CODESET) returns an empty string: for example, if the current LC_CTYPE locale is not supported. On Windows, return the ANSI code page.
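To see which encoding a given machine would pick when none is specified, a quick check along these lines may help. This is a minimal sketch: `default_text_encoding` is a hypothetical helper name, and since `locale.getencoding()` only exists on Python 3.11+, the sketch assumes a fallback to `locale.getpreferredencoding(False)` on older versions. Note that Python's UTF-8 mode can make `open` use UTF-8 regardless of the locale.

```python
import locale
import sys


def default_text_encoding() -> str:
    """Return the current locale encoding (hypothetical helper).

    Outside of Python's UTF-8 mode, this is what open() uses for text
    files when no encoding is passed.
    """
    if sys.version_info >= (3, 11):
        return locale.getencoding()
    # Older Pythons: passing False reads the locale without modifying it.
    return locale.getpreferredencoding(False)


if __name__ == "__main__":
    # Print the platform next to the encoding so results from
    # different machines can be compared.
    print(sys.platform, default_text_encoding())
```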
Testing your PR on Linux, even with encoding=None we already get UTF-8 encoding:
fp = dirname / "log"
logger = setup_logger(name="logger", filepath=fp, encoding=None)
logger.info("你好")  # ni hao
with open(fp, "r") as h:
    data = h.readlines()
assert "你好" in data[0]
So your PR will fix this for other OSes and setups.
@vfdev-5 yes, I encountered this problem on Windows.
How do you think we should deal with this? How about importing platform and using platform.system() to limit the behaviour to Windows only?
We can use encoding="utf-8" by default and in the tests check the platform as you are suggesting.
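A platform-aware test along these lines could look like the sketch below. It is only an illustration under stated assumptions: `make_file_logger`, `close_handlers` and the two `check_*` functions are hypothetical names, and the logger is built directly on the standard library's `logging.FileHandler` rather than on ignite's `setup_logger`. The second check is skipped on Windows (and on any machine whose locale encoding is not UTF-8), because there `encoding=None` is not expected to produce UTF-8.

```python
import codecs
import locale
import logging
import platform
import tempfile
from pathlib import Path
from typing import Optional


def make_file_logger(name: str, filepath: Path, encoding: Optional[str] = "utf-8") -> logging.Logger:
    """Hypothetical stand-in for setup_logger: a logger with one FileHandler."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.FileHandler(filepath, encoding=encoding)
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s: %(message)s"))
    logger.addHandler(handler)
    return logger


def close_handlers(logger: logging.Logger) -> None:
    """Close and detach handlers so the file can be read and deleted (matters on Windows)."""
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)


def check_utf8_default(dirname: Path) -> None:
    """With the UTF-8 default, the round trip should hold on every platform."""
    fp = dirname / "log"
    logger = make_file_logger("utf8_default", fp)
    logger.info("你好")
    close_handlers(logger)
    # Decode the raw bytes explicitly so the check ignores the reader's locale.
    assert "你好" in fp.read_bytes().decode("utf-8")


def check_locale_default(dirname: Path) -> None:
    """With encoding=None, UTF-8 output is only expected where the locale is UTF-8."""
    locale_is_utf8 = codecs.lookup(locale.getpreferredencoding(False)).name == "utf-8"
    if platform.system() == "Windows" or not locale_is_utf8:
        return  # the locale encoding is used here, so skip the UTF-8 assertion
    fp = dirname / "log"
    logger = make_file_logger("locale_default", fp, encoding=None)
    logger.info("你好")
    close_handlers(logger)
    assert "你好" in fp.read_bytes().decode("utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        check_utf8_default(Path(tmp))
    with tempfile.TemporaryDirectory() as tmp:
        check_locale_default(Path(tmp))
```

In a real test suite the early return would more naturally be a skip marker from the test framework; the plain `return` just keeps the sketch dependency-free.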
🚀 Feature
Add an optional argument encoding = None for ignite.utils.setup_logger
Motivation
I encountered garbled text when logging messages that contain CJK characters using logger.info(). For example, in the .log file, garbled text is printed.
Solution
This can be addressed by simply passing encoding = "utf-8" to the file handler: https://github.com/pytorch/ignite/blob/f431e60b09743dc8d99b7e5f32e234f46a2a920d/ignite/utils.py#L268
Looking forward to a discussion about this before I create a PR.
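The fix described under Solution can be illustrated with the standard library alone. The following is a minimal sketch, not ignite's actual code: `write_log_line` and the logger name `encoding_sketch` are hypothetical, and the only assumption carried over from the issue is that `logging.FileHandler` accepts an `encoding` argument, which it does.

```python
import logging
import os
import tempfile


def write_log_line(filepath: str, message: str, encoding: str = "utf-8") -> bytes:
    """Log one message via a FileHandler with an explicit encoding; return the raw file bytes."""
    handler = logging.FileHandler(filepath, encoding=encoding)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger = logging.getLogger("encoding_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    try:
        logger.info(message)
    finally:
        # Detach and close so the file is flushed and can be removed afterwards.
        logger.removeHandler(handler)
        handler.close()
    with open(filepath, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        raw = write_log_line(os.path.join(tmp, "run.log"), "你好")
        # The bytes on disk decode cleanly as UTF-8 regardless of the OS locale.
        print(raw.decode("utf-8").strip())
```

Because the encoding is fixed when the handler is created, the bytes written no longer depend on the ANSI code page on Windows or on the LC_CTYPE locale on Unix.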