jonhadfield / python-hosts

a hosts file manager library written in python
MIT License
125 stars, 33 forks

Opening the hosts file failed because it contains GBK encoded characters. #50

Open ricardolee opened 4 months ago

ricardolee commented 4 months ago

[image] The Windows system is a Chinese-language version. Can this be detected, so that the file is opened using GBK encoding?
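For illustration, one way to handle a hosts file that may be GBK-encoded is to try strict UTF-8 first and fall back to GBK only when that fails. This is a minimal sketch, not how python-hosts behaves; `decode_hosts_bytes` and its `fallback_encodings` parameter are hypothetical names, and the choice of GBK as the only fallback is an assumption based on the system described above.

```python
def decode_hosts_bytes(raw, fallback_encodings=("gbk",)):
    """Decode raw hosts-file bytes, trying UTF-8 first and then the fallbacks.

    Hypothetical helper for illustration; not part of python-hosts.
    """
    # "utf-8-sig" decodes plain UTF-8 and also strips a leading BOM if present.
    for encoding in ("utf-8-sig",) + tuple(fallback_encodings):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            # This candidate does not fit the bytes; try the next one.
            continue
    raise ValueError("could not decode hosts data with any candidate encoding")
```

Trying UTF-8 before GBK matters because UTF-8 decoding is strict and rejects most GBK byte sequences, whereas GBK would accept many UTF-8 files and silently produce the wrong characters.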

ricardolee commented 4 months ago

Related: https://github.com/dotnet/runtime/issues/67229 (.NET 6 platform). It changes the file from UTF-8 to UTF-8 with BOM.

jonhadfield commented 4 months ago

I've spent some time looking at this and have a simple solution that works in Python 3, but supporting Python 2 as well adds a lot of complexity.
I need to drop Python 2 support at some point, so will create a separate "3" release in the coming weeks/months that will incorporate this change.
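The comment above does not say what the Python 3 solution is. One possibility, shown here only as a sketch, is to rely on the `encoding` argument of the built-in `open()`, which Python 2's built-in `open()` does not accept. `read_hosts_lines` is a hypothetical name and is not taken from the library.

```python
def read_hosts_lines(path):
    """Return the lines of a hosts file, without any leading UTF-8 BOM.

    Hypothetical helper for illustration; Python 3 only.
    """
    # "utf-8-sig" skips a BOM if one is present and otherwise behaves
    # like plain UTF-8, so both kinds of file read the same way.
    with open(path, "r", encoding="utf-8-sig") as hosts_file:
        return hosts_file.read().splitlines()
```

This covers the UTF-8 BOM case only. A GBK-encoded file would still raise `UnicodeDecodeError` and would need a fallback encoding.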

ricardolee commented 3 months ago

Currently, I check for UTF-8 BOM before usage and convert to UTF-8 format if necessary. I believe this issue is caused by other software. Of course, it would be better if it could be compatible with UTF-8 BOM.

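import codecs
import io
import logging

logger = logging.getLogger(__name__)


def change_hosts_encoding_to_utf8(host_path: str) -> bool:
    """
    Convert a hosts file from UTF-8 with BOM to plain UTF-8.

    Returns True if the file was rewritten, False otherwise.
    """
    # Read only the leading bytes to check for a BOM, and close the handle promptly.
    with open(host_path, 'rb') as hosts:
        has_bom = hosts.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8
    if not has_bom:
        return False

    # 'utf-8-sig' strips the BOM while reading.
    with io.open(host_path, "r", encoding='utf-8-sig') as hosts:
        data = hosts.read()

    if data:
        # Write the same text back as plain UTF-8 (no BOM).
        with io.open(host_path, "w", encoding='utf-8') as hosts:
            hosts.write(data)
        logger.info("convert hosts file utf-8-sig to utf-8 success")
        return True
    return False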