end2endzone / RapidAssist

RapidAssist is a lite cross-platform library that assists you with the most repetitive C++ tasks.
MIT License

Exception when calling ra::environment::GetEnvironmentVariablesUtf8() #54

Closed end2endzone closed 3 years ago

end2endzone commented 4 years ago

On Windows, in a non-Unicode program (a.k.a. Multi-Byte Character Set), calling ra::environment::GetEnvironmentVariablesUtf8() before ra::environment::SetEnvironmentVariableUtf8() or ra::environment::GetEnvironmentVariableUtf8() results in an exception.

The problem is described in the following article (https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/getenv-wgetenv?view=vs-2019):

In an MBCS program (for example, in an SBCS ASCII program), _wenviron is initially NULL because the environment is composed of multibyte-character strings. Then, on the first call to _wputenv, or on the first call to _wgetenv if an (MBCS) environment already exists, a corresponding wide-character string environment is created and is then pointed to by _wenviron.

The exception occurs because the variable _wenviron has not been initialized yet at the time of the call.

Note: the opposite bug also exists in a Unicode program when calling ra::environment::GetEnvironmentVariables() before ra::environment::SetEnvironmentVariable() or ra::environment::GetEnvironmentVariable().

end2endzone commented 4 years ago

A good workaround is to call ra::environment::GetEnvironmentVariableUtf8("foobar"); before the actual call to ra::environment::GetEnvironmentVariablesUtf8().
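
The same idea can be sketched directly against the C runtime. This is only an illustration: EnsureWideEnvironment() is a hypothetical helper that is not part of RapidAssist, and it assumes the RapidAssist call ends up in _wgetenv(), which this issue does not confirm. Per the article quoted above, the first call to _wgetenv() creates the wide-character environment in an MBCS program. MSVC may report a deprecation warning for _wgetenv().

```cpp
#include <stdlib.h>
#ifdef _WIN32
#include <wchar.h>
#endif

// Hypothetical helper (not part of RapidAssist).
// Makes sure the wide-character environment table exists before it is read.
// Returns true when the table is available.
bool EnsureWideEnvironment() {
#ifdef _WIN32
  // In an MBCS program _wenviron starts out NULL. A lookup through
  // _wgetenv() makes the CRT build the wide-character copy, even if
  // the requested variable ("foobar") does not exist.
  if (_wenviron == NULL)
    _wgetenv(L"foobar");
  return _wenviron != NULL;
#else
  // Assumption: outside the Microsoft CRT there is no separate wide table,
  // so there is nothing to prime.
  return true;
#endif
}
```

Calling such a helper once before enumerating the variables has the same effect as the dummy GetEnvironmentVariableUtf8("foobar") call, without depending on a made-up variable name at each call site.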

end2endzone commented 4 years ago

A proposed solution is to reimplement ra::environment::GetEnvironmentVariablesUtf8() as explained in https://stackoverflow.com/questions/9535112/get-all-env-variables-in-c-c-on-windows :

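#include <windows.h>
#include <tchar.h>
#include <map>
#include <memory>
#include <string>

typedef std::basic_string<TCHAR> tstring; // Generally convenient
typedef std::map<tstring, tstring> environment_t;
environment_t get_env() {
    environment_t env;
    // Release the block returned by GetEnvironmentStrings() on scope exit.
    auto free_block = [](LPTCH p) { FreeEnvironmentStrings(p); };
    auto env_block = std::unique_ptr<TCHAR, decltype(free_block)>{
            GetEnvironmentStrings(), free_block};
    if (!env_block)
        return env;
    // Each entry is "key=value" followed by a null character;
    // an empty entry marks the end of the block.
    // Note: entries starting with '=' (per-drive current directory)
    // end up under an empty key.
    for (LPTCH i = env_block.get(); *i != _T('\0'); ++i) {
        tstring key;
        tstring value;
        for (; *i != _T('='); ++i)
            key += *i;
        ++i;
        for (; *i != _T('\0'); ++i)
            value += *i;
        env[key] = value;
    }
    return env;
}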

or something similar...

end2endzone commented 4 years ago

The actual implementation should be as follows:

StringVector GetEnvironmentVariablesUtf8() {
  StringVector vars;

  // Get a pointer to the environment block.
  LPWCH lpvEnv = GetEnvironmentStringsW();

  // If the returned pointer is NULL, exit.
  if (lpvEnv == NULL)
    return vars;

  // Variable strings are separated by a null character, and the block is terminated by an additional null character.
  LPWSTR lpvTmp = (LPWSTR)lpvEnv;
  while (*lpvTmp)
  {
    std::wstring definition = lpvTmp;

    // Skip "current directory" per drive environment variables:
    //  "=::=::\"
    //  "=C:=C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE"
    //  "=G:=G:\Temp\temp3"
    if (!definition.empty() && definition[0] != L'=')
    {
      size_t offset = definition.find(L'=');
      if (offset != std::wstring::npos) {
        std::wstring nameW = definition.substr(0, offset);
        //std::wstring valueW = definition.substr(offset + 1);

        std::string name_utf8  = ra::unicode::UnicodeToUtf8(nameW);
        //std::string value_utf8 = ra::unicode::UnicodeToUtf8(valueW);

        vars.push_back(name_utf8);
      }
    }

    //next definition
    lpvTmp += lstrlenW(lpvTmp) + 1;
  }
  FreeEnvironmentStringsW(lpvEnv);

  return vars;
}
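
The parsing loop above can be tried without any Windows API by feeding it a hand-built block. The sketch below is only an illustration under that assumption: ParseEnvironmentNames() is a hypothetical helper, not a RapidAssist function, and it returns the names as wide strings, leaving out the UTF-8 conversion.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of RapidAssist): collects the variable names
// from a block laid out like the one returned by GetEnvironmentStringsW(),
// that is "NAME=value\0NAME=value\0\0".
std::vector<std::wstring> ParseEnvironmentNames(const wchar_t* block) {
  std::vector<std::wstring> names;
  if (block == nullptr)
    return names;

  // An empty string marks the end of the block.
  for (const wchar_t* entry = block; *entry != L'\0'; ) {
    std::wstring definition = entry;

    // Entries such as "=C:=C:\Temp" start with '=' and are skipped.
    if (definition[0] != L'=') {
      std::wstring::size_type offset = definition.find(L'=');
      if (offset != std::wstring::npos)
        names.push_back(definition.substr(0, offset));
    }

    // Move past this entry and its terminating null character.
    entry += definition.size() + 1;
  }
  return names;
}
```

Keeping the parsing separate from the GetEnvironmentStringsW() / FreeEnvironmentStringsW() pair makes the skipping of the "=C:" style entries easy to cover with a unit test on any platform.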