Open · matthew-wozniczka opened this issue 1 year ago
On 12/01/2023 23:38, Matthew Wozniczka wrote:

> If you dlopen() the unixODBC ODBCInst library (built w/ --enable-inicaching=true), call SQLGetPrivateProfileString(), and then unload it, it will leak the ini cache.
> I noticed that there was a `__clear_ini_cache` function added a while back which could be called to prevent the leak, but it's not exported.
> Could you perhaps add an `atexit` call somewhere to clean up the cache before the library is unloaded?

That may work on those platforms that offer it. Have you tried it? I guess you would call it on the initial AllocEnv, but I'm not sure whether calling it multiple times would cause a problem, or how it would play with threads.
On 12/01/2023 23:38, Matthew Wozniczka wrote:

> If you dlopen() the unixODBC ODBCInst library (built w/ --enable-inicaching=true), call SQLGetPrivateProfileString(), and then unload it, it will leak the ini cache.
> I noticed that there was a `__clear_ini_cache` function added a while back which could be called to prevent the leak, but it's not exported.
> Could you perhaps add an `atexit` call somewhere to clean up the cache before the library is unloaded?

Ok, I have added a check for the function and a call to it when the environment is initialized. See if that does what you are looking for.
I'm a bit surprised you can call `__clear_ini_cache` from `SQLAllocHandle`; I guess the 'odbcinst' code is statically compiled into libodbc? My use case is loading/unloading libodbcinst directly; I assumed that libodbc dynamically linked against libodbcinst.

In any case, I don't think your change would help in my scenario. Wouldn't it make more sense to register the `atexit` handler from within the 'odbcinst' code, so that it would get used in both scenarios (libodbc & libodbcinst)?

Also, I noticed you were calling `atexit` on every call to `SQLAllocHandle`; `atexit` isn't idempotent, so the handler will get registered (and later run) once for every call to `SQLAllocHandle`. That in itself isn't too much of a problem (a bit inefficient, though), but the number of handlers that `atexit` can register may be limited, so you could 'waste' slots needed by other libraries or the main application.

Another suggestion would be to check for `atexit` failing, and to then fail whatever function triggered it, so that you don't set the process up for a leak later on.

And one more thing: even though I suggested using `atexit`, IIRC there are some Unix platforms where it doesn't really 'work' for shared libraries (it doesn't run the handlers registered from a library when the library is unloaded; instead it waits until the process is actually exiting, at which point the library might have already been unloaded, causing a segfault), so you might want to be more selective than calling it whenever it exists. (There might be some alternative to `atexit` which would work on those platforms, but I'm not sure of the details.)
On 13/01/2023 18:21, Matthew Wozniczka wrote:

> I'm a bit surprised you can call `__clear_ini_cache` from `SQLAllocHandle`; I guess the 'odbcinst' code is statically compiled into libodbc? My use case is loading/unloading libodbcinst directly; I assumed that libodbc dynamically linked against libodbcinst.
> In any case, I don't think your change would help in my scenario. Wouldn't it make more sense to register the `atexit` handler from within the 'odbcinst' code, so that it would get used in both scenarios (libodbc & libodbcinst)?

So where would you suggest I call atexit() from, and when?

> Also, I noticed you were calling `atexit` on every call to `SQLAllocHandle`; `atexit` isn't idempotent, so the handler will get registered once for every call to `SQLAllocHandle`. That in itself isn't too much of a problem (a bit inefficient, though), but the number of handlers that `atexit` can register may be limited, so you could 'waste' slots needed by other libraries or the main application.

It only calls it on the allocation of the ENVIRONMENT handle, so it is not used that often.

> Another suggestion would be to check for `atexit` failing, and to then fail whatever function triggered it, so that you don't set the process up for a leak later on.
> And one more thing: even though I suggested using `atexit`, IIRC there are some Unix platforms where it doesn't really 'work' for shared libraries (it doesn't run the handlers registered from a library when the library is unloaded; instead it waits until the process is actually exiting, at which point the library might have already been unloaded, causing a segfault), so you might want to be more selective than calling it whenever it exists. (There might be some alternative to `atexit` which would work on those platforms, but I'm not sure of the details.)

TBH, all you have done is make me think I should unwind the changes and suggest that you either build your own version that does exactly what you want, or just build without the ini pooling enabled.
> So where would you suggest I call atexit() from, and when?

I think the simplest thing, without other refactoring, would be to call it on the first call to `_check_ini_cache`.

> It only calls it on the allocation of the ENVIRONMENT handle, so it is not used that often.

In a long-lived process, anything more often than 'once' is often enough to cause problems.

> TBH, all you have done is make me think I should unwind the changes and suggest that you either build your own version that does exactly what you want, or just build without the ini pooling enabled.

I work on a framework used to build ODBC drivers that are driver-manager-agnostic, and the reason we load libodbcinst in the first place is so that we follow the in-use DM's behaviour for finding configuration information. So we're not really in a position to build the library ourselves (which one to load is configured by the user of the built driver), which is why I was hoping to fix the issue upstream, so that eventually, when users upgrade, they won't have the leak anymore. (We learned about this leak through a customer support issue.)

We can recommend that customers build with ini caching disabled, but it's enabled by default, and most customers just use whatever is installed on their machine by the OS rather than building unixODBC themselves.

If by 'build your own version' you mean come up with a pull request for you, I'd be willing to do that.
On 13/01/2023 19:22, Matthew Wozniczka wrote:

> So where would you suggest I call atexit() from, and when?
> I think the simplest thing, without other refactoring, would be to call it on the first call to `_check_ini_cache`.

Ok, but if it's unreliable when used in a shared object, exactly what is gained by making it generic? I guess it could have a configure test that would create and load a library and check if atexit was called, but it all seems very iffy.

> It only calls it on the allocation of the ENVIRONMENT handle, so it is not used that often.
> In a long-lived process, anything more often than 'once' is often enough to cause problems.

Yes, I would agree with that, but given a finite ini file, the leaked memory in a long-lived process is not going to increase that much either.

> TBH, all you have done is make me think I should unwind the changes and suggest that you either build your own version that does exactly what you want, or just build without the ini pooling enabled.
> I work on a framework used to build ODBC drivers that are driver-manager-agnostic, and the reason we load libodbcinst in the first place is so that we follow the in-use DM's behaviour for finding configuration information. So we're not really in a position to build the library ourselves (which one to load is configured by the user of the built driver), which is why I was hoping to fix the issue upstream, so that eventually, when users upgrade, they won't have the leak anymore. (We learned about this leak through a customer support issue.)

Ok. With as much sympathy as I can muster at 22:49 on a Friday night, that sounds a bit like a "you problem". The solution is available in the standard build by not enabling the pooling. It might be simpler to just open the ini files yourself?

> We can recommend that customers build with ini caching disabled, but it's enabled by default, and most customers just use whatever is installed on their machine by the OS rather than building unixODBC themselves.
> If by 'build your own version' you mean come up with a pull request for you, I'd be willing to do that.

Happy to accept that as long as there is zero negative impact for users that are not your customers.

--
Nick
> Ok, but if it's unreliable when used in a shared object, exactly what is gained by making it generic? I guess it could have a configure test that would create and load a library and check if atexit was called, but it all seems very iffy.

I'm thinking it makes sense to use it on platforms where `atexit` has useful behaviour for shared libraries. I can do a bit of research to figure out which platforms those are; I think enabling it at least on the common platforms where it works would be a win.

Maybe an exported function could be added to the ODBCInst library that explicitly clears the cache, to be called by whoever loads it, to handle the other platforms? This can sort of already be done by calling `SQLWriteDSNToIni`/`SQLWritePrivateProfileString`, since they call `_check_ini_cache`, but we don't want to actually modify any configuration from within a driver that is simply reading it.
> Yes, I would agree with that, but given a finite ini file, the leaked memory in a long-lived process is not going to increase that much either.

The size of the leak isn't limited to the size of the data cached for any given ini file, because the leak is repeated for every load/unload cycle. In our code we already take care to load the library only once to limit the damage (originally we did so for every connection made), but the driver our code lives in may itself be loaded and unloaded an arbitrary number of times in a given process.

> Ok. With as much sympathy as I can muster at 22:49 on a Friday night, that sounds a bit like a "you problem". The solution is available in the standard build by not enabling the pooling. It might be simpler to just open the ini files yourself?

I think there's value in general (not just for me/my company/our customers) in preventing the leak; at the very least it saves people time investigating it when it's detected. For me, making the ini caching disabled by default would be a good second-best resolution (compared to fixing the leak), since then it wouldn't affect the vast majority of users (though I worry a bit about how much CPU/IO could be wasted without it).

I'm not going to demand a fix; worst case, we might start loading the library from a subprocess (or re-implement/fork it like you suggested), but I think fixing it makes sense.

> Happy to accept that as long as there is zero negative impact for users that are not your customers.

OK, I'll take a look into this soon and try to come up with something that shouldn't break anyone.
I haven't had time to work on this yet, but what do you think about my suggestion to add an exported function to the ODBCInst library that explicitly clears the cache? Given that `atexit` won't be portable, I think that's the only way to fully solve the problem.
On 27/01/2023 05:01, Matthew Wozniczka wrote:

> I haven't had time to work on this yet, but what do you think about my suggestion to add an exported function to the ODBCInst library that explicitly clears the cache? Given that `atexit` won't be portable, I think that's the only way to fully solve the problem.
I have added `__clear_ini_cache()` to the functions exported from libodbcinst. I think that should do what you need.
I just built unixODBC off of head on Linux and tested with a change to call `__clear_ini_cache`, and it seems to resolve that leak, thanks.

I have two more comments/suggestions related to this:

- Consider renaming it to `uodbc_clear_ini_cache`: identifiers that begin with two underscores are reserved in C, and in our use case we're not really sure who built the ODBCInst library, so right now we're just blindly calling `__clear_ini_cache` if `dlsym` finds it, which could be a problem if some other DM defined it differently. (Not that worried about this, but a coworker brought it up.)