leonid-butenko opened this issue 2 years ago
Tagging @DennisHeimbigner
It turns out that in some cases, curl mishandles authorization information if it does not have an actual physical cookie file. This may no longer be true, but I have not tested it recently. So, the netcdf-c library will create a cookie file: /tmp/occookieXXXXXX. You should be able to override this and force it to use a specific cookie file as follows:
HTTP.COOKIEJAR=<absolute path name of cookie jar file>
The actual path can be anywhere as long as it can be read and written, so /tmp is usually a good place to put it, and the file name should be sufficiently unique that it does not conflict with other files in /tmp.
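For example, putting a line like the following in ~/.dodsrc (the path here is just an illustration) points the library at one fixed cookie jar instead of a freshly generated /tmp/occookieXXXXXX:
HTTP.COOKIEJAR=/tmp/occookies.txt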
Indeed, the ~/.dodsrc approach does work. Thanks for the advice! But to me it sounds more like a workaround, doesn't it? Here is how I came to this: I used netcdf to query our dataset over a period of half a year, and at some point I ran into a "no free space on the device" system error and then found 2.5 million occookie files in my /tmp directory.
If by "workaround" you refer to the libcurl conflict between authorization and having a physical cookie file, then yes.
Ok, I understand your concern about libcurl, but how about cleaning up the temporary cookie files after the curl interaction is over? Correct me if I'm wrong, but it should be the responsibility of the netcdf-c library that created them, right?
I agree. I should have done that originally; guess it just slipped thru the cracks.
I did some checking and there is already code to delete the cookie jar file. Now the question is: why is it not being called?
Could it be that this was implemented in a more recent version of the netcdf-c library? I use version 4.7.1 from Anaconda: https://anaconda.org/conda-forge/libnetcdf/4.7.1/download/linux-64/libnetcdf-4.7.1-nompi_h94020b1_102.tar.bz2
Possible. I would have to do some searching. In any case, it seems to work with the current master. The other way this can happen is if you stop, say, ncdump before it completes. In that situation, the library would have no chance to delete the cookie file.
The equivalent code appears to be in 4.7.4.
The code is also in 4.7.1
Thanks for checking! Can you suggest which function triggers the cleanup of the cookie jar file? I'm going to implement a kind of post-processing step so that it cleans up at the end. Is that also available in the Python interface?
I am not sure what you are asking for. If you close the file, then the cookie file should be reclaimed. If you are not calling nc_close, then you need the name of the cookie file so you can delete it somewhere in your code, correct?
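If you go that route, a minimal sketch is below. It assumes you have pointed HTTP.COOKIEJAR at a fixed path in ~/.dodsrc (the path here is hypothetical) and simply deletes that file when the interpreter exits:

```python
import atexit
import os

# Must match the HTTP.COOKIEJAR entry in ~/.dodsrc (hypothetical path).
COOKIEJAR = "/tmp/occookies.txt"

def _remove_cookiejar():
    # Best-effort cleanup at normal interpreter exit.
    try:
        os.remove(COOKIEJAR)
    except FileNotFoundError:
        pass

atexit.register(_remove_cookiejar)
```

Note that atexit handlers run on normal exit but not when the process is killed by a signal.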
What I mean is that nc_close, as described above, is not actually causing deletion of the cookie jar in my case. Taking into account your comment above, the deletion code should be in place in version 4.7.1. I could call the cleanup function directly to make 100% sure the cookie file is removed when my script finishes. Does that make sense?
The relevant sequence of calls is this:
NC_authclear libdispatch/dauth.c:176
occlose oc2/ocinternal.c:274
oc_close oc2/oc.c:85
NCD2_close libdap2/ncd2dispatch.c:1806
nc_close libdispatch/dfile.c:1291
but I confess I cannot see why this sequence is failing, assuming you call nc_close. BTW, if the DAP2 URL you are using is public, can you send it to me?
Ok, thank you! I think I'm now more or less clear on what's going on. You're absolutely right that if I call nc_close, the cookie jar is properly removed in my version 4.7.1. In my Python script I have to explicitly call nc.close() before exit; otherwise nc_close is never triggered and the cookie jars remain. My strong belief was that Python would handle that automatically... my bad!
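A minimal sketch of that explicit-close pattern (assuming the netCDF4 Python package; the URL and variable access are placeholders):

```python
from netCDF4 import Dataset

url = "https://example.org/thredds/dodsC/some/dataset"  # placeholder URL

ds = Dataset(url)
try:
    # ... work with the dataset ...
    print(list(ds.variables))
finally:
    # Closing the dataset reaches nc_close, which removes the
    # temporary cookie jar created for the DAP connection.
    ds.close()
```

If your netCDF4 version supports it, the context-manager form (with Dataset(url) as ds: ...) gives the same guarantee.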
The same goes for ncdump: usually I pipe ncdump's output to less, and if I don't scroll to the very bottom of the output but press 'q' somewhere in the middle, the temporary cookie jar stays. My guess is that ncdump could handle system signals better and, in the case of SIGINT, try to clean up. Please advise.
Unfortunately I don't have a public THREDDS server, only an internal one. I still use version 5.0.0b6.
We have avoided being signal sensitive because it can interfere with programs and libraries that use netcdf-c. I think your best bet is to stick with setting .dodsrc.
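For the zero-sized cookie files that have already piled up in /tmp, a one-off sweep along these lines should do (a sketch, not part of netcdf-c; only run it while no other netcdf-c client is active, since a running process may still be using its cookie jar):

```python
import glob
import os

# Remove leftover zero-sized occookie files from earlier runs.
for path in glob.glob("/tmp/occookie*"):
    try:
        if os.path.getsize(path) == 0:
            os.remove(path)
    except OSError:
        # Already gone or owned by another user -- skip it.
        pass
```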
I have just realised that the library is not cleaning up the occookie files after connecting to a THREDDS server. I'm querying quite a big dataset, and every call to the THREDDS server creates one or several zero-sized cookie files. I assume this is a bug. Please advise.
Environment Information
libnetcdf 4.7.1 (conda-forge, linux-64)
Summary of Issue
The cookie file creation happens around here:
https://github.com/Unidata/netcdf-c/blob/main/oc2/ocinternal.c#L567
and never gets removed.
Steps to reproduce the behavior
Try to open a netCDF dataset by connecting to a THREDDS server (for example with ncdump), and you will find leftover /tmp/occookieXXXXXX files like this: