Closed: datacubed closed this issue 3 years ago
Hi! Check out our fork where this issue is solved: https://github.com/BingAds/BingAds-Python-SDK/pull/108
Hey, this post says "the branch has conflicts" - should I wait for that to be resolved before using the fork?
I hope this can help you.
@qitia, from this post, do you recommend adding the fix mentioned here: https://github.com/BingAds/BingAds-Python-SDK/issues/80#issuecomment-390359504 ?
The last comment in that post mentions moving away from suds. Is that something happening with this library? Or will there be an update that handles this cache "bloating" (as I've seen it called)?
hey @datacubed, what I recommend is trying the method below, with `remove_suds_cache` set to `True`. This will help you clear the cache folder on the app's exit.
```python
import atexit
import shutil
from os import path
from tempfile import gettempdir

def remove_suds_cache(remove_suds_cache=False):
    if remove_suds_cache:
        shutil.rmtree(path.join(gettempdir(), 'suds'), ignore_errors=True)

# Set remove_suds_cache=True to remove the temp suds cache
atexit.register(remove_suds_cache, remove_suds_cache=False)
```
I hope this can help.
There's an internal `suds.Client` instance in the library code which is used to construct SOAP object factories. It's instantiated with suds' default cache, which pollutes the `/tmp/suds` directory because of a bug in suds' default cache implementation. Our fork makes this instance use a dummy in-memory cache (basically a `dict`), which loads Bing Ads' SOAP XSDs into RAM but never touches the disk. It overwrites the instantiating code, hence the merge conflicts. You should be fine using it.
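For illustration, a dict-backed cache in the shape suds expects looks something like this (the method names follow suds' cache interface of `get`/`put`/`purge`/`clear`; this is a sketch, not the fork's actual code):

```python
class InMemoryCache:
    """Dummy in-memory cache: keeps the SOAP XSDs in RAM, never touches /tmp."""

    def __init__(self):
        self._store = {}

    def get(self, id):
        # Return the cached object, or None on a miss (suds treats None as "not cached").
        return self._store.get(id)

    def put(self, id, obj):
        self._store[id] = obj

    def purge(self, id):
        self._store.pop(id, None)

    def clear(self):
        self._store.clear()
```

In plain suds you could attach such a cache with `client.set_options(cache=InMemoryCache())`; the catch with this SDK is that the `suds.Client` is constructed internally, which is exactly what the fork changes.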
I recently ran into this issue so I took a look at the solutions people have posted here. I found them problematic because of various issues, including 1.) deleting the /tmp/suds directory at exit, which I imagine would cause problems if you have lots of programs running simultaneously, and 2.) waiting for #108 to get merged, which might not happen soon.
The core issue is that suds-jurko 0.6 defines a `mangle()` function in its reader class that is broken in newer versions of Python. The old one does this:

```python
h = abs(hash(name))
```
`hash()` for strings was changed (hash randomization is on by default since Python 3.3) to return different values for the same name in different processes. The development version of suds-jurko has this instead:

```python
h = md5(name.encode()).hexdigest()
```
which behaves deterministically for the same name.
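You can see the nondeterminism without suds at all. This sketch hashes the same string in two fresh interpreters with different hash seeds, standing in for two separate runs of your app:

```python
import os
import subprocess
import sys

def hash_in_subprocess(seed):
    # Hash the same string in a fresh interpreter with a fixed hash seed.
    # Under the default randomized seed (Python 3.3+), every process behaves
    # like a different seed, so abs(hash(name)) names a new cache file each run
    # and the old .px files pile up unused.
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.run(
        [sys.executable, '-c', "print(abs(hash('document')))"],
        capture_output=True, text=True, env=env, check=True,
    )
    return out.stdout.strip()
```

The md5-based `mangle()`, by contrast, yields the same digest for the same name in every process, so cache files actually get reused.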
I wrote some code to monkeypatch suds 0.6 into behaving correctly:
```python
import suds.reader

try:
    from hashlib import md5
except ImportError:
    # 'hashlib' package added in Python 2.5 so use the now deprecated/removed
    # 'md5' package in older Python versions.
    from md5 import md5

def mangle(self, name, x):
    """
    Mangle the name by hashing the I{name} and appending I{x}.
    @return: The mangled name.
    @rtype: str
    """
    h = md5(name.encode()).hexdigest()
    return '%s-%s' % (h, x)

print("Monkeypatching suds Reader classes")
suds.reader.Reader.mangle = mangle
suds.reader.DocumentReader.mangle = mangle
suds.reader.DefinitionsReader.mangle = mangle

print("Monkeypatching complete, importing bingads")
from bingads import *
```
(It's entirely possible the above code could be simplified - I just know that what I've written works.)
If you have some library code you can drop this in as bingads.py and then do from \
IMO a better solution than #108 or trying to fix the broken reader code would be to just use a version of suds that's actually maintained, like suds-community. Or switch to zeep. But others have already suggested both these options in other issues.
Moved to suds-community as of 13.0.11.1.
Hi all, just chiming in to say I'm still getting this issue, just in a different form. I'm using `bingads` in an Airflow DAG, and if it's imported anywhere in a file with the `@dag` or `@task` decorator, it creates a new directory and `.px` file in `/tmp/`. This is causing our Pods to be evicted as the files keep growing in the `/tmp` folder, which is a real drag.
Example:
```
/tmp
  /tmpatzruj5fsuds-default-cache
    suds-9dbf36d0033995011d074a4d7e7f1669-document.px
    version
```
No idea what's causing this, but it's clearly the `__init__()` method in `suds/cache.py`, which sees `None` for the location and falls back to the default location.
Using `bingads` 13.0.13 and `suds-community` 1.1.1.
Any amount of help would be excellent.
As a follow-up to the above, updating to version 13.0.14 of the SDK fixed the default-cache folder creep issue. Our scheduler is still full of `tmp*` directories, but at least they're empty now.
I'm not sure if this is an issue or not, so I apologize now if I'm in the wrong place. Our server hit us with a "ran out of disk space" error, and when looking around we found the `/tmp/suds` folder had grown to quite a few GB. Once the `*.px` files were deleted, everything returned to normal.
I know `/tmp/` is usually cleared on reboot, but we try not to reboot our servers too often. Is it possible for me to avoid this via a function in the API, or do I just need to write a command that clears the `/tmp/suds` directory after we've used the Bing API?
Cheers
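Until the SDK exposes a knob for this, one option is a small cleanup helper run on a schedule, instead of clearing the whole folder at exit. A sketch (the function name and defaults are mine, not part of the SDK's API):

```python
import time
from pathlib import Path
from tempfile import gettempdir

def purge_stale_suds_cache(cache_dir=None, max_age_days=7):
    # Delete suds cache entries (*.px files) older than max_age_days.
    # Defaults to the usual /tmp/suds location; returns the number removed.
    cache_dir = Path(cache_dir) if cache_dir else Path(gettempdir()) / 'suds'
    if not cache_dir.is_dir():
        return 0
    cutoff = time.time() - max_age_days * 86400
    removed = 0
    for f in cache_dir.glob('*.px'):
        if f.stat().st_mtime < cutoff:
            f.unlink()
            removed += 1
    return removed
```

Leaving recent files in place means concurrently running programs keep their warm cache, which avoids the hazard of deleting the whole directory while another process is using it.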