jlstevens opened this issue 10 years ago
We've had problems in the past with weave not expecting multiple simultaneous compilations, and although our particular problem was addressed, there appear to be at least two subsequent scipy bug reports related to simultaneous compilations:

https://github.com/scipy/scipy/issues/1895
https://github.com/scipy/scipy/issues/2902
What's the status for Cython on DICE? Maybe we can finally switch over to that implementation.
My temporary hack is to force a delay between each call to `subprocess.Popen` in Lancet. It is ugly and should be unnecessary, but maybe this will no longer be an issue once we switch over to Cython...
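Something like this is all it amounts to (a minimal sketch of the stagger described, not Lancet's actual code; `commands` and the delay value are placeholders):

```python
import subprocess
import time

def launch_staggered(commands, delay=10):
    """Launch each command, pausing between launches so that one
    instance's weave compilation finishes before the next one starts."""
    procs = []
    for cmd in commands:  # each cmd is an argv list for one instance
        procs.append(subprocess.Popen(cmd))
        time.sleep(delay)  # crude guard against overlapping compilations
    return procs
```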
> What's the status for Cython on DICE? Maybe we can finally switch over to that implementation.
There's a trial environment:

```
[14-10-12-20:39][hebb]: dice-pyenv CNV
You have now entered the CNV python environment.
Most python-based commands will work differently while you can
see "(CNV)" in your prompt. You can leave the environment
by closing this shell (for example by typing "exit").
(CNV)[14-10-12-20:39][hebb]: python
Python 2.7.8 (default, Sep 24 2014, 13:04:01)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import cython
>>> cython.__version__
'0.21'
```
I'm happy to test and/or work on your cython stuff, if that's possible?
That's great news. The Cython implementation is a complete drop-in replacement for the old weave-based code. It's currently in `topo.optimized`, so all you should have to do is import your response and learning functions from there. Jim did want me to rename it from `topo.optimized` to `topo.cythonopt`, so if we're actually going to make the switch I'll rename it tomorrow.
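In other words, something like this (a sketch only - the exact class names here are assumed, not confirmed, so check the actual contents of `topo.optimized`):

```python
# Hypothetical illustration; the precise class names may differ.
# Instead of the old weave-based imports, e.g.:
#   from topo.responsefn.optimized import CFPRF_DotProduct_opt
#   from topo.learningfn.optimized import CFPLF_Hebbian_opt
# import the Cython drop-in replacements from topo.optimized:
from topo.optimized import CFPRF_DotProduct_opt, CFPLF_Hebbian_opt
```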
Personally, I am against renaming `optimized`. In my opinion, we should simply move over to Cython and discourage/replace all the weave stuff.
Just tried working with the Cython implementation and found out the functions don't seem to pickle correctly. Will have to look into it and get back to you.
I too have noticed problems when launching simultaneous processes with weave (most recently when benchmarking trout.inf). I think that my workaround was to launch a single process running the same code, letting that run until after weave was done generating its code files, and only then launching multiple simultaneous processes. Obviously that's clunky and error-prone, though.
It would be good to use Cython only. However, such a directory can't be called `topo.optimized`, because the design currently requires that only Cython-related code can ever go into that directory. That name is not only misleading right now (while there is weave-optimized code elsewhere in Topographica), but is also misleading for the future (as Cython is very unlikely to be the last word on optimization in Python). If `topo.optimized` were to be the home for all such future optimizations, that name would be fine, but instead simply importing `topo.optimized` pulls in a dependency on Cython, which is not appropriate for a generic container of all types of optimization.

So `topo.optimized` must be renamed to something that indicates clearly that it is not a container for all optimizations, but only those that use Cython, regardless of the status of weave. Just as `topo.tkgui` is not all possible GUI code but only the Tk version (with a `topo.qtgui` a possibility someday), `topo.cythonopt` (or whatever less ugly name someone can suggest) is only the Cython-optimized code, not PyPy code, numexpr code, or CUDA code. Topographica users care a lot about optimization, and Cython is only the best solution we have right now, not the best we'll ever have!
Ok. Why not call it `topo.pyx`, if the folder is mostly about interfacing with Cython's `.pyx` files? Anyone who knows anything at all about Cython will know that this directory contains Cython optimizations.
I don't like naming directories using periods, particularly not when there is a recognizable file extension after the period (very confusing!). We don't even have any with underscores, so `topo_pyx` would look strange. Yet `topopyx` is hard to parse. What about just `pyx`? It's already in `topo/`, so `topo` is redundant anyway.
Just `pyx` is what I meant - hence my suggestion `topo.pyx` (the `pyx` directory would live in `topo`).
Alright, I'll rename it to `topo.pyx` and try to fix the snapshotting issues tomorrow, and then hopefully we can switch all the models over to Cython.
Ah, `topo.pyx` is fine then.
To get back to the original reason for this issue - do we think that Cython will work better than weave in this instance?
I assume that if the `.so` file exists already, nothing will be compiled. However, if no `.so` file exists and multiple instances are launched at once, won't there be a bunch of Cython processes trying to compile the same file at the same time?
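(For illustration, one way to guard against such a race would be to serialize the compile step behind a file lock - a sketch only, not something weave or Cython does out of the box, and `compile_extension` is a hypothetical stand-in:)

```python
import fcntl
import os

lockpath = os.path.expanduser("~/.topo-compile.lock")
with open(lockpath, "w") as lockfile:
    fcntl.flock(lockfile, fcntl.LOCK_EX)   # blocks until the lock is free
    try:
        if not os.path.exists("optimized.so"):
            compile_extension()            # hypothetical compile step
    finally:
        fcntl.flock(lockfile, fcntl.LOCK_UN)
```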
Yes, your original question has been hijacked :) I don't know exactly what the problem is you're seeing while using weave (though it seems likely to be multiple simultaneous compilations), and I agree that if multiple cython instances simultaneously try to compile the same file there could be a problem...but to me, it doesn't seem worthwhile trying to address the problem specifically for weave. I guess Philipp will have a better idea about cython, but maybe we should just wait and see at what point (in the cython processing chain) the problem occurs.
I think I tried a variety of things in the past, like setting my HOME environment variable (or whatever environment variable(s) controlled the weave compilation and intermediate folders - I can't remember) to point to a folder on the local disk (if one Topographica instance per machine), or to be different for each instance (if multiple instances per machine). Or, like Jim said, I'd do something to make sure the compilation had already happened before launching my desired simulations (you just have to remember what things might trigger a recompilation, e.g. you might change a Python data type and get a recompilation).
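The per-instance variant of that workaround would look roughly like this (a sketch; I believe weave consults the `PYTHONCOMPILED` environment variable for its compiled-code directory, but verify that against your weave version, and `commands` here is a placeholder):

```python
import os
import subprocess
import tempfile

for i, cmd in enumerate(commands):  # one argv list per Topographica instance
    env = dict(os.environ)
    # Give each instance its own compiled-code directory so the weave
    # compilations cannot collide with each other.
    env["PYTHONCOMPILED"] = tempfile.mkdtemp(prefix="weave-%d-" % i)
    subprocess.Popen(cmd, env=env)
```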
Any updates on this? Has using Cython instead of weave been helpful? Have any of @ceball's recommendations for getting around the weave issue proven useful?
@ceball's suggestions do usually work for this. I don't think weave has fixed this problem, and as far as I know no one is working on weave, so I doubt it will be fixed. Topographica does have Cython versions of its weave components, but they are significantly slower, unfortunately.
> Topographica does have Cython versions of its weave components, but they are significantly slower, unfortunately.
Is this true? When I implemented and profiled them, they came out slightly faster, if anything. Are you talking about the sparse implementation? We also have a Cython wrapper around the weave implementation, which is what I'm talking about. We just haven't gotten around to setting up `setup.py` correctly for that implementation to work properly.
We really should switch to Cython as the default, leaving weave as a fallback or for the occasional optional thing that still needs it (e.g. some of the color-related stuff). Eventually, we need to get rid of weave anyway, as it is essentially a dead project now.
I'm talking about the sparse components, which is what's usable right now and has been fully benchmarked by Ignotas. It would be great to make the dense Cython components ready for production use, and to similarly document their performance. And if it's really as fast, then getting rid of weave would be great!
I think we should first get Cython working correctly in `setup.py`. That way, it will be easy to use, and it will also be easier to convince you that it is production-ready. :-)
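For reference, wiring Cython into `setup.py` usually looks something like this (a minimal sketch assuming the sources live under `topo/optimized/`; the real paths and module names in Topographica may differ):

```python
# Minimal sketch, not Topographica's actual setup.py.
from setuptools import setup, Extension
from Cython.Build import cythonize
import numpy

extensions = [
    Extension(
        "topo.optimized.optimized",        # assumed module path
        ["topo/optimized/optimized.pyx"],  # assumed source file
        include_dirs=[numpy.get_include()],
    ),
]

setup(
    name="topographica",
    ext_modules=cythonize(extensions),
)
```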
Thanks for the reply, @jbednar. I encountered the same issue you all did when I was using `weave` in `brian2`. Switching to `cython` solved the problem for me, as it does not have this race condition.
Using Lancet, I am launching four concurrent Topographica instances at a time. Sporadically this fails when the optimized components are being compiled - I get some sort of complaint regarding a truncated `.o` file in weave. I do not have the exact error message to hand, as my last attempt was successful, with all four instances running fine - as I mentioned, this problem is sporadic.