ioam / topographica

A general-purpose neural simulator focusing on topographic maps.
topographica.org
BSD 3-Clause "New" or "Revised" License
53 stars 32 forks source link

Building the Cython extension correctly #599

Open philippjfr opened 9 years ago

philippjfr commented 9 years ago

I had a real nightmare trying to get the Cython code to pickle properly today and still haven't found a solution that doesn't involve putting a setup script in the root folder. This is because the distutils Extension requires you to declare that it is part of topo.pyx. If you don't the __module__ attribute isn't set correctly and pickle cannot locate the code. See this stackoverflow thread for reference.

I'd suggest placing the extension building code in the topographica script, so we don't have to introduce another odd file at the top-level. It would only get executed whenever the Cython or C code is changed. You can see roughly what this code looks like in topo/optimized/compile.py.

In my search for answers I also came across some common practices for Cython distributions and it is strongly recommended to ship the generated .c file with the package. This would eliminate our dependency on Cython althogether but come at the cost of 8941 lines of terrible auto-generated code. Not sure it's worth it but I thought I would at least mention that it's an option.

jbednar commented 9 years ago

I very much like our simple topographica script, and if anything is to be added to it, I'd want it to be as few lines as possible, e.g. importing something from elsewhere in topo, and adding no dependencies. I don't like the toplevel setup script; not sure why Cython needs anything to do with distutils, but it sounds as hacky as Weave :-(. Shipping the C code with a PyPi package is fine by me, but won't help our Git installations. I.e., even if we checked in the C code into git, we'd still need Cython to be able to update that code, so in practice any Git version would still have a dependency on Cython.

philippjfr commented 9 years ago

Distutils is the preferred way of building C extensions, so I wouldn't call it hacky. However I do agree that it's rather annoying that Extensions have to be built from the top-level setup.py file. We could consider adding a top-level setup.py, which accepts args to either build just the Cython extension(s) or wrap the platform/setup.py file. I'm not hugely familiar with distutils, so does anyone know of any better options?

Agreed, obviously updating the code will require Cython, so shipping it with git might be overkill but distributing it with the PyPI package sounds like a good idea to me.

jbednar commented 9 years ago

Distutils is definitely a hack, even if it's the best we've got. Read the numpy discussion mailing list archive over the past ten years to see how much work they've had to do to work around its many failings.

jlstevens commented 9 years ago

Just to say I am against distributing the auto-generated C code. Cython is a standard dependency, pip installs in seconds and I have never, ever had problems getting Cython working quickly.

The only caveat is that this applies to sane systems where I can use virtualenv but now that even DICE has Cython, I don't think this is a problem. Frankly, I would never consider using topographica (or Python for that matter) without also using virtualenv...

ceball commented 9 years ago

I don't like the toplevel setup script;

I don't remember seeing a python package without a top-level setup script recently (in the source code repository, not just the distributed package). Not saying that makes me like it, just that I think it wouldn't be too bad. I haven't yet investigated why it's necessity for pickling (I haven't yet tested pickling since starting to try out topographica's cython stuff), but it does sound like cython expects a top-level setup.py file: https://github.com/cython/cython/wiki/FAQ#id51

not sure why Cython needs anything to do with distutils,

How else would it get the right compiler info - or am I missing something?

jlstevens commented 9 years ago

I am in favor of a working setup script at the top-level, especially if it installs all the dependencies correctly while also allowing us to switch to Cython. The main thing is to have it work with pipcorrectly so that Topographica installation becomes easier.

My main concern is that as the Topographica repository evolves, the PyPI versions of the subprojects no longer work together properly. The best scenario would be to have setup.py install dependencies via git when using the Topographica git repository and have it use PyPI when Topographica is installed from PyPI. When installing dependencies via git, it should also respect the appropriate submodule references - perhaps by parsing the .gitmodules file to get the appropriate SHAs?

jbednar commented 9 years ago

A setup script at the top level is fine, I guess; all the other IOAM projects have one...

philippjfr commented 8 years ago

Not sure if anyone is still watching this but I've just added a top-level setup.py, which will correctly compile the Cython based optimized components. @ceball What would it take to get rid of platform/distutils completely, can I just deleted now that I've added a top-level setup.py?