Link to info about custom codecs

stuaxo commented 4 years ago

I'm really interested in how this works with respect to custom codecs. Is this activated by the coding part of the header?

Is it possible to link to some docs from the README here?

ssize-t commented 4 years ago

Yes, the # coding: comment makes this possible. This feature was meant for defining custom encodings, e.g. translating Python keywords into a new language.

The interface for registering custom codecs exposes the tokenizer output (here), which can be consumed in its entirety to get the full source file.
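For readers unfamiliar with the mechanism, here is a minimal sketch of registering a custom codec with the standard-library `codecs` module. The codec name `shout_demo` and the `transform_source` helper are made up for illustration and are not part of inlinec. The sketch only shows the plain `decode` path; a codec used for real source files generally also has to be registered before the file is imported and may need incremental or stream decoders.

```python
import codecs

CODEC_NAME = "shout_demo"  # hypothetical codec name, not inlinec's


def transform_source(text: str) -> str:
    """Stand-in for a real rewrite step: map a made-up keyword to Python."""
    return text.replace("PRINT", "print")


def decode(data, errors="strict"):
    # Decode the raw bytes as UTF-8 first, then rewrite the source text.
    text, consumed = codecs.utf_8_decode(bytes(data), errors, True)
    return transform_source(text), consumed


def search(name):
    # codecs.register() passes the normalized codec name to this function.
    if name != CODEC_NAME:
        return None
    return codecs.CodecInfo(
        name=CODEC_NAME,
        encode=codecs.utf_8_encode,
        decode=decode,
    )


codecs.register(search)

# Decoding bytes with the custom codec returns the transformed source.
source_bytes = b"# coding: shout_demo\nPRINT('hello')\n"
decoded = source_bytes.decode(CODEC_NAME)
```

The search function returns `None` for every other name, so lookups for the built-in codecs are unaffected.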

Once I have the full source code, I parse it with parso (here), a failure-tolerant parser meant for incremental parsing in e.g. editor integrations. It puts "error" AST nodes where the C functions are.

This is somewhat brittle, but has worked well enough. Ideally, this would be replaced by a custom parser, which finds the @inlinec annotations and maybe even switches to a C parser for the function body. However, this requires context-sensitive parsing which is a little tricky. PRs welcome :-)

Then I traverse the AST, find the nodes decorated with @inlinec, define ctypes wrappers for the function bodies, replace the body of the function with a call to the wrapper and glue the wrapper imports to the top of the file.
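As a rough illustration of the last step, the sketch below generates Python source for a ctypes wrapper that forwards a call to a compiled library. The helper `make_wrapper_source`, the `_lib_` prefix, and the `./add.so` path are hypothetical; the code inlinec actually generates may look quite different.

```python
def make_wrapper_source(func_name, lib_path, arg_names):
    """Return Python source that forwards `func_name` to a shared library.

    All names here are illustrative assumptions, not inlinec's real output.
    """
    args = ", ".join(arg_names)
    lines = [
        "import ctypes",  # the import glued to the top of the file
        "",
        f"_lib_{func_name} = ctypes.CDLL({lib_path!r})",
        "",
        f"def {func_name}({args}):",
        # The original C body is replaced by a call into the library.
        f"    return _lib_{func_name}.{func_name}({args})",
        "",
    ]
    return "\n".join(lines)


generated = make_wrapper_source("add", "./add.so", ["a", "b"])

# compile() only checks that the generated text is valid Python;
# it does not execute it, so no library is loaded here.
compile(generated, "<generated>", "exec")
```

A real wrapper would probably also set `argtypes` and `restype` on the ctypes function so that arguments and return values are converted correctly.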

Once that's done, I re-tokenize and return the new token stream, so the Python interpreter only ever sees the transformed version of the source code.
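The re-tokenizing step can be approximated with the standard-library `tokenize` module. This is only a sketch of the round trip from source text to tokens and back, using a made-up source string; it is not necessarily how inlinec hands the result back to the interpreter.

```python
import io
import tokenize

# Pretend this is the already transformed source text.
source = "x = 1\nprint(x + 1)\n"

# Tokenize the text; generate_tokens() expects a readline callable.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# untokenize() turns a token stream back into source text.
rebuilt = tokenize.untokenize(tokens)

# Tokenizing the rebuilt text should give the same token strings again.
retokenized = list(tokenize.generate_tokens(io.StringIO(rebuilt).readline))
```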

stuaxo commented 4 years ago

Lots of things you can imagine with this, I'll have to put this on my procrastination-list and have a play.

One thing that could be handy is passing the language into the decorator

@inline('c')
@inline('nim')

etc... and have the languages pluggable somehow.
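One way the pluggable part could look is a small registry that maps language names to compile functions. Everything in this sketch (`register_backend`, `get_backend`, and the dummy backend) is hypothetical and does not exist in inlinec.

```python
_BACKENDS = {}  # language name -> compile function (hypothetical registry)


def register_backend(language):
    """Decorator that registers a compile function for `language`."""
    def decorator(compile_func):
        _BACKENDS[language] = compile_func
        return compile_func
    return decorator


def get_backend(language):
    """Look up the compile function for `language`, or fail loudly."""
    try:
        return _BACKENDS[language]
    except KeyError:
        raise ValueError(f"no backend registered for {language!r}") from None


@register_backend("c")
def compile_c(body):
    # Dummy backend: a real one would invoke a C compiler here.
    return f"/* C */ {body}"


result = get_backend("c")("return a + b;")
```

Since the source is rewritten before the interpreter runs it, the language argument would have to be read from the parsed decorator rather than evaluated at runtime.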

I wonder if coupling this with ccache could speed things up?

You could name the generated C source files using some quick hash of the content (xxHash if you care about speed). That way ccache would probably not try to recompile things it already has.
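A minimal sketch of content-based file naming, using `hashlib.blake2b` from the standard library as a stand-in for xxHash. The `c_source_filename` helper and the `inlinec_` prefix are assumptions for illustration, and whether ccache actually benefits depends on how it is configured.

```python
import hashlib


def c_source_filename(c_source: str, prefix: str = "inlinec_") -> str:
    """Derive a stable file name from the content of a C source string."""
    # A short digest is enough here; this names files, it is not security.
    digest = hashlib.blake2b(c_source.encode("utf-8"), digest_size=8).hexdigest()
    return f"{prefix}{digest}.c"


name_a = c_source_filename("int add(int a, int b) { return a + b; }")
name_b = c_source_filename("int add(int a, int b) { return a + b; }")
name_c = c_source_filename("int sub(int a, int b) { return a - b; }")
```

Identical function bodies map to the same file name, so an unchanged function keeps its generated file across runs.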

ssize-t commented 4 years ago

Absolutely! I like your line of thinking here :)

ssize-t / inlinec
