gogins / csound-extended

Extensions for Csound including algorithmic composition, Android app, and WebAssembly.
GNU Lesser General Public License v2.1

Embed Clang C++ Just-In-Time compiler in a Csound opcode #189

Closed gogins closed 2 years ago

gogins commented 2 years ago

This would be similar to the "faustgen" opcodes, but would compile, link, load, and bind heredoc C/C++ code.

Examples and documentation are scarce, though one might think Clang/LLVM was designed to facilitate this. This might provide a starting point: https://stackoverflow.com/questions/23058873/embed-c-compiler-in-application.

So might this: https://www.qemu.org/.

https://weliveindetail.github.io/blog/post/2017/07/25/compile-with-clang-at-runtime-simple.html.

This one's from the clang repository and is more up to date: https://github.com/llvm/llvm-project/tree/main/clang/examples/clang-interpreter. I will try this.

CMake wouldn't build it, but a shell script would. The clang-interpreter worked and can serve as a proof of concept.

Notably, symbols were loaded from the runtime-compiled module directly, and could immediately be called.

This is remarkably powerful. It should facilitate stuffing whatever the hell you want into Csound at composition time.

gogins commented 2 years ago

The downside of this approach is that a lot of LLVM and Clang components need to be installed, and CMake support for out-of-tree builds is not obvious and did not work for me.

I think there should be just one opcode. Code compiled by this opcode will be responsible for creating opcodes, instruments, or whatever else in its int csound_main(CSOUND *csound) entry point.
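
A minimal sketch of what a module compiled by such an opcode might look like, assuming only the entry-point signature given above. The `CSOUND` type is forward-declared here as an opaque stand-in so the sketch compiles on its own; a real module would presumably include the Csound plugin header instead. The `modules_initialized` counter is hypothetical and exists only to make the call observable.

```cpp
#include <cstdio>

// Opaque stand-in for the real CSOUND type (assumption: a real module
// would include the Csound plugin header rather than declare this).
struct CSOUND_;
typedef struct CSOUND_ CSOUND;

// Hypothetical module state, used only to make the entry point observable.
static int modules_initialized = 0;

// extern "C" keeps the name unmangled, so the host can look up
// "csound_main" by name no matter which C++ compiler built the host.
extern "C" int csound_main(CSOUND *csound) {
    (void) csound; // A real module would create opcodes or instruments here.
    ++modules_initialized;
    std::printf("csound_main called %d time(s)\n", modules_initialized);
    return 0; // 0 is taken to mean success (assumption).
}
```

The C linkage on the entry point is also what keeps the gcc/clang boundary simple: only a plain function pointer crosses it.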

Some considerable thought should be given to making everything simple and, even more importantly, obvious.

I do wonder if using gcc to compile Csound and clang to compile modules loaded by Csound will cause problems. A bit of googling just says "maybe", provided all modules are linked with -stdlib=libstdc++. The function types should be compatible.

gogins commented 2 years ago

My proof of concept works without iostreams but does not work with them:

 "/usr/bin/ld" -z relro --hash-style=gnu --build-id --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o main /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crt1.o /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crti.o /usr/bin/../lib/gcc/x86_64-linux-gnu/9/crtbegin.o -L/usr/lib/llvm-13/lib -L/usr/bin/../lib/gcc/x86_64-linux-gnu/9 -L/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu -L/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib64 -L/usr/lib/x86_64-linux-gnu/../../lib64 -L/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../.. -L/usr/lib/llvm-10/bin/../lib -L/lib -L/usr/lib -g -export-dynamic /tmp/clang_opcode-4725a6.o -lclangTooling -lclangFrontendTool -lclangFrontend -lclangDriver -lclangSerialization -lclangCodeGen -lclangParse -lclangSema -lclangStaticAnalyzerFrontend -lclangStaticAnalyzerCheckers -lclangStaticAnalyzerCore -lclangAnalysis -lclangARCMigrate -lclangRewrite -lclangRewriteFrontend -lclangEdit -lclangAST -lclangASTMatchers -lclangLex -lclangBasic -lclang -lLLVM-13 -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/bin/../lib/gcc/x86_64-linux-gnu/9/crtend.o /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crtn.o
JIT session error: Symbols not found: [ __dso_handle ]
clang interpreterFailed to materialize symbols: { (<main>, { $.hello.cxx.__inits.0, main, my_hook }) }

I'll see if other std namespace stuff also fails. It does not.

I'll see if there is a workaround for the iostreams issue.

gogins commented 2 years ago

This is starting to actually work. There are some issues:

gogins commented 2 years ago

https://stackoverflow.com/questions/68869921/lli-is-generating-run-time-error-for-clang-generated-ir-while-the-generated-ex

https://www.llvm.org/docs/MCJITDesignAndImplementation.html

Wow, this is wild. I need to track stuff like this: https://blog.llvm.org/posts/2021-03-25-cling-beyond-just-interpreting-cpp/

gogins commented 2 years ago

Okay, I'm beginning to understand the design of the compiler. With On-Request Compilation (ORC, the orc namespace), the IR is compiled and linked when a symbol is looked up! My adapted code is using the llvm::orc::SimpleCompiler defined here: https://llvm.org/doxygen/CompileUtils_8h_source.html.

Doesn't help with <iostream> though.
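
The lookup-triggered compilation described above can be pictured with a toy model. This is not the LLVM ORC API; `LazySymbolTable`, `define`, and `lookup` are hypothetical names, and the "materializer" callback merely stands in for compiling and linking IR. The point it illustrates is that nothing is built until a symbol is first looked up, and that a missing symbol only surfaces at that moment.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Toy model of on-request materialization; NOT the LLVM ORC API.
class LazySymbolTable {
public:
    using Address = int (*)(int);
    using Materializer = std::function<Address()>;

    // Registers a symbol without building it.
    void define(const std::string &name, Materializer materializer) {
        pending_[name] = std::move(materializer);
    }

    // Builds the symbol on first lookup, then serves it from the cache.
    Address lookup(const std::string &name) {
        auto ready = ready_.find(name);
        if (ready != ready_.end()) {
            return ready->second; // Already materialized.
        }
        auto pending = pending_.find(name);
        if (pending == pending_.end()) {
            throw std::runtime_error("Symbols not found: " + name);
        }
        Address address = pending->second(); // "Compile and link" now.
        ++materializations_;
        ready_[name] = address;
        pending_.erase(pending);
        return address;
    }

    int materializations() const { return materializations_; }

private:
    std::map<std::string, Materializer> pending_;
    std::map<std::string, Address> ready_;
    int materializations_ = 0;
};

// Stand-in for a function that would have been compiled at run time.
static int twice(int x) { return 2 * x; }
```

In this model a second lookup of the same name reuses the cached address instead of running the materializer again.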

gogins commented 2 years ago

There needs to be a way to define opcodes that does not mess with the existing opcode list, and another opcode that calls such defined opcodes. But this is complicated by the flexibility of opcode signatures.

I would like to have a polymorphic call... I've seen this so many times and implemented it myself once or twice. There's always a generic call that takes parameters that carry their own types. I don't like that here.

There could be polymorphism built into the opcode intypes and outtypes:

/* inarg types include the following:

   i       irate scalar
   k       krate scalar
   a       arate vector
   f       frequency variable
   w       spectral variable
   x       krate scalar or arate vector
   S       String
   T       String or i-rate
   U       String or i/k-rate
   B       Boolean k-rate
   b       Boolean i-rate; internally generated as required
   l       Label
   .       required arg of any-type
   and codes
   ?       optional arg of any-type
   m       begins an indef list of iargs (any count)
   M       begins an indef list of args (any count/rate i,k,a)
   N       begins an indef list of args (any count/rate i,k,a,S)
   o       optional i-rate, defaulting to  0
   p              "             "          1
   q              "             "         10
   v              "             "          .5
   j              "             "         -1
   h              "             "        127
   O       optional k-rate, defaulting to  0
   J              "             "         -1
   V              "             "          .5
   P              "             "          1
   W       begins indef list of Strings (any count)
   y       begins indef list of aargs (any count)
   z       begins indef list of kargs (any count)
   Z       begins alternating kakaka...list (any count)    */

/* outarg types include:
 i, k, a, S as  above
 *       multiple out args of any-type
 m       multiple out aargs
 z       multiple out kargs
 I       multiple out irate (not implemented yet)
 s       deprecated (use a or k as required)
 X       multiple args (a, k, or i-rate)     IV - Sep 1 2002
 N       multiple args (a, k, i, or S-rate)
 F       multiple args (f-rate)#
 */

So that could be intypes N and outtypes *. Presumably these are arrays of Csound types.
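
To make the idea of polymorphism through type codes concrete, here is a simplified sketch of matching actual argument types against an intypes string. It follows the table above for a subset of codes only and does no implicit rate promotion; both are assumptions made for brevity, and Csound's own matcher is richer. The names `accepted_by`, `begins_list`, and `intypes_match` are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Actual type codes accepted by one spec code, per the table above.
// Simplified subset; unknown codes accept nothing.
static std::string accepted_by(char spec) {
    switch (spec) {
    case 'i': return "i";
    case 'k': return "k";
    case 'a': return "a";
    case 'S': return "S";
    case 'x': return "ka";
    case 'T': return "Si";
    case 'U': return "Sik";
    case 'm': return "i";
    case 'M': return "ika";
    case 'N': return "ikaS";
    case 'W': return "S";
    case 'y': return "a";
    case 'z': return "k";
    default:  return "";
    }
}

// True for codes that begin an indefinite list of arguments.
static bool begins_list(char spec) {
    return std::string("mMNWyz").find(spec) != std::string::npos;
}

// True if the actual argument types (one code per argument, e.g. "Sika")
// satisfy the intypes specification (e.g. "SN").
static bool intypes_match(const std::string &intypes, const std::string &actual) {
    std::size_t arg = 0;
    for (std::size_t s = 0; s < intypes.size(); ++s) {
        const std::string allowed = accepted_by(intypes[s]);
        if (begins_list(intypes[s])) {
            // A list code consumes every remaining argument (any count).
            for (; arg < actual.size(); ++arg) {
                if (allowed.find(actual[arg]) == std::string::npos) return false;
            }
            return s + 1 == intypes.size(); // List code must come last.
        }
        if (arg >= actual.size()) return false; // Required argument missing.
        if (allowed.find(actual[arg]) == std::string::npos) return false;
        ++arg;
    }
    return arg == actual.size(); // No unmatched trailing arguments.
}
```

With an intypes string such as "SN", a call can pass an opcode name followed by any mix of i, k, a, and S arguments, which is the flexibility a generic calling opcode would need.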


gogins commented 2 years ago

Looks like I have to follow the example of the Faust opcodes.

I believe once a symbol is linked, it can be called anywhere in the process. That means I can define the opcode using clang_orc, but without calling csoundAppendOpcode. Then, a separate Clang opcode, say clang_opcall, can look up the relevant symbols, create and store a new instance of the actual opcode, and forward the clang_opcall::iopadr and clang_opcall::kopadr calls to the actual opcode.

* clang_opcall S_opcode_name, ????????????????????????????????????????

This will be easier if there is a standalone llvm:: symbol lookup function, but it may need the JIT to work. If so, I will create a singleton JIT.
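
The forwarding scheme can be sketched without any Csound or LLVM dependencies. Everything here is hypothetical: `ForwardedOpcode` stands in for the interface of a run-time compiled opcode, `opcode_factories()` stands in for symbol lookup in the JIT, and `OpcallForwarder` plays the role of clang_opcall. It is a sketch of the pattern under those assumptions, not of the real opcode structures.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical interface that a run-time compiled opcode would implement.
struct ForwardedOpcode {
    virtual ~ForwardedOpcode() = default;
    virtual int init() = 0;    // Stands in for the init-time call.
    virtual int kontrol() = 0; // Stands in for the performance-time call.
};

using OpcodeFactory = std::function<std::unique_ptr<ForwardedOpcode>()>;

// Stand-in for JIT symbol lookup: opcode name -> factory, process-wide.
static std::map<std::string, OpcodeFactory> &opcode_factories() {
    static std::map<std::string, OpcodeFactory> factories;
    return factories;
}

// Looks up the named opcode at init time, creates and stores an instance,
// then forwards every later call to that instance.
class OpcallForwarder {
public:
    explicit OpcallForwarder(std::string name) : name_(std::move(name)) {}

    int init() {
        auto found = opcode_factories().find(name_);
        if (found == opcode_factories().end()) return -1; // Unknown opcode.
        target_ = found->second();
        return target_->init();
    }

    int kontrol() {
        return target_ ? target_->kontrol() : -1; // -1 if never initialized.
    }

private:
    std::string name_;
    std::unique_ptr<ForwardedOpcode> target_;
};

// Example target, as if it had been compiled at run time.
struct Counter : ForwardedOpcode {
    int count = 0;
    int init() override { count = 0; return 0; }
    int kontrol() override { ++count; return 0; }
};
```

Because the registry is keyed by name, the existing opcode list is never touched; only the forwarder has to be a registered opcode.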

gogins commented 2 years ago

There needs to be a way to link or merge compiled modules. Oh, here it is: https://llvm.org/docs/ORCv2.html.

gogins commented 2 years ago

Could it be as simple as using a singleton ExecutionSession? That didn't seem straightforward, so I tried a singleton JITCompiler instead. That worked, but then I got multiply-defined symbol errors, which I have since fixed.
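
A minimal sketch of the singleton idea, assuming nothing about the real JITCompiler class. `SingletonJit` is a hypothetical toy that only tracks symbol names; it shows one possible way a process-wide instance could detect a second definition of the same symbol, which is not necessarily how the errors above were fixed.

```cpp
#include <set>
#include <string>

// Toy stand-in for a process-wide JIT; NOT the LLVM API.
// Not thread-safe after construction; a real one would need locking.
class SingletonJit {
public:
    // Function-local static: constructed once, on first use.
    static SingletonJit &instance() {
        static SingletonJit jit;
        return jit;
    }

    // Records a symbol name. Returns false if it was already defined,
    // which is where a real JIT would report a duplicate definition.
    bool define(const std::string &symbol) {
        return symbols_.insert(symbol).second;
    }

    bool is_defined(const std::string &symbol) const {
        return symbols_.count(symbol) != 0;
    }

    SingletonJit(const SingletonJit &) = delete;
    SingletonJit &operator=(const SingletonJit &) = delete;

private:
    SingletonJit() = default;
    std::set<std::string> symbols_;
};
```

Since every compiled module shares the one instance, a name such as a common entry point can only be defined once unless each module gets its own namespace.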

gogins commented 2 years ago

Possible solutions for appending new opcodes:

gogins commented 2 years ago

I think I will try this:

gogins commented 2 years ago

Right now, I have created an "augment_csdl" branch in the Csound repository. In this branch I will ensure that csdl.h provides the following functionality to all plugin clients, including of course clang_orc:

gogins commented 2 years ago

The code compiles, but it either crashes or runs only in Valgrind. The implication is memory corruption. I may have to try a simpler opcode, since the Faust guitar is pretty sophisticated. But the guitar, in Valgrind, works quite properly!

That proves the code logic is essentially correct.

I must find out where the bug is, and there are three tangled layers where it could be hiding: Csound, the Faust-generated C++ code, and my clang_orc opcode.

Valgrind reports problems in Csound, so maybe I should try fixing those. Note that Valgrind does not report errors from the LLVM layer, which must have been worked over pretty damn thoroughly. There are only a few errors from Csound:

==32989== Conditional jump or move depends on uninitialised value(s)
==32989==    at 0x724DB89: csoundModuleInit (rtalsa.c:1944)
==32989==    by 0x4A5E35F: csoundInitModule (csmodule.c:666)
==32989==    by 0x4A5E47A: csoundInitModules (csmodule.c:721)
==32989==    by 0x48CE78C: csoundReset (csound.c:3557)
==32989==    by 0x48CEFA9: csoundCreate (csound.c:1362)
==32989==    by 0x10A677: main (csound_main.c:322)
==32989== 
==32989== Conditional jump or move depends on uninitialised value(s)
==32989==    at 0x724DBB2: csoundModuleInit (rtalsa.c:1956)
==32989==    by 0x4A5E35F: csoundInitModule (csmodule.c:666)
==32989==    by 0x4A5E47A: csoundInitModules (csmodule.c:721)
==32989==    by 0x48CE78C: csoundReset (csound.c:3557)
==32989==    by 0x48CEFA9: csoundCreate (csound.c:1362)
==32989==    by 0x10A677: main (csound_main.c:322)
==32989== 
==32989== Conditional jump or move depends on uninitialised value(s)
==32989==    at 0x724DBCA: csoundModuleInit (rtalsa.c:1967)
==32989==    by 0x4A5E35F: csoundInitModule (csmodule.c:666)
==32989==    by 0x4A5E47A: csoundInitModules (csmodule.c:721)
==32989==    by 0x48CE78C: csoundReset (csound.c:3557)
==32989==    by 0x48CEFA9: csoundCreate (csound.c:1362)
==32989==    by 0x10A677: main (csound_main.c:322)
==32989== 
WARNING: could not open library '/usr/local/lib/csound/plugins64-6.0/libsterrain,.so' (/usr/local/lib/csound/plugins64-6.0/libsterrain,.so: undefined symbol: __pow_finite)
WARNING: could not open library '/usr/local/lib/csound/plugins64-6.0/libjsfx.so' (/usr/local/lib/csound/plugins64-6.0/libjsfx.so: undefined symbol: __pow_finite)
UnifiedCSD:  examples/trapped.csd
==32989== Conditional jump or move depends on uninitialised value(s)
==32989==    at 0x724DB89: csoundModuleInit (rtalsa.c:1944)
==32989==    by 0x4A5E35F: csoundInitModule (csmodule.c:666)
==32989==    by 0x4A5E47A: csoundInitModules (csmodule.c:721)
==32989==    by 0x4A6050D: csoundCompileArgs (main.c:306)
==32989==    by 0x4A617CC: csoundCompile (main.c:567)
==32989==    by 0x10A68F: main (csound_main.c:326)
==32989== 
==32989== Conditional jump or move depends on uninitialised value(s)
==32989==    at 0x65A41FF: csoundModuleInit (rtpa.c:886)
==32989==    by 0x4A5E35F: csoundInitModule (csmodule.c:666)
==32989==    by 0x4A5E47A: csoundInitModules (csmodule.c:721)
==32989==    by 0x4A6050D: csoundCompileArgs (main.c:306)
==32989==    by 0x4A617CC: csoundCompile (main.c:567)
==32989==    by 0x10A68F: main (csound_main.c:326)
==32989== 

Fixed those.

Csound compiles virtually without warnings now, which is fantastic. The clang_orc opcode compiles without any warnings.

Or, I could do some static code analysis.

Or, I could debug (shudder).

gogins commented 2 years ago

But now I get:

==42415== Invalid read of size 8
==42415==    at 0x1082A5B5: clang_opcode_t::init(CSOUND_*) (in /home/mkg/csound-extended/Opcodes/clang_orc/clang_orc_opcode.so)
==42415==    by 0x48E470A: init0 (insert.c:255)
==42415==    by 0x48F007F: musmon (musmon.c:301)
==42415==    by 0x4A613DB: csoundStart (main.c:562)
==42415==    by 0x10A68F: main (csound_main.c:326)
==42415==  Address 0xe779a40 is 71,616 bytes inside a block of size 71,696 free'd
==42415==    at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==42415==    by 0x48EBB49: mfree (memalloc.c:173)
==42415==    by 0x4B3CF9A: free_instrtxt (csound_orc_compile.c:973)
==42415==    by 0x4B3E2A6: engineState_merge (csound_orc_compile.c:1537)
==42415==    by 0x4B3E4A5: merge_state (csound_orc_compile.c:1574)
==42415==    by 0x4B40DD2: csoundCompileTreeInternal (csound_orc_compile.c:1857)
==42415==    by 0x4B417B2: csoundCompileOrcInternal (csound_orc_compile.c:1952)
==42415==    by 0x4985165: compile_str_i (compile_ops.c:73)
==42415==    by 0x48E470A: init0 (insert.c:255)
==42415==    by 0x48F007F: musmon (musmon.c:301)
==42415==    by 0x4A613DB: csoundStart (main.c:562)
==42415==    by 0x10A68F: main (csound_main.c:326)
==42415==  Block was alloc'd at
==42415==    at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==42415==    by 0x48EB956: mcalloc (memalloc.c:113)
==42415==    by 0x48E3ED8: instance (insert.c:2389)
==42415==    by 0x48E463D: init0 (insert.c:239)
==42415==    by 0x48F007F: musmon (musmon.c:301)
==42415==    by 0x4A613DB: csoundStart (main.c:562)
==42415==    by 0x10A68F: main (csound_main.c:326)
==42415== 

gogins commented 2 years ago

Is there a problem with merging global variables in orchestra code?

gogins commented 2 years ago

The upshot is that the code crashes outside Valgrind but runs properly inside it, albeit with errors such as the one above. I can't figure this out. It feels as though memory that was allocated gets deallocated and causes a crash outside Valgrind, while inside Valgrind that memory is kept around and lets the code run without crashing. That would be consistent with memcheck holding freed blocks in a queue instead of reusing them immediately.

The two compilers are fighting each other. I will try the polymorphic opcode approach. LLVM's findSymbol should solve this. For this to work I have to hoist the compiler out of the opcode, so that it can be used by the other opcode.

gogins commented 2 years ago

Could be a bug in Csound's engineState_merge. But there is no way for the new opcode to come into scope until the instr0 init pass has completed or compilestr is called and its result merged.

However, the merge is happening before the instr0 compilation has completed.

I suppose I could try pushing this into a separate instrument, but I'm not sure that really helps the problem. But it's easy to try.

gogins commented 2 years ago

I have moved the Clang opcodes to their own repository.