vnmakarov / mir

A lightweight JIT compiler based on MIR (Medium Internal Representation) and C11 JIT compiler and interpreter based on MIR
MIT License

WASM-->MIR or LLVM IR-->MIR #66

Open xwang98 opened 4 years ago

xwang98 commented 4 years ago

It is great to find this project! We are currently developing JIT/AOT support for the WebAssembly Micro Runtime project (https://github.com/bytecodealliance/wasm-micro-runtime) based on LLVM. LLVM's very slow compilation is a significant issue for some use cases. MIR seems to be a good solution to our problem.

We see that a quick path would be to convert the LLVM IR that we generate from WASM into MIR, and leverage MIR for fast compilation and code generation. So the question is: what is the progress of the LLVM IR to MIR conversion?

Later we could convert WASM directly to MIR if everything goes well.

vnmakarov commented 4 years ago

> We see that a quick path would be to convert the LLVM IR that we generate from WASM into MIR, and leverage MIR for fast compilation and code generation. So the question is: what is the progress of the LLVM IR to MIR conversion?

This work is postponed right now. The LLVM IR to MIR translator works in simple cases, but a full implementation requires translating LLVM IR structures and vectors, and this needs some translator refactoring. Currently I am busy with other projects. Maybe I'll return to the LLVM IR to MIR translator in about 2 months.

For my goals (Ruby JIT) I need a C to MIR translator, and c2mir (a C to MIR compiler implemented from scratch) works for me.

> Later we could convert WASM directly to MIR if everything goes well.

I think it is more double approach than WASM->LLVM->MIR approach because usage of LLVM is overkill and because LLVM IR is not stable.

You could also try the WASM->C->MIR approach, where c2mir is used for the C->MIR conversion.

The MIR project is still at an early stage of development. There is not even a release yet (I am planning to make the first one this summer).

Another issue for you could be that MIR currently supports only two architectures, x86-64 and aarch64. I am working on a ppc64 port, which I guess will be ready in 2-3 weeks. Porting MIR to a 32-bit target (like i386 or arm) would be a bigger challenge.

xwang98 commented 4 years ago

Thank you! @vnmakarov

> I think it is more double approach than WASM->LLVM->MIR approach because usage of LLVM is overkill and because LLVM IR is not stable.

Could you please explain a bit more about it? I am not sure I totally understand it. You don't think WASM-->MIR is a better approach than WASM-->LLVM-->MIR, do you?

WASM already has good community support for compiling multiple frontend languages. I see that MIR would be very helpful for WASM runtimes to run WASM with a JIT. Then you would not even need to convert C to MIR, since clang supports C-->WASM pretty well.

The use case that has a problem with LLVM compilation speed is on x86-64. In this case we need to compile a large WASM file under a latency requirement.

We will watch your project and look forward to your first release this summer :-)

vnmakarov commented 4 years ago

> Could you please explain a bit more about it? I am not sure I totally understand it. You don't think WASM-->MIR is a better approach than WASM-->LLVM-->MIR, do you?

Sorry if I was not clear. I meant that direct translation of WASM to MIR is a better approach than going through LLVM IR. LLVM IR changes from release to release, and using it would require effort to maintain a WASM->LLVM IR->MIR translator. It is not easy to compile LLVM IR to MIR either. I started this work but it is far from finished.

> WASM already has good community support for compiling multiple frontend languages. I see that MIR would be very helpful for WASM runtimes to run WASM with a JIT. Then you would not even need to convert C to MIR, since clang supports C-->WASM pretty well.

I already have a C to MIR translator (please see the c2mir directory). It was easier for me to implement it than it would be to do the same for an LLVM IR translator. Of course, the C I implemented does not have all the extensions that GNU C or Clang have. But it is enough for my purposes.

I think currently you could translate WASM to standard C and then use c2mir (it can be used as a library too, and it is pretty small) to get MIR and then JITed code. So it would look like WASM->C->MIR. C2MIR is pretty fast, at least faster than GCC and Clang. So this way of getting MIR might satisfy your latency requirements. I guess this approach is easier than a direct WASM to MIR compiler, although I would say a WASM to MIR translator should be easy too, as the languages are very similar.
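To make the "bytecode to C" step of WASM->C->MIR concrete, here is a minimal sketch of the idea. It is not real WASM and not part of MIR or c2mir: the opcodes, the `insn_t` type, and the `bytecode_to_c` function are all hypothetical names invented for this illustration. It assumes a toy stack bytecode with integer values only and turns each stack slot into an element of a local C array, so the generated function is plain straight-line C.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy stack bytecode standing in for WASM (not real WASM opcodes). */
typedef enum { OP_CONST, OP_GET_ARG, OP_ADD, OP_MUL, OP_RETURN } opcode_t;

typedef struct {
  opcode_t op;
  int operand; /* constant value or argument index; ignored by ADD/MUL/RETURN */
} insn_t;

/* Small append-only text buffer used while generating the C source. */
typedef struct { char *buf; size_t size, len; int overflow; } out_t;

static void emit(out_t *o, const char *fmt, ...) {
  va_list ap;
  va_start(ap, fmt);
  int n = vsnprintf(o->buf + o->len, o->size - o->len, fmt, ap);
  va_end(ap);
  if (n < 0 || (size_t)n >= o->size - o->len)
    o->overflow = 1; /* remember that the output was truncated */
  else
    o->len += (size_t)n;
}

/* Translate n instructions into the C source of `int name(int a0, ...)`.
   Returns the length of the generated text, or -1 if buf is too small. */
int bytecode_to_c(const insn_t *code, int n, const char *name, int nargs,
                  char *buf, size_t size) {
  out_t o = {buf, size, 0, 0};
  int depth = 0, max_depth = 0;

  /* Pass 1: check stack discipline and find how many slots are needed. */
  for (int i = 0; i < n; i++) {
    switch (code[i].op) {
    case OP_CONST: case OP_GET_ARG: depth++; break;
    case OP_ADD: case OP_MUL: assert(depth >= 2); depth--; break;
    case OP_RETURN: assert(depth >= 1); break;
    }
    if (depth > max_depth) max_depth = depth;
  }

  /* Function header and the array that models the operand stack. */
  emit(&o, "int %s(", name);
  if (nargs == 0) emit(&o, "void");
  for (int i = 0; i < nargs; i++) emit(&o, "%sint a%d", i ? ", " : "", i);
  emit(&o, ") {\n");
  if (max_depth > 0) emit(&o, "  int s[%d];\n", max_depth);

  /* Pass 2: one C statement per bytecode instruction. */
  depth = 0;
  for (int i = 0; i < n; i++) {
    switch (code[i].op) {
    case OP_CONST:
      emit(&o, "  s[%d] = %d;\n", depth, code[i].operand);
      depth++;
      break;
    case OP_GET_ARG:
      assert(code[i].operand >= 0 && code[i].operand < nargs);
      emit(&o, "  s[%d] = a%d;\n", depth, code[i].operand);
      depth++;
      break;
    case OP_ADD:
    case OP_MUL:
      depth--; /* pop two operands, push one result */
      emit(&o, "  s[%d] = s[%d] %c s[%d];\n", depth - 1, depth - 1,
           code[i].op == OP_ADD ? '+' : '*', depth);
      break;
    case OP_RETURN:
      emit(&o, "  return s[%d];\n", depth - 1);
      break;
    }
  }
  emit(&o, "}\n");
  return o.overflow ? -1 : (int)o.len;
}
```

For the instruction sequence `GET_ARG 0, CONST 3, ADD, GET_ARG 1, MUL, RETURN`, this produces a C function computing `(a0 + 3) * a1`. In a WASM->C->MIR pipeline, text like this would then be handed to a C front end such as c2mir; the exact c2mir library interface is not shown here and should be taken from the c2mir sources. A real translator would also need control flow, memory, and the other WASM value types, which this sketch leaves out.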

> The use case that has a problem with LLVM compilation speed is on x86-64. In this case we need to compile a large WASM file under a latency requirement.

OK. If you need only x86-64, that is good.

vnmakarov commented 4 years ago

I've just found the mistake. Instead of "doable approach" I typed "double approach". This is probably the source of misunderstanding.

xwang98 commented 4 years ago

> I've just found the mistake. Instead of "doable approach" I typed "double approach". This is probably the source of misunderstanding.

Yes, I was guessing you might mean some other word.

xwang98 commented 4 years ago

> I think currently you could translate WASM to standard C and then use c2mir (it can be used as a library too, and it is pretty small) to get MIR and then JITed code. So it would look like WASM->C->MIR. C2MIR is pretty fast, at least faster than GCC and Clang. So this way of getting MIR might satisfy your latency requirements. I guess this approach is easier than a direct WASM to MIR compiler, although I would say a WASM to MIR translator should be easy too, as the languages are very similar.

It is unusual to see bytecode converted to C, and we might not follow that route. We will evaluate the WASM-->MIR path. I am still curious: is there any good library for generating C from other IRs?

jasl commented 4 years ago

> > I think currently you could translate WASM to standard C and then use c2mir (it can be used as a library too, and it is pretty small) to get MIR and then JITed code. So it would look like WASM->C->MIR. C2MIR is pretty fast, at least faster than GCC and Clang. So this way of getting MIR might satisfy your latency requirements. I guess this approach is easier than a direct WASM to MIR compiler, although I would say a WASM to MIR translator should be easy too, as the languages are very similar.

> It is unusual to see bytecode converted to C, and we might not follow that route. We will evaluate the WASM-->MIR path. I am still curious: is there any good library for generating C from other IRs?

vnmakarov helped CRuby implement its JIT via Ruby bytecode -> C code -> (GCC) -> ASM. I think that's why he decided to introduce MIR, to make the flow more efficient.

This looks like a tricky way, but Ruby MJIT has proved it is possible.

jasl commented 4 years ago

BTW, my friend introduced me to wasm2c today: "Convert wasm files to C source and header".