vnmakarov / mir

A lightweight JIT compiler based on MIR (Medium Internal Representation) and C11 JIT compiler and interpreter based on MIR
MIT License
2.31k stars 148 forks source link

Faust MIR backend: failure in JIT when woking in interpreter mode #56

Open sletz opened 4 years ago

sletz commented 4 years ago

We have automatic tests for Faust DSP.

This MIR module https://gist.github.com/sletz/75bdd717a67d9836cba01a934728d371 correctly works in interpreter mode, but fails (that is does not compute the correct sequence of samples) in JIT mode in x86_64. Testing it is the m2bgives a strange ln 549: undeclared name error.

The module uses mir_min and mir_max which simply branch instd::min and std::max functions.

The be more precise: the symptom is that in JIT mode the first generated samples are correct, but then gradually diverge from the correct values, as if some accumulating errors issues were occurring. Note that "double" type is used.

vnmakarov commented 4 years ago

Stehpane, thank you for reporting this. It is difficult for me to say now how hard will it fix. I'll work on this issues on this week and hope to fix the on this week too.

vnmakarov commented 4 years ago

We have automatic tests for Faust DSP.

This MIR module https://gist.github.com/sletz/75bdd717a67d9836cba01a934728d371 correctly works in interpreter mode, but fails (that is does not compute the correct sequence of samples) in JIT mode in x86_64. Testing it is the m2bgives a strange ln 549: undeclared name error.

The problem in processing the module is that an item (e.g. .lc1 or mir_min) is used before the declaration. It is still possible to put declaration after usage but a forward declaration is needed before the usage. The error message was not adequate to understand, so I've committed a patch improving the diagnostic.

The module uses mir_min and mir_max which simply branch instd::min and std::max functions.

The be more precise: the symptom is that in JIT mode the first generated samples are correct, but then gradually diverge from the correct values, as if some accumulating errors issues were occurring. Note that "double" type is used.

There are only the following used double insns:

dadd, dgt, dlt, dmov, dmul, dsub

I've checked the generated code for all these insns. The code looks ok for me. So it seems a complicated error.

Unfortunately, I does not have enough info to reproduce the error. It would be nice to have input data for compute function which I could use it to reproduce the error.

Stephane, I would really appreciate if you give me more info to reproduce the error.

sletz commented 4 years ago
sletz commented 4 years ago

I manually fixed the harpe.mir text file here https://gist.github.com/sletz/eddb41cd9d835921434295d49ce0f82d moving the functions declaration + global constant section at the beginning of the file. Now m2b can convert it.... So my understanding is that the MIR_output doe not produce the .mir file in the correct way?

vnmakarov commented 4 years ago

Now m2b can convert it.... So my understanding is that the MIR_output doe not produce the .mir file in the correct way?

Output MIR code should be be correct. So it is a bug. After some thought, I can see at least one situation when I can produce a wrong MIR textual code: it is a MIR simplification which happens right before MIR code generation or MIR interpretation and which creates data items (they have labels starting with .l) for some insn immediate data.

In any case, I'll work to fix this bug.

sletz commented 4 years ago

Well coding a simple C program to reproduces the problem is too hard: the MIR code is actually used in an mixed interpreter/JIT module and setting the infrastructure is too hairy ((-; I'll try to create a much simpler case.

sletz commented 4 years ago

I was able to prepare a somewhat simpler test:

You'll see that the series all output samples are the same up to the index 199, then start to diverge.

Note that using float instead of double (that is general the MIR module using float and adapting the test program) shows the same problem with the exact same values.

vnmakarov commented 4 years ago

I was able to prepare a somewhat simpler test:

Stephane, thank you very much for spending your time to help me. I've reproduced the bug and see the divergence exactly on frame 199 as you wrote.

I will work on fixing the bug and let you know as soon as the fix will be in the repository.

vnmakarov commented 4 years ago

I found the problem. A typical insn sequence in question looks like

        subs    binop4, load_i644, load_i643
        i2d     cast_real2, binop4

When the result of subs is negative -1 it has 64-bit representation xxxxxxxxffffffff where x means undefined value. For interpreter high 32-bit happens to be ffffffff. For the generated code it is 00000000. Therefore id2 result is -1.0 in the interpreter and 4294967295.0 in the generated code.

MIR is assumed to work on different targets where on some targets 32-bit insns does not affect high 32-bit part and on some targets the insns can zero high 32-bit etc. For efficient code generation, MIR has semantics that value of high 32-bit result of 32-bit insns is not defined.

According MIR semantics, you should use the following code instead of the above code:

        subs    binop4, load_i644, load_i643
        ext32   binop4, binop4
        i2d     cast_real2, binop4

Thank you again for providing the full code to reproduce the problem.

sletz commented 4 years ago

OK I see.

In my code integers are 32 bits integers. Does it means that all insns dealing with 32 bits integer (basically all integer binary operations) have to systematically add the ext32 insns to only keep the low 32 bits ? (that is each time after the binop operation code generation) ? Or only before an operation like i2d ?

vnmakarov commented 4 years ago

In my code integers are 32 bits integers. Does it means that all insns dealing with 32 bits integer (basically all integer binary operations) have to systematically add the ext32 insns to only keep the low 32 bits ? (that is each time after the binop operation code generation) ? Or only before an operation like i2d ?

When you need 64-bit input value (as for i2d) which is a result of 32-bit insn, the result value should be extended (signed or unsigned depending on what you need) because high 32-bit of the result value is not defined. But if you use result of 32-bit insn as an input of another 32-bit insn, the extension is not necessary because 32-bit insns use only lower 32-bit of 64-bit input operand.

Using only 64-bit integer insns would simplify MIR semantics a lot. It was an original design. Unfortunately it is hard to implement efficient code for 32-bit arithmetic with such model. It would require extensions for each input operand. So I introduced 32-bit insns later.

The more important reason for this was a possibility to generate efficient 32-bit arithmetic code for 32-bit targets. Although the world is moving to 64-bit architectures, it is still nice to have such possibility. Having 32-bit insns permits to implement efficient 32-bit arrithmetic code with a minor optimization which is currently absent. Having only 64-bit insns complicates this task a lot.

Please just remember that integer-type variables (even func args) are always 64-bit and it will make MIR-semantics more understandable. Long ago I thought about introducing 32-bit variables too but it would complicate MIR-semantics even more. In any case there is no perrfect IR design. Any solutions have own pros and cons. My major goal was to make MIR compact as possible still providing possibility to generate an efficient code for 32-bit and 64-bit targets.

sletz commented 4 years ago

Thanks. I fixed the code generator and it works now.