Open volodya-lombrozo opened 6 days ago
@yegor256 @maxonfjvipon Could you take a look, please?
@volodya-lombrozo as a general rule, don't use `base="abc"`, since there is no such thing as `abc` in the global scope. Inside `base` you may have either `Q`, `$`, or something that starts with a dot. @maxonfjvipon maybe we should make this a strict rule for all XMIR files?
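The rule above can be sketched as a small check. This is only an illustration: the class `BaseRule` and the method `isAllowed` are hypothetical names, and accepting anything prefixed with `Q.` or `$.` (not only the bare `Q` and `$`) is an assumption layered on top of the rule as stated.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: checks a value of the XMIR "base" attribute against the rule above. */
final class BaseRule {
    // Accepts "Q", "$", names prefixed with "Q." or "$." (an assumption),
    // or a method-style name that starts with a dot.
    private static final Pattern ALLOWED = Pattern.compile("(Q|\\$)(\\..+)?|\\..+");

    static boolean isAllowed(String base) {
        return base != null && ALLOWED.matcher(base).matches();
    }

    public static void main(String[] args) {
        for (String base : new String[] {"Q", "$", "Q.org.eolang.seq", ".seq", "abc", "seq"}) {
            System.out.println(base + " -> " + isAllowed(base));
        }
    }
}
```

With this sketch, `abc` and `seq` are rejected, while `Q`, `$`, and `.seq` are accepted.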
@volodya-lombrozo the XMIR in the "was" section is what we get after the parsing step. It's raw and not ready for any manipulations, including conversion to phi. It does not have object FQNs starting with `Q` or `$`, it has some secondary attributes (like `star`, which helps to indicate the place where we should unwrap a tuple), and so on. That's why `PhiMojo` includes the special flag `phiOptimize`, which is `true` by default and which makes XMIR ready for conversion to phi by applying the `optimize` step.
So, in such a scenario the XMIRs won't ever be equal (and actually they shouldn't be).
I think you can try to include the `optimize` step explicitly in your pipeline and disable `phiOptimize` on the `xmir-to-phi` step.
bytecode -> (disassemble) `xmir` -> `optimize` -> `phi` -> `unphi` -> `xmir` (assemble) -> bytecode.
Here you can try to compare the XMIRs after the `optimize` and `unphi` steps. They should be the same (but I'm not sure).
@maxonfjvipon
> That's why `PhiMojo` includes the special flag `phiOptimize`, which is `true` by default and which makes XMIR ready for conversion to phi by applying the `optimize` step.

If that is so, it means that `PhiMojo` already optimises `xmir`. What does the following mean then?
> I think you can try to include the `optimize` step explicitly in your pipeline and disable `phiOptimize` on the `xmir-to-phi` step.

`phi` already has an optimization, so the current pipeline already looks like the following:
bytecode -> (disassemble) xmir -> (optimize) phi -> unphi -> xmir (assemble) -> bytecode.
What do you mean by "include optimize step explicitly to your pipeline"? And why should I disable `phiOptimize`?
Also, please note that it's not "my" pipeline. I got it from here. The integration test for `jeo-maven-plugin` only tries to reproduce the same steps.
@maxonfjvipon I can try to generate "optimised" `xmir`, but in this case I have to know what to generate. Any spec, description, or something similar is needed. Also, as I see it, "optimised" `xmir` is extremely hard to read and understand, so I'm afraid it might create some problems in the future.
@volodya-lombrozo
> What do you mean by "include optimize step explicitly to your pipeline"? And why should I disable phiOptimize?

Now your pipeline is: bytecode -> (disassemble) xmir ->>> (optimize + to-phi) phi -> unphi ->>> xmir (assemble) -> bytecode.
Triple arrows `->>>` mark the places where you take XMIRs to compare. So here you compare the XMIR after disassembling with the XMIR after `optimize` -> `phi` -> `unphi`. That is why they are different.
> `PhiMojo` already optimises `xmir`

`PhiMojo` includes the `optimize` step, but for you it looks like one `phi` step.
The suggested pipeline: bytecode -> (disassemble) xmir -> (optimize) xmir ->>> (no optimize) phi -> unphi ->>> xmir (assemble) -> bytecode.
Here you explicitly call the `optimize` step and take the XMIR after it (1); then you do `phi` without optimization, then `unphi`, and take the resulting XMIR (2).
Then you can compare XMIR (1) and XMIR (2). They should be the same and your test should work.
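A minimal sketch of such a comparison of XMIR (1) and XMIR (2), using only the JDK XML parser. It is not part of `eo-maven-plugin` or `jeo-maven-plugin`: the class `XmirCompare` is a hypothetical name, and treating `line`, `pos`, and `as` as cosmetic attributes that may differ is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

/** Hypothetical helper: structural comparison of two XMIR fragments. */
final class XmirCompare {
    // Attributes ignored during comparison (an assumption, not an EO rule).
    private static final Set<String> IGNORED = Set.of("line", "pos", "as");

    static boolean same(String left, String right) {
        return same(parse(left), parse(right));
    }

    private static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception ex) {
            throw new IllegalArgumentException("Not a well-formed XML document", ex);
        }
    }

    // Two elements are equal if tag, non-ignored attributes and ordered children are equal.
    private static boolean same(Element a, Element b) {
        if (!a.getTagName().equals(b.getTagName())) return false;
        if (!attributes(a).equals(attributes(b))) return false;
        List<Element> first = children(a);
        List<Element> second = children(b);
        if (first.size() != second.size()) return false;
        if (first.isEmpty()) {
            // Leaf objects may carry data (e.g. bytes) as text.
            return a.getTextContent().trim().equals(b.getTextContent().trim());
        }
        for (int i = 0; i < first.size(); i++) {
            if (!same(first.get(i), second.get(i))) return false;
        }
        return true;
    }

    private static Map<String, String> attributes(Element element) {
        Map<String, String> result = new TreeMap<>();
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            if (!IGNORED.contains(attr.getNodeName())) {
                result.put(attr.getNodeName(), attr.getNodeValue());
            }
        }
        return result;
    }

    private static List<Element> children(Element parent) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) result.add((Element) n);
        }
        return result;
    }

    public static void main(String[] args) {
        String one = "<o base=\"org.eolang.seq1\" line=\"7\"><o base=\"opcode\"/></o>";
        String two = "<o base=\"org.eolang.seq1\" line=\"9\"><o as=\"0\" base=\"opcode\"/></o>";
        System.out.println(same(one, two));
    }
}
```

A difference in `base` (for example `seq` versus `.seq`) is reported as a mismatch, while differences in the ignored attributes are not.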
@volodya-lombrozo
> I can try to generate "optimised" xmir, but in this case I have to know "what" to generate. Any spec, description or something similar is needed.

You just need to call the `optimize` goal of `eo-maven-plugin`.
@maxonfjvipon Why do I need to compare this `xmir`? What is the point of this comparison?
Ok, let me explain the problem one more time:
The `jeo-maven-plugin` should be able to "understand" the output from `phi/unphi`; this is the main goal. Currently, the `jeo-maven-plugin` can only understand a particular fixed `xmir` structure:
(disassemble) xmir
That's it. If I apply any optimisation (at any point), this `xmir` becomes a mess, and the `jeo-maven-plugin` cannot understand it.
I'm looking for a way to make `jeo-maven-plugin` understand the output from `phi/unphi`.
I see only two paths for now:
1) `jeo-maven-plugin` generates and accepts an already optimised `xmir`.
2) `phi/unphi` doesn't optimise the fixed parts of `xmir` generated by the `jeo-maven-plugin`.

@volodya-lombrozo `PhiMojo` can't work with non-optimized XMIR because phi calculus is a stricter structure than EO or XMIR. So either `jeo-maven-plugin` should learn to work with optimized XMIR, or we can try to make one more mojo that would translate optimized XMIR to non-optimized XMIR (looking similar to XMIR after disassembling), but I'm not sure it should be part of `eo-maven-plugin`. @yegor256 WDYT?
@volodya-lombrozo let's stop using `tuple` in JEO output and only use `seqXX`. Thus, your output will look like this:
<o abstract="" name="object@init@-KClW">
  <o base="seq48" name="@">
    <o base="opcode" line="1913064591"/>
    <o base="opcode" line="1913064591"/>
    <o base="opcode" line="1913064591"/>
    <o base="opcode" line="1913064591"/>
    ...
  </o>
</o>
Here, `48` is the number of objects encapsulated by it. We don't have varargs in EO, but we can have many `seq` objects: `seq13`, `seq444`, etc.
This will make your output very close to the output of `unphi`, and you will easily tune it to make them 100% identical.
@yegor256 Sure. At least it's a good first step. Thank you. As soon as I implement this, I will send you the updated files. However, I want to emphasise that we still have the main problem with "dot notation":
<o base="seq" name="@">
</o>
vs.
<o base=".seq" name="@">
</o>
It would be great to understand the semantic difference between both of them
@volodya-lombrozo first of all, as Yegor said above, there is no such object as `seq` in the global scope. All object FQNs start with either `Q` or `$`. Here `Q` is the global scope and `$` is the scope of the current abstract object.
So `seq` will get a prefix anyway: `org.eolang` or `Q.org.eolang`. So your simple object `seq` is converted to the sequence of dispatches (method calls) `Q.org.eolang.seq`.
Globally, there are two different notations for dispatching that are used in EO:
# horizontal
org.eolang.seq
Q.org.eolang.seq
$.instructions
org .eolang .seq
Q .org .eolang .seq
$ .instructions
In XMIR it may look like:
```xml
<o base="org.eolang.seq"/>
<!-- OR -->
<o base="Q.org.eolang.seq"/>
<!-- OR -->
<o base="Q"/>
<o base=".org" method=""/>
<o base=".eolang" method=""/>
<o base=".seq" method=""/>
<!-- OR -->
<o base="$"/>
<o base=".instructions" method=""/>
```
At the level of XMIR such notation is not canonical, and during `optimize` it will be transformed into the second one.
seq.
eolang.
org
seq. eolang. org. Q
instructions. $
It works the same as the direct notation, but it looks like a sequence of applications. At the level of EO, every object that is used as a method ENDS with a dot.
In XMIR it looks like:
```xml
<o base=".seq">
  <o base=".eolang">
    <o base=".org">
      <o base="Q"/>
    </o>
  </o>
</o>
<!-- OR -->
<o base=".seq">
  <o base=".eolang">
    <o base="org"/>
  </o>
</o>
<!-- OR -->
<o base=".instructions">
  <o base="$"/>
</o>
```
Here all objects that are used as methods START with a dot.
Such notation is canonical. All XMIRs returned by `unphi` are 100% in reversed notation. So if you teach JEO to generate/understand XMIR in such notation, you'll succeed.
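To illustrate the reversed notation, here is a minimal sketch that turns a dotted FQN into the nested form shown above. The class `ReversedNotation` and the method `toReversed` are hypothetical names, not part of JEO or EO, and the sketch assumes names contain no characters that need XML escaping.

```java
/** Hypothetical helper: renders a dotted FQN as XMIR in reversed notation. */
final class ReversedNotation {
    static String toReversed(String fqn) {
        String[] parts = fqn.split("\\.");
        // The first part ("Q", "$" or a plain name such as "org") is the innermost object.
        String xml = "<o base=\"" + parts[0] + "\"/>";
        // Every following part is a method: its base starts with a dot
        // and it wraps everything built so far.
        for (int i = 1; i < parts.length; i++) {
            xml = "<o base=\"." + parts[i] + "\">" + xml + "</o>";
        }
        return xml;
    }

    public static void main(String[] args) {
        System.out.println(toReversed("Q.org.eolang.seq"));
        System.out.println(toReversed("$.instructions"));
    }
}
```

For `Q.org.eolang.seq` this produces the same nesting as the first XMIR example above, only without whitespace.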
@volodya-lombrozo So this is how JEO should print in order not to get into trouble:
<o base=".seq48">
  <o base=".eolang">
    <o base=".org">
      <o base="Q"/>
    </o>
  </o>
  <o base="opcode" line="1913064591"/>
  <o base="opcode" line="1913064591"/>
  <o base="opcode" line="1913064591"/>
  <o base="opcode" line="1913064591"/>
  ...
</o>
@yegor256 @maxonfjvipon Now `jeo-maven-plugin` uses `seq` objects instead of `tuple` objects: https://github.com/objectionary/jeo-maven-plugin/issues/707. So, I ran the `phi/unphi` integration test again and got the following exception:
As you can see, the exception message is still rather cryptic. I couldn't grasp what the problem is. However, the problem is still on the `eo-maven-plugin` side.
Updated files for each transformation stage:
App.phi.txt App.xmir.disassemble.txt App.xmir.unphi.txt
Hope, it will help.
@maxonfjvipon Could you send the example of `xmir` from the seminar?
@volodya-lombrozo here it is:
<o base="Q.org.eolang.seq48">
<o base="Q.org.eolang.jeo.opcode" line="1913064591"/>
<o base="Q.org.eolang.jeo.opcode" line="1913064591"/>
<o base="Q.org.eolang.jeo.opcode" line="1913064591"/>
<o base="Q.org.eolang.jeo.opcode" line="1913064591"/>
...
</o>
@maxonfjvipon I should add the `Q.org.eolang` package for all the `eolang` objects (like `seq`) and the `Q.org.eolang.jeo` package for `jeo` objects, like `opcode`, am I right?
@volodya-lombrozo let's just add `org.eolang` or `org.eolang.jeo` without `Q`
@maxonfjvipon All of the rest is ok?
@volodya-lombrozo I think so
@maxonfjvipon I've tried to add the `org.eolang` and `org.eolang.jeo` packages to each generated object, in accordance with the previous comment.
I attach the files generated by `phi/unphi` (with packages):
App.phi.txt
App.xmir.disassemble.txt
App.xmir.unphi.txt
Please, let me know if they are suitable for you. Blocks: https://github.com/objectionary/jeo-maven-plugin/pull/713
@volodya-lombrozo I took your `App.xmir.unphi.txt` after `unphi`, printed it to EO, parsed it with the EO parser, and applied `unroll-bases.xsl` (which is in development) 5 times to the resulting XMIR. Here's what I've got:
Is it enough for you? Please let me know if there's something wrong.
@maxonfjvipon I will take a closer look later. However, even now I see that the resulting `unroll.xml` is different from what we have in the original `xmir`. For example, opcodes are placed in the wrong place:
<o base="org.eolang.seq31" line="215" name="@" pos="4"/>
The `seq31` name suggests that we should have `31` opcodes inside, but as you can see, it's an empty object.
@volodya-lombrozo how about this? unroll.xmir.txt
@maxonfjvipon Again, we have such an excerpt from the `unroll.xmir.txt` file:
<o base="org.eolang.seq6" line="77" name="@" pos="4">
<o as="0" base="org.eolang.jeo.label" line="82" pos="6">
<o as="0" base="bytes" data="bytes" line="83" pos="8">32 35 36 62 30 35 61 31 2D 32 30 30 39 2D 34 34 61 39 2D 39 39 64 39 2D 64 63 61 30 66 61 33 39 39 33 62 39</o>
</o>
<o as="1" base="aload-1" line="85" pos="6"/>
<o as="2" base="invokespecial-2" line="87" pos="6"/>
<o as="3" base="return-3" line="89" pos="6"/>
<o as="4" base="org.eolang.jeo.label" line="94" pos="6">
<o as="0" base="bytes" data="bytes" line="95" pos="8">38 66 65 30 31 66 35 61 2D 65 34 65 61 2D 34 66 36 34 2D 62 39 31 34 2D 37 37 31 65 31 61 64 37 35 38 38 37</o>
</o>
</o>
As you can see, we have the `seq6` object as a top element. `6` suggests that we must have exactly six objects inside of it. However, we have only five of them. So "unrolling" either moved some opcodes into a separate place or removed them entirely. You can take the `App.xmir.disassemble.txt` file as a golden `xmir`. Ideally you would get exactly the same `xmir` after unrolling.
I understand that in the process you can add some attributes like `as="4"`, and that's fine, but when you move some components across the file, it becomes a problem: I just don't know where to find them.
@volodya-lombrozo FYI, in your "golden" file `seq6` has only 5 inner `<o>` objects and `seq31` has only 27; please count them manually and you'll see.
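The manual count can also be automated. Below is a minimal sketch using only the JDK XML parser. The class `SeqCheck` is a hypothetical name, not part of JEO or EO, and the handling of reversed notation (the first child of `.seqN` is taken to be the dispatch target, not an argument) is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical helper: finds seqN objects whose number of inner objects is not N. */
final class SeqCheck {
    // Matches "seq48", "org.eolang.seq48", "Q.org.eolang.seq48" and ".seq48".
    private static final Pattern SEQ = Pattern.compile("(?:.*\\.)?seq(\\d+)");

    static List<String> mismatches(String xmir) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmir.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception ex) {
            throw new IllegalArgumentException("Not a well-formed XML document", ex);
        }
        List<String> problems = new ArrayList<>();
        NodeList all = doc.getElementsByTagName("o");
        for (int i = 0; i < all.getLength(); i++) {
            Element obj = (Element) all.item(i);
            String base = obj.getAttribute("base");
            Matcher matcher = SEQ.matcher(base);
            if (!matcher.matches()) continue;
            int declared = Integer.parseInt(matcher.group(1));
            int actual = directChildren(obj);
            // Assumption: in reversed notation the first child is the dispatch target.
            if (base.startsWith(".") && actual > 0) actual--;
            if (declared != actual) {
                problems.add(base + ": declared " + declared + ", found " + actual);
            }
        }
        return problems;
    }

    // Counts only direct <o> children, not nested descendants.
    private static int directChildren(Element parent) {
        int count = 0;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && "o".equals(n.getNodeName())) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String xmir = "<o base=\"org.eolang.seq3\"><o base=\"a\"/><o base=\"b\"/></o>";
        mismatches(xmir).forEach(System.out::println);
    }
}
```

Running such a check on both the disassembled and the unrolled files would show whether a mismatch is already present in the input or introduced by a later step.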
@maxonfjvipon Thank you! You helped me to find the bug. Fixed it: App.xmir.disassemble.txt
@volodya-lombrozo in your example you have
<o base="org.eolang.jeo.int" name="access" data="bytes" line="1656645624">00 00 00 00 00 00 00 03</o>
After `unphi` and my XSLs it looks like:
<o base="org.eolang.jeo.int" line="33" name="access" pos="4">
<o as="0" base="bytes" data="bytes" line="34" pos="6">00 00 00 00 00 00 00 04</o>
</o>
Is it ok for you, or do I need to convert it back too?
@maxonfjvipon It would be super-cool. By doing this you would be able to move forward much faster with other issues and optimizations. I wouldn't block you in this case. Otherwise you will need to wait until I finish with `bytes` support: https://github.com/objectionary/jeo-maven-plugin/issues/715
@volodya-lombrozo how about this one: unroll.xmir.txt
What else am I missing?
@volodya-lombrozo so this is how it should work:
1) `unphi`
2) you convert the resulting XMIR to EO via `PrintMojo` or the `Xmir` object
3) `eo-parser` (`ParseMojo` or the `EoSyntax` class)
4) the following transformations:
new TrJoined<>(
    new TrClasspath<>(
        "/org/eolang/parser/wrap-method-calls.xsl"
    ).back(),
    new TrDefault<>(
        new StEndless(
            new StClasspath(
                "/org/eolang/parser/roll-bases.xsl"
            )
        )
    ),
    new TrClasspath<>(
        "/org/eolang/parser/add-refs.xsl",
        "/org/eolang/parser/vars-float-down.xsl",
        "/org/eolang/parser/roll-data.xsl"
    ).back()
)
All the transformations are: wrap-method-calls.xsl.txt roll-bases.xsl.txt add-refs.xsl.txt vars-float-down.xsl.txt roll-data.xsl.txt
After applying this pipeline to your example, I got the XMIR from the comment above.
I run the following integration test:
bytecode -> (disassemble) `xmir` -> `phi` -> `unphi` -> `xmir` (assemble) -> bytecode.
And this test fails because `phi/unphi` alter the original `xmir` file.

Steps to reproduce:
1) I generate `App.xmir` from the `App.class` file (`jeo:disassemble`): App.xmir.disassemble.txt
2) Then I use `eo:0.39.0:xmir-to-phi` to generate `App.phi`: App.phi.txt
3) Then I run `eo:0.39.0:phi-to-xmir` to generate `App.xmir` (and it is generated): App.xmir.unphi.txt

Expected behaviour:
The `App.xmir.disassemble.txt` and `App.xmir.unphi.txt` files should be the same. In other words, `phi/unphi` does not have to change the original `xmir` file.

Actual behaviour:
The `App.xmir.disassemble.txt` and `App.xmir.unphi.txt` files are different. `phi/unphi` significantly changes the original `xmir`.

Details:
Was:
Became:
App.phi.txt App.xmir.disassemble.txt App.xmir.unphi.txt