Open yegor256 opened 2 weeks ago
@volodya-lombrozo please, check this out. I found it here: https://github.com/objectionary/hone-maven-plugin/actions/runs/10663836285/job/29553850114
@volodya-lombrozo btw, jeo:disassemble generates this (which is correct):
...
<o abstract="" name="j$foo-KClE">
<o base="int" data="bytes" line="592052178" name="access">00 00 00 00 00 00 00 00</o>
<o base="string" data="bytes" line="859086335" name="descriptor">28 29 44</o>
<o base="string" data="bytes" line="821201258" name="signature"/>
...
@yegor256 I'm still trying to reproduce the problem. But to be honest, <o base="org.eolang.bytes" data="bytes"/> isn't such a good object. What should this originally be? A null value? A 0 value? If it's an object, it's OK, but if it's a primitive value like int or byte, we can't allow it.
@volodya-lombrozo this EO code (it's an empty byte array):
send --
must render in XMIR as such:
<o base="send">
<o base="org.eolang.bytes" data="bytes"/>
</o>
It's perfectly legal.
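A minimal sketch of what "legal" could mean for a consumer, assuming the hex text is space-separated as in the snippets above. The helper name parse_bytes is hypothetical and is not part of JEO; the point is only that a self-closing element decodes to zero bytes instead of raising an error:

```python
import xml.etree.ElementTree as ET


def parse_bytes(element: ET.Element) -> bytes:
    """Decode the space-separated hex text of a data="bytes" element."""
    text = (element.text or "").strip()
    # A self-closing element carries no text: treat it as an empty byte array.
    if not text:
        return b""
    return bytes(int(pair, 16) for pair in text.split())


empty = ET.fromstring('<o base="org.eolang.bytes" data="bytes"/>')
descriptor = ET.fromstring(
    '<o base="string" data="bytes" name="descriptor">28 29 44</o>'
)
```

Here parse_bytes(empty) yields an empty bytes object, while parse_bytes(descriptor) yields the three bytes of the descriptor.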
@yegor256 I've finally checked the entire logs of this job: https://github.com/objectionary/hone-maven-plugin/actions/runs/10663836285/job/29553850114
It seems that you completely skip the opeo step.
From logs:
....
jeo:disassemble
....
eo:xmir-to-phi
....
eo:phi-to-xmir
....
jeo:assemble
....
Here is the problem: xmir-to-phi, instead of just printing PHI expressions, also dramatically changes the original XMIR ("optimises" it), so the new representation of XMIR isn't suitable for jeo.
@volodya-lombrozo yes, this is true, but the problem still exists: jeo:assemble must understand this XMIR correctly. (at least because jeo:disassemble produces it 😄 )
@yegor256 It's not jeo:disassemble that produces it. eo:xmir-to-phi and eo:phi-to-xmir produce it. If you run your pipeline without them:
....
jeo:disassemble
....
....
jeo:assemble
....
You will see that everything works fine (without exceptions).
@volodya-lombrozo check out this repo, run mvn clean test, and then see the content of the target/simple-app/target/generated-sources/ directory. In the jeo-disassemble/ directory, you will find the Hello.xmir file with this line inside: <o base="string" data="bytes" line="1094858495" name="signature"/>. If I'm not mistaken, this file is generated by jeo:disassemble.
Regardless of who generates it, an empty byte-array is a legal thing in EO (just like empty strings are legal in other languages). JEO should not fail on them.
@yegor256
git clone git@github.com:objectionary/hone-maven-plugin.git
cd hone-maven-plugin
mvn clean test
[ERROR] Tests run: 7, Failures: 2, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
find . -name "Hello.xmir"
<nothing is found>
@volodya-lombrozo I'm getting this:
$ find . -name Hello.xmir
./target/simple-app/target/generated-sources/unphi/Hello.xmir
./target/simple-app/target/generated-sources/jeo-disassemble/Hello.xmir
@volodya-lombrozo do you have docker installed? You should, in order to get the output I'm getting.
@yegor256 Yes, I do have it:
docker --version
Docker version 20.10.11, build dea9396
@volodya-lombrozo anyway, let's skip the part "why this happens in XMIR?" Let's just make JEO not fail on this XMIR, since it's a valid XMIR. How it's generated and why - shouldn't matter.
@yegor256 Maybe it's a valid XMIR, but it has an invalid format for jeo.
@volodya-lombrozo we have only one XMIR format :) It must work for all tools. Empty byte-arrays are valid elements of EO/XMIR.
@volodya-lombrozo try to use this EO program:
# this is object.
[] > app
QQ.io.stdout > @
""
Then, compile it with eoc. You will get this app.xmir:
<?xml version="1.0" encoding="UTF-8"?>
<program dob="2024-05-22T13:54:22"
ms="84"
name="app"
revision="48e8be4"
source="/Volumes/sec/code/tmp/eo/app.eo"
time="2024-09-02T13:22:41.335753Z"
version="0.38.2"><!--This is XMIR - a dialect of XML, which is used to present a parsed EO program. For more information please visit https://news.eolang.org/2022-11-25-xmir-guide.html-->
<listing># this is object.
[] > app
QQ.io.stdout > @
""
</listing>
<errors>
<error check="comment-length-check" line="1" severity="warning">Comment must be at least 64 characters long</error>
<error check="comment-start-character-check" line="1" severity="warning">Comment must start with capital letter</error>
</errors>
<sheets/>
<license/>
<metas/>
<objects>
<o abstract="" line="2" name="app" pos="0">
<o base="QQ" line="3" pos="2"/>
<o base=".io" line="3" method="" pos="4"/>
<o base=".stdout" line="3" method="" name="@" pos="7">
<o base="string" data="bytes" line="4" pos="4"/>
</o>
</o>
</objects>
</program>
See the line:
<o base="string" data="bytes" line="4" pos="4"/>
@yegor256 I'm not against jeo understanding the <o base="string" data="bytes" line="4" pos="4"/> line :)
Moreover, I use such lines extensively in the project:
git clone git@github.com:objectionary/jeo-maven-plugin.git
cd jeo-maven-plugin
mvn clean install
grep -r -E '<o base="string" data="bytes" line="[^"]*" name="signature"/>' .
@volodya-lombrozo please, fix the bug then. I can't proceed with the HONE plugin because of it. Also, keep in mind that HONE plugin is going to use this pipeline for now: Bytecode -> JEO -> Normalizer -> JEO -> Bytecode. No OPEO involvement for now.
@yegor256 How? There is no bug in jeo. It's a bug in phi/unphi that breaks the format of XMIR. Let me try to explain it differently.
jeo:disassemble generates the following "signature":
<o base="string" data="bytes" line="727201432" name="signature"/>
phi/unphi transforms this signature into something like the following (approximately):
<o base=".string" name="signature">
<o base=".eolang">
<o base=".org">
<o base="Q"/>
</o>
</o>
<o as="0" base=".bytes">
<o base=".eolang">
<o base=".org">
<o base="Q"/>
</o>
</o>
<o base="org.eolang.bytes" data="bytes"/>
</o>
</o>
jeo:assemble still expects <o base="string" data="bytes" line="727201432" name="signature"/>
Or maybe I don't understand something?
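To make the mismatch concrete, here is a rough sketch of a reader that accepts both shapes shown above. It assumes the payload always lives in the single element marked data="bytes", either the element itself (flat form) or a descendant (nested form). The name read_bytes is hypothetical and does not describe how jeo:assemble is implemented:

```python
import xml.etree.ElementTree as ET

FLAT = '<o base="string" data="bytes" line="727201432" name="signature"/>'
NESTED = """
<o base=".string" name="signature">
  <o base=".eolang"><o base=".org"><o base="Q"/></o></o>
  <o as="0" base=".bytes">
    <o base=".eolang"><o base=".org"><o base="Q"/></o></o>
    <o base="org.eolang.bytes" data="bytes"/>
  </o>
</o>
"""


def read_bytes(element: ET.Element) -> bytes:
    """Return the byte payload of either the flat or the nested form."""
    if element.get("data") == "bytes":
        holder = element
    else:
        # Nested form: the payload is the first descendant marked data="bytes".
        holder = element.find(".//o[@data='bytes']")
    if holder is None:
        raise ValueError("no data='bytes' element found")
    # Missing text means an empty byte array; fromhex tolerates the spaces.
    return bytes.fromhex((holder.text or "").strip())
```

Both read_bytes(ET.fromstring(FLAT)) and read_bytes(ET.fromstring(NESTED)) produce the same empty value, so the two shapes carry the same information.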
@volodya-lombrozo now I understand. The problem is that jeo:disassemble uses the old representation of objects in XMIR. This is not valid:
<o base="string" data="bytes" line="1739939615" name="descriptor">28 29 56</o>
It should be this:
<o base=".string" name="descriptor">
<o base=".eolang">
<o base=".org">
<o base="Q"/>
</o>
</o>
<o as="0" base=".bytes">
<o base=".eolang">
<o base=".org">
<o base="Q"/>
</o>
</o>
<o base="org.eolang.bytes" data="bytes">28 29 56</o>
</o>
</o>
This is how it looks in EO:
Q.org.eolang.string > descriptor
Q.org.eolang.bytes:0
28-29-56
We switched to this new format a few minor versions of EO ago. Now, we can't do base="string" or base="int", only base="org.eolang.bytes" and base="Q".
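The rewrite from the old representation to the new one can be sketched as follows. This is an illustration only: the function names to_new_format and eolang_package are hypothetical, and dropping the line attribute is an assumption taken from the example above, not a documented rule:

```python
import xml.etree.ElementTree as ET


def eolang_package() -> ET.Element:
    """Build the Q.org.eolang chain: .eolang -> .org -> Q."""
    eolang = ET.Element("o", {"base": ".eolang"})
    org = ET.SubElement(eolang, "o", {"base": ".org"})
    ET.SubElement(org, "o", {"base": "Q"})
    return eolang


def to_new_format(old: ET.Element) -> ET.Element:
    """Wrap an old-style <o base="string" data="bytes"> into the nested form."""
    attrs = {"base": "." + old.get("base")}
    if old.get("name") is not None:
        attrs["name"] = old.get("name")
    new = ET.Element("o", attrs)  # assumption: the line attribute is dropped
    new.append(eolang_package())
    wrapper = ET.SubElement(new, "o", {"as": "0", "base": ".bytes"})
    wrapper.append(eolang_package())
    data = ET.SubElement(wrapper, "o", {"base": "org.eolang.bytes", "data": "bytes"})
    data.text = old.text  # None stays None, so empty arrays stay self-closing
    return new


old = ET.fromstring(
    '<o base="string" data="bytes" line="1739939615" name="descriptor">28 29 56</o>'
)
new = to_new_format(old)
```

Calling ET.tostring(new, encoding="unicode") prints the nested element for inspection.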
@yegor256 Do you have documentation on how XMIR elements should look? Maybe a specification, decisions made, or at least something that I can use to develop these transformations?
Moreover, I need a clear understanding of more than just the "initial" representation of XMIR. As I've already mentioned, the eo-maven-plugin applies many different optimizations that change this XMIR dramatically.
https://github.com/objectionary/jeo-maven-plugin/issues/687#issuecomment-2324297355
When you do phi/unphi, the eo-maven-plugin actually applies these optimizations.
https://github.com/objectionary/eo/issues/3257#issuecomment-2272882054
I don’t think you can predict how XMIR will look after these optimizations. Do you expect me to predict it?
Moreover, tomorrow you could invent an entirely different "view" on XMIR and how some objects should be represented, and all my findings might be thrown into the bin. I've already done this "rewriting" many times:
https://github.com/objectionary/jeo-maven-plugin/issues/656 https://github.com/objectionary/jeo-maven-plugin/issues/627
That's only in jeo-maven-plugin. I'm not even mentioning opeo-maven-plugin.
So, to be honest, I’m extremely frustrated with these requirements. And even if you insist on them and decide to "rewrite" it again, I can do it, but you should be aware that it might take a significant amount of time.
@volodya-lombrozo I understand. How about we make JEO more strict by writing documentation for it? Basically, we must explain to the users of JEO output what they can do with it while keeping it understandable by JEO. If you publish such documentation, we will make sure our normalizations/optimizations send to JEO only the XMIR it is prepared for.
@yegor256 I agree. What do you think if we move a bit further and invent some sort of XSD for it?
@volodya-lombrozo XSD is a rather weak format, it won't allow you to define complex rules. It's mostly about types of data. Maybe a collection of XPath assertions (both "positive" that must be in the incoming XMIR and "negative" that are not allowed). Just an idea.
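A rough sketch of how such a collection of assertions might look, using the limited XPath subset of Python's standard library. The rule lists below are purely hypothetical placeholders derived from this discussion; they are not an agreed contract for JEO input:

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: what must be present and what must not appear.
POSITIVE = [".//o[@base='org.eolang.bytes'][@data='bytes']"]
NEGATIVE = [
    ".//o[@base='string'][@data='bytes']",
    ".//o[@base='int'][@data='bytes']",
]


def violations(root: ET.Element) -> list:
    """Return a human-readable list of broken assertions, empty if all pass."""
    problems = []
    for xpath in POSITIVE:
        if root.find(xpath) is None:
            problems.append(f"missing: {xpath}")
    for xpath in NEGATIVE:
        if root.find(xpath) is not None:
            problems.append(f"forbidden: {xpath}")
    return problems


accepted = ET.fromstring(
    '<objects><o base="org.eolang.bytes" data="bytes"/></objects>'
)
rejected = ET.fromstring(
    '<objects><o base="string" data="bytes" name="signature"/></objects>'
)
```

With these placeholder rules, violations(accepted) is empty, while violations(rejected) reports one missing and one forbidden pattern. More complex rules would need a full XPath engine, since the standard library supports only a subset.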
I'm getting this after the following element in XMIR:
This line (<o base="org.eolang.bytes" data="bytes"/>), I believe, is a legal representation of an empty chain of bytes. It must be correctly processed by JEO.