Closed julianthome closed 5 years ago
Hi @sjiya, I have renamed this issue ;). Please always feel free to create new issues that match your problem description. The inmemantlr tree is pretty generic and contains all the information you are looking for regarding the grammar rules and the matched tokens. However, if you want to obtain the reduced parse tree you are asking for, I would execute the following steps:
Thanks for prompt reply. plz check attached code and generated tree. I am unable to print AST from this code. I would appreciate if you suggest changes in my code. TreeTest.zip
Thanks @sjiya, would it be also possible for you to share the grammar you have used in the example?
I am attaching Grammar file here.
Regards, Sharaf Hussain
On Fri, Feb 8, 2019 at 12:00 PM Julian Thomé notifications@github.com wrote:
Thanks @sjiya https://github.com/sjiya, would it be also possible for you to share the grammar you have used in the example?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/julianthome/inmemantlr/issues/35#issuecomment-461710042, or mute the thread https://github.com/notifications/unsubscribe-auth/ATUL2oduP7trGhXcEMwEwkOKSmLC55VYks5vLSCBgaJpZM4XTxF2 .
Hi @sjiya, your code seems works on my machine ... this is the output:
Tree Node Size: 64
graph {
node [fontname=Helvetica,fontsize=11];
edge [fontname=Helvetica,fontsize=10];
n1 [label="root",shape=ellipse];
n2 [label="math",shape=ellipse];
n3 [label="relation",shape=ellipse];
n4 [label="relation",shape=ellipse];
n5 [label="expr",shape=ellipse];
n6 [label="additive",shape=ellipse];
n7 [label="mp",shape=ellipse];
n8 [label="unary",shape=ellipse];
n9 [label="postfix",shape=ellipse];
n10 [label="exp",shape=ellipse];
n11 [label="comp",shape=ellipse];
n12 [label="atom",shape=ellipse];
n11 -- n12;
n10 -- n11;
n9 -- n10;
n8 -- n9;
n7 -- n8;
n6 -- n7;
n5 -- n6;
n4 -- n5;
n3 -- n4;
n13 [label="relation",shape=ellipse];
n14 [label="expr",shape=ellipse];
n15 [label="additive",shape=ellipse];
n16 [label="mp",shape=ellipse];
n17 [label="unary",shape=ellipse];
n18 [label="postfix",shape=ellipse];
n19 [label="exp",shape=ellipse];
n20 [label="comp",shape=ellipse];
n21 [label="frac",shape=ellipse];
n22 [label="expr",shape=ellipse];
n23 [label="additive",shape=ellipse];
n24 [label="mp",shape=ellipse];
n25 [label="unary",shape=ellipse];
n26 [label="unary",shape=ellipse];
n27 [label="postfix",shape=ellipse];
n28 [label="exp",shape=ellipse];
n29 [label="comp",shape=ellipse];
n30 [label="atom",shape=ellipse];
n29 -- n30;
n28 -- n29;
n27 -- n28;
n26 -- n27;
n31 [label="postfix",shape=ellipse];
n32 [label="exp",shape=ellipse];
n33 [label="comp",shape=ellipse];
n34 [label="atom",shape=ellipse];
n33 -- n34;
n32 -- n33;
n31 -- n32;
n26 -- n31;
n35 [label="postfix",shape=ellipse];
n36 [label="exp",shape=ellipse];
n37 [label="comp",shape=ellipse];
n38 [label="func",shape=ellipse];
n39 [label="expr",shape=ellipse];
n40 [label="additive",shape=ellipse];
n41 [label="additive",shape=ellipse];
n42 [label="mp",shape=ellipse];
n43 [label="unary",shape=ellipse];
n44 [label="postfix",shape=ellipse];
n45 [label="exp",shape=ellipse];
n46 [label="exp",shape=ellipse];
n47 [label="comp",shape=ellipse];
n48 [label="atom",shape=ellipse];
n47 -- n48;
n46 -- n47;
n45 -- n46;
n49 [label="atom",shape=ellipse];
n45 -- n49;
n44 -- n45;
n43 -- n44;
n42 -- n43;
n41 -- n42;
n40 -- n41;
n50 [label="additive",shape=ellipse];
n51 [label="mp",shape=ellipse];
n52 [label="unary",shape=ellipse];
n53 [label="postfix",shape=ellipse];
n54 [label="exp",shape=ellipse];
n55 [label="comp",shape=ellipse];
n56 [label="atom",shape=ellipse];
n55 -- n56;
n54 -- n55;
n53 -- n54;
n52 -- n53;
n51 -- n52;
n50 -- n51;
n40 -- n50;
n39 -- n40;
n38 -- n39;
n37 -- n38;
n36 -- n37;
n35 -- n36;
n26 -- n35;
n25 -- n26;
n24 -- n25;
n23 -- n24;
n22 -- n23;
n21 -- n22;
n57 [label="expr",shape=ellipse];
n58 [label="additive",shape=ellipse];
n59 [label="mp",shape=ellipse];
n60 [label="unary",shape=ellipse];
n61 [label="postfix",shape=ellipse];
n62 [label="exp",shape=ellipse];
n63 [label="comp",shape=ellipse];
n64 [label="atom",shape=ellipse];
n63 -- n64;
n62 -- n63;
n61 -- n62;
n60 -- n61;
n59 -- n60;
n58 -- n59;
n57 -- n58;
n21 -- n57;
n20 -- n21;
n19 -- n20;
n18 -- n19;
n17 -- n18;
n16 -- n17;
n15 -- n16;
n14 -- n15;
n13 -- n14;
n3 -- n13;
n2 -- n3;
n1 -- n2;
}
However, I noticed that you are using this fragment in your code:
try {
sgrammar = new FileInputStream("/home/sharaf/antlr4/LatexParser/src/LG.g4");
sgrammarcontent = FileUtils.getStringFromStream(sgrammar);
}finally {}
I would suggest to catch the exception and (at least) print its in the catch block. Maybe it couldn't find the .g4.
file so that sgrammarcontent
is empty but since you don't catch the exception here, the program continues using an empty grammar.
I am getting the same output like yours. but you can observe in your output that equation elements are not appears in the tree. For example, x, =, -, b, \pm, .. are not in the output.
Plz find the attached code that is generating AST. AST.zip
Hi @sjiya, oh I see. Yes, per default, Terminals are not included in the parse tree. You basically have two possiblities:
1) Pass true
to the constructor of the default tree listener DefaultTreeListener t = new DefaultTreeListener(true);
so that terminals are getting included
or
2) Use the getEidx
, getSidx
members of the parse tree nodes (leafs) in order to retrieve the terminal from the string:
https://github.com/julianthome/inmemantlr/blob/23d1cb322a3500b286d9a2639efb2c044f2ea48f/inmemantlr-api/src/main/java/org/snt/inmemantlr/tree/ParseTreeNode.java#L240
When you go for option 1, you should get the following output `
graph {
node [fontname=Helvetica,fontsize=11];
edge [fontname=Helvetica,fontsize=10];
n1 [label="root",shape=ellipse];
n2 [label="math",shape=ellipse];
n3 [label="relation",shape=ellipse];
n4 [label="relation",shape=ellipse];
n5 [label="expr",shape=ellipse];
n6 [label="additive",shape=ellipse];
n7 [label="mp",shape=ellipse];
n8 [label="unary",shape=ellipse];
n9 [label="postfix",shape=ellipse];
n10 [label="exp",shape=ellipse];
n11 [label="comp",shape=ellipse];
n12 [label="atom",shape=ellipse];
n13 [label="x",shape=box];
n12 -- n13;
n11 -- n12;
n10 -- n11;
n9 -- n10;
n8 -- n9;
n7 -- n8;
n6 -- n7;
n5 -- n6;
n4 -- n5;
n3 -- n4;
n14 [label="=",shape=box];
n3 -- n14;
n15 [label="relation",shape=ellipse];
n16 [label="expr",shape=ellipse];
n17 [label="additive",shape=ellipse];
n18 [label="mp",shape=ellipse];
n19 [label="unary",shape=ellipse];
n20 [label="postfix",shape=ellipse];
n21 [label="exp",shape=ellipse];
n22 [label="comp",shape=ellipse];
n23 [label="frac",shape=ellipse];
n24 [label="\frac",shape=box];
n23 -- n24;
n25 [label="\{",shape=box];
n23 -- n25;
n26 [label="expr",shape=ellipse];
n27 [label="additive",shape=ellipse];
n28 [label="mp",shape=ellipse];
n29 [label="unary",shape=ellipse];
n30 [label="\-",shape=box];
n29 -- n30;
n31 [label="unary",shape=ellipse];
n32 [label="postfix",shape=ellipse];
n33 [label="exp",shape=ellipse];
n34 [label="comp",shape=ellipse];
n35 [label="atom",shape=ellipse];
n36 [label="b ",shape=box];
n35 -- n36;
n34 -- n35;
n33 -- n34;
n32 -- n33;
n31 -- n32;
n37 [label="postfix",shape=ellipse];
n38 [label="exp",shape=ellipse];
n39 [label="comp",shape=ellipse];
n40 [label="atom",shape=ellipse];
n41 [label="\pm",shape=box];
n40 -- n41;
n39 -- n40;
n38 -- n39;
n37 -- n38;
n31 -- n37;
n42 [label="postfix",shape=ellipse];
n43 [label="exp",shape=ellipse];
n44 [label="comp",shape=ellipse];
n45 [label="func",shape=ellipse];
n46 [label="\sqrt",shape=box];
n45 -- n46;
n47 [label="\{",shape=box];
n45 -- n47;
n48 [label="expr",shape=ellipse];
n49 [label="additive",shape=ellipse];
n50 [label="additive",shape=ellipse];
n51 [label="mp",shape=ellipse];
n52 [label="unary",shape=ellipse];
n53 [label="postfix",shape=ellipse];
n54 [label="exp",shape=ellipse];
n55 [label="exp",shape=ellipse];
n56 [label="comp",shape=ellipse];
n57 [label="atom",shape=ellipse];
n58 [label="b",shape=box];
n57 -- n58;
n56 -- n57;
n55 -- n56;
n54 -- n55;
n59 [label="\^",shape=box];
n54 -- n59;
n60 [label="atom",shape=ellipse];
n61 [label="2",shape=box];
n60 -- n61;
n54 -- n60;
n53 -- n54;
n52 -- n53;
n51 -- n52;
n50 -- n51;
n49 -- n50;
n62 [label="\-",shape=box];
n49 -- n62;
n63 [label="additive",shape=ellipse];
n64 [label="mp",shape=ellipse];
n65 [label="unary",shape=ellipse];
n66 [label="postfix",shape=ellipse];
n67 [label="exp",shape=ellipse];
n68 [label="comp",shape=ellipse];
n69 [label="atom",shape=ellipse];
n70 [label="4ac",shape=box];
n69 -- n70;
n68 -- n69;
n67 -- n68;
n66 -- n67;
n65 -- n66;
n64 -- n65;
n63 -- n64;
n49 -- n63;
n48 -- n49;
n45 -- n48;
n71 [label="\}",shape=box];
n45 -- n71;
n44 -- n45;
n43 -- n44;
n42 -- n43;
n31 -- n42;
n29 -- n31;
n28 -- n29;
n27 -- n28;
n26 -- n27;
n23 -- n26;
n72 [label="\}",shape=box];
n23 -- n72;
n73 [label="\{",shape=box];
n23 -- n73;
n74 [label="expr",shape=ellipse];
n75 [label="additive",shape=ellipse];
n76 [label="mp",shape=ellipse];
n77 [label="unary",shape=ellipse];
n78 [label="postfix",shape=ellipse];
n79 [label="exp",shape=ellipse];
n80 [label="comp",shape=ellipse];
n81 [label="atom",shape=ellipse];
n82 [label="2a",shape=box];
n81 -- n82;
n80 -- n81;
n79 -- n80;
n78 -- n79;
n77 -- n78;
n76 -- n77;
n75 -- n76;
n74 -- n75;
n23 -- n74;
n83 [label="\}",shape=box];
n23 -- n83;
n22 -- n23;
n21 -- n22;
n20 -- n21;
n19 -- n20;
n18 -- n19;
n17 -- n18;
n16 -- n17;
n15 -- n16;
n3 -- n15;
n2 -- n3;
n1 -- n2;
}
Thanks julian, option 1 is working fine and its getting result with equation elements. but I am bit confuse about option 2. how do I get equation element without chain of rules?? for example, I don't want following rules from n1-n12 in tree because the are use less in my case.
n1 [label="root",shape=ellipse]; n2 [label="math",shape=ellipse]; n3 [label="relation",shape=ellipse]; n4 [label="relation",shape=ellipse]; n5 [label="expr",shape=ellipse]; n6 [label="additive",shape=ellipse]; n7 [label="mp",shape=ellipse]; n8 [label="unary",shape=ellipse]; n9 [label="postfix",shape=ellipse]; n10 [label="exp",shape=ellipse]; n11 [label="comp",shape=ellipse]; n12 [label="atom",shape=ellipse]; n13 [label="x",shape=box];
The second option is interesting if you just want to extract the terminals from the input string. As an example you could do this with:
String quad = "x=\\frac{-b \\pm \\sqrt{b^2-4ac}}{2a}";
// ...
DefaultTreeListener t = new DefaultTreeListener();
// ...
cast.getLeafs().stream().forEach(
l -> System.out.println("\"" + quad.substring(l.getSidx(), l.getEidx()+1) + "\"")
);
This works because the leafs are always directly linked to a matching token. In case to simplify your parse tree a bit by filtering certain rule, you can actually pass a filter to the constructor of the DefaultTreeListener
(https://github.com/julianthome/inmemantlr/blob/23d1cb322a3500b286d9a2639efb2c044f2ea48f/inmemantlr-api/src/main/java/org/snt/inmemantlr/listener/DefaultTreeListener.java#L81). Only the nodes that match the filter rule you provide, will be added to the parse tree. I think you could pass a filter that excludes the rules you've mentioned in your last post. This is described in the section Parse Tree Pruning
in the documentation.
Dear Julian, Thank you for solving various issues. now everything is working great.😃
On Mon, Feb 11, 2019 at 9:59 PM Julian Thomé notifications@github.com wrote:
The second option is interesting if you just want to extract the terminals from the string. As an example you could do this with:
String quad = "x=\frac{-b \pm \sqrt{b^2-4ac}}{2a}";// ...DefaultTreeListener t = new DefaultTreeListener();// ... cast.getLeafs().stream().forEach( l -> System.out.println("\"" + quad.substring(l.getSidx(), l.getEidx()+1) + "\"") );
This works because the leafs are always directly linked to a matching token. In case to simplify your parse tree a bit by filtering certain rule, you can actually pass a filter to the constructor of the DefaultTreeListener ( https://github.com/julianthome/inmemantlr/blob/23d1cb322a3500b286d9a2639efb2c044f2ea48f/inmemantlr-api/src/main/java/org/snt/inmemantlr/listener/DefaultTreeListener.java#L81). Only the nodes that match the filter rule you provide, will be added to the parse tree. I think you could pass a filter that excludes the rules you've mentioned in your last post.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/julianthome/inmemantlr/issues/35#issuecomment-462406210, or mute the thread https://github.com/notifications/unsubscribe-auth/ATUL2jRnDqAMS1lbwfGY793dZhGVmp_Tks5vMaF5gaJpZM4XTxF2 .
I want to generate Abstract Syntax Tree (AST) of a mathematical equation from my MathGrammar.g4 antlr file. I want a AST without grammar tags in the tree only I want the equation components in the tree. suppose, I have an equation x^2+y^2=z^2, so I want tree in this style
= ---(+)---(^) -------------(x) -------------(2) ---------(^) -------------(y) -------------(2) ---(^) ------(z) ------(2) (=(+(^(x)(2))(^(y)(2)))(^(z)(2))) How can I do it in inmemantlr system?