Open zezke opened 8 years ago
Thanks for the report! I believe this behavior is the same as in the original version of Egret (https://sites.google.com/site/zhangh1982/egret), so please check with the original author.
On Fri, Oct 2, 2015 at 4:25 AM, Bram Vandewalle notifications@github.com wrote:
When I compile egret and run the following command:
$ ./egret -lapcfg -i=testeng.txt -data=eng_grammar
I get this output:
( (NP^g (NP^g (NN story)) (PP^g (IN of) (NP^g (NN man)))))
( (NP^g (NP^g (DT the) (NN story)) (PP^g (IN of) (NP^g (NN man)))))
( (S^g (NP^g (NP^g (DT the) (NN story)) (PP^g (IN of) (NP^g (NN man)))) (VP^g (VBZ bites) (NP^g (NN dog)))))
( (S^g (NP^g (NP^g (DT the) (NN man)) (PP^g (IN of) (NP^g (NP^g (DT the) (NN story)) (PP^g (IN of) (NP^g (NN man)))))) (VP^g (VBZ bites) (NP^g (NN dog)))))
( (S^g (NP^g (DT the) (NN man)) (VP^g (@VP^g (VBZ bites) (NP^g (NP^g (DT the) (NN story)) (PP^g (IN of) (NP^g (NN man))))) (PP^g (IN like) (NP^g (NN dog))))))
( (S^g (NP^g (DT the) (NN dog)) (VP^g (VBZ bites) (NP^g (NP^g (DT the) (NN bone)) (PP^g (IN of) (NP^g (DT a) (NN man)))))))
( (NP^g (NP^g (DT the) (NN dog)) (PP^g (IN like) (NP^g (NP^g (DT the) (NN bone)) (PP^g (IN of) (NP^g (DT a) (NN man)))))))
( (NP^g (NP^g (DT the) (NN man)) (PP^g (IN like) (NP^g (DT a) (NN dog)))))
( (S^g (NP^g (DT the) (NN man)) (VP^g (VBZ bites) (NP^g (DT a) (NN dog)))))
( (S^g (NP^g (DT a) (NN man)) (VP^g (@VP^g (VBZ gives) (NP^g (DT the) (NN dog))) (NP^g (DT a) (NN bone)))))
all time:18.6939s rule loading time:17.2935s
init binary rule time:0.155634s init unary rule time:0.079887s
middle binary rule time:0.256627s middle unary rule time0.384605s
final binary rule time:0.036625s final unary rule time0.160108s
set unary node time:0s query time:0s inside binart rule ti
As you can see there are extra @ and ^g characters in the phrase structure output. I believe this originates from the grammar files. An excerpt to show what I mean:
VP^g_0 -> VBP_0 ADVP^g_0 8.071531778304136E-4
VP^g_0 -> VBP_0 FRAG^g_0 6.974824753857925E-6
VP^g_0 -> VBP_0 NP^g_0 0.0129917721169252
VP^g_0 -> VBP_0 PP^g_0 0.0038893146592574646
VP^g_0 -> VBP_0 PRN^g_0 1.3731370047197033E-5
VP^g_0 -> VBP_0 PRT^g_0 1.3078595240373827E-4
VP^g_0 -> VBP_0 RB_0 1.4833211062545537E-4
VP^g_0 -> VBP_0 SBAR^g_0 0.007899304990428138
VP^g_0 -> VBP_0 SINV^g_0 6.673484317759013E-6
VP^g_0 -> VBP_0 S^g_0 0.005709293376162579
VP^g_0 -> VBP_0 UCP^g_0 1.0074575609880492E-4
VP^g_0 -> VBP_0 VP^g_0 0.02150708714627632
VP^g_0 -> VBZ_0 0.015932094778774906
This file already contains the extra characters. Could this be an error in the uploaded files?
— Reply to this email directly or view it on GitHub https://github.com/neubig/egret/issues/2.
I've contacted the original author and will update this issue if any progress is made.
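In the meantime, the extra markers can be removed by post-processing the parser output. Below is a minimal Python sketch, not part of Egret. It assumes that `^g` is a label suffix added by the grammar's annotation and that nodes whose label starts with `@` (such as `@VP^g`) are intermediate nodes introduced by binarization, so their children can be spliced into the parent. Neither assumption has been confirmed by the grammar's author, and the function names are made up for illustration.

```python
import re


def parse_tree(text):
    """Parse one bracketed tree into nested lists: [label, child, ...].

    Leaves (words) are plain strings. The outermost node of the parser
    output has no label, so it is stored with an empty-string label.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        pos += 1  # skip "("
        if tokens[pos] in ("(", ")"):
            node = [""]  # unlabeled node, e.g. the outer "( ... )"
        else:
            node = [tokens[pos]]
            pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(read())
            else:
                node.append(tokens[pos])
                pos += 1
        pos += 1  # skip ")"
        return node

    return read()


def clean(node):
    """Return a list of cleaned nodes that replace `node`.

    Assumption: "@..." nodes are binarization artifacts, so they are
    dropped and their children are handed up to the parent.
    Assumption: a trailing "^g" is an annotation and can be stripped.
    """
    if isinstance(node, str):
        return [node]
    children = []
    for child in node[1:]:
        children.extend(clean(child))
    label = node[0]
    if label.startswith("@"):
        return children
    return [[re.sub(r"\^g$", "", label)] + children]


def to_string(node):
    """Serialize nested lists back to the bracketed format."""
    if isinstance(node, str):
        return node
    return "(" + node[0] + " " + " ".join(to_string(c) for c in node[1:]) + ")"


def strip_annotations(line):
    """Clean one line of parser output."""
    return to_string(clean(parse_tree(line))[0])
```

Applied to each tree line of the output above (the timing lines should be skipped), `strip_annotations` yields plain labels such as `NP` and `VP`. Note that splicing `@VP^g` nodes makes the resulting trees non-binary.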