michnov closed this 9 years ago
Yes, they're not, because these are completely different nodes, and there is no guarantee that there will be the same number of them as before the parsing. During Alpino parsing, the whole tree is rebuilt from the Alpino output.
This will not be an easy fix. Is this crucial in any way? Can't you first parse and then set the wild attributes?
PS: Actually I forgot that it's even more complicated – the Alpino parse is loaded into a p-tree, which is then converted into an a-tree (that replaces the previous, flat a-tree).
Treex::Tool::PhraseParser::Alpino is stored in file https://github.com/ufal/treex/blob/08549e4210c0432b32f7375ff9e13a0974168bc3/lib/Treex/Tool/Alpino/Parser.pm#L1 Is it on purpose?
I am not sure whether the interface of this module is optimal ($parser->parse_zones($zones_rf)), but let's say it is OK (or legacy).
Maybe it could try to copy any wild attributes from the a-nodes to the newly created p-nodes on a best-effort basis (in case there is no 1-to-1 correspondence between the a-nodes and the p-node terminals).
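To make the best-effort idea concrete, here is a minimal sketch in Python rather than Perl. It is not Treex code. Tokens are plain dicts with a `form` and a `wild` dict (both stand-ins for real a-nodes and p-node terminals), and the function name `copy_wild_best_effort` is hypothetical. It aligns the two token sequences by word form and copies attributes only where the forms line up, so tokens that Alpino merged or split are simply skipped.

```python
from difflib import SequenceMatcher


def copy_wild_best_effort(old_tokens, new_tokens):
    """Copy 'wild' dicts from old tokens to new tokens whose forms align.

    Each token is a dict with a 'form' string and a 'wild' dict
    (hypothetical stand-ins for a-nodes and p-node terminals).
    Returns the number of new tokens that received attributes.
    """
    old_forms = [t["form"] for t in old_tokens]
    new_forms = [t["form"] for t in new_tokens]
    matcher = SequenceMatcher(a=old_forms, b=new_forms, autojunk=False)

    copied = 0
    # Each matching block is a run of identical forms in both sequences.
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            old = old_tokens[block.a + offset]
            new = new_tokens[block.b + offset]
            if old["wild"]:
                new["wild"].update(old["wild"])
                copied += 1
    return copied
```

Attributes on tokens that have no aligned counterpart are lost in this sketch. Whether that is acceptable depends on what the gazetteer block needs.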
Oops – the package name is a mistake. It should be Treex::Tool::Alpino::Parser. I will fix this (unless you have a different view).
The interface of this module is adapted from some other parser; I no longer remember which one. What would be the optimal way?
It definitely could try to copy the wild attributes – this is the "non-easy" fix ;-).
It's needed for gazetteers, particularly for W2A::GazeteerMatch, which in its current implementation writes into wild attributes and must be run just after tokenization, definitely before parsing (https://github.com/ufal/treex/blob/gazeteer/lib/Treex/Scen/Analysis/NL.pm#L31)
Copying wild attributes from the p-tree to a newly established a-tree is easy and already sorted out (https://github.com/ufal/treex/blob/master/lib/Treex/Block/P2A/NL/Alpino.pm#L205). The problem is how to copy the attributes from the old a-tree to a p-tree created by Alpino (https://github.com/ufal/treex/blob/master/lib/Treex/Tool/Alpino/Parser.pm)
returns
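For the old-a-tree-to-p-tree direction, one possible shape is sketched below, again in Python with plain dicts and not the Treex API. It assumes the wild attributes were saved in token order before parsing, and that the p-tree terminals come out in surface order. Real Alpino output may not satisfy the second assumption. The names `terminals` and `attach_wild` are hypothetical. If the counts differ, nothing is copied.

```python
def terminals(node):
    """Yield the terminal nodes of a nested phrase tree, left to right.

    A node is a dict; a node without a non-empty 'children' list is
    treated as a terminal (an assumption of this sketch).
    """
    children = node.get("children", [])
    if not children:
        yield node
        return
    for child in children:
        yield from terminals(child)


def attach_wild(p_root, saved_wild):
    """Attach saved wild dicts to p-tree terminals by position.

    saved_wild is a list of dicts, one per token of the old flat a-tree.
    Returns False and copies nothing if the counts do not match.
    """
    leaves = list(terminals(p_root))
    if len(leaves) != len(saved_wild):
        return False
    for leaf, wild in zip(leaves, saved_wild):
        leaf.setdefault("wild", {}).update(wild)
    return True
```

Refusing to copy on a count mismatch keeps attributes from landing on the wrong token. A looser fallback would have to align the tokens by form instead of by position.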