allenai / taggers

Easily identify and label sentence intervals using various taggers.
Apache License 2.0

Refer to items in Kleene star #80

Open sbhaktha opened 10 years ago

sbhaktha commented 10 years ago

In the pattern language there is no way to refer to individual components tagged together with a Kleene star or plus.

Let's say I write a rule to identify a comma-separated VP cluster, like:

VPCluster := TypedOpenRegex {
   (?: (<FirstVP>: @VP) (?: (?:<string=","> (<MoreVPs>: @VP))* <string=",">? <string=/and|or|and\/or/> (<LastVP>: @VP) )? )
}

If I have another pattern like

Pattern := TypedOpenRegex {
   (?: @NP (<Facts>: @VPCluster))
}

And in my output I want to be able to break down the VPCluster into individual VPs and present the output as, say:

<NP> <VP1>
<NP> <VP2>
<NP> <VP3>
...

There is no way to refer to the individual VPs tagged under <MoreVPs>. The only workaround is to assume a reasonable maximum number of VPs in a cluster and then tag each one separately as FirstVP, SecondVP, ThirdVP and so on.

It would be convenient if we could say something like ${x.Facts->VPCluster.MoreVPs[0]}, ${x.Facts->VPCluster.MoreVPs[1]} etc.
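
As an analogy only, the same shape of problem can be seen in Python's `re` module (this is not the taggers pattern language): a capture group under a star keeps just its last iteration, so the intermediate items are lost unless they are collected into a list. The input string and variable names below are made up for illustration.

```python
import re

# Analogy only: Python's re module, not the taggers pattern language.
# The cluster string is a hypothetical stand-in for a tagged VP cluster.
cluster = "vp1,vp2,vp3,vp4"

# A capture group under a star keeps only its final iteration.
m = re.fullmatch(r"(?P<FirstVP>\w+)(?:,(?P<MoreVPs>\w+))*", cluster)
first_vp = m.group("FirstVP")
last_seen = m.group("MoreVPs")

# Getting every iteration takes a second pass that returns a list,
# which is what indexed access such as MoreVPs[0] would rely on.
more_vps = re.findall(r",(\w+)", cluster)

print(first_vp)
print(last_seen)
print(more_vps[0], more_vps[1])
```

Here `last_seen` holds only the final repeated item, while `more_vps` holds all of them in order, which is roughly what an indexed reference into a starred group would have to expose.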

schmmd commented 10 years ago

Could you give me a definition of VPCluster that has MoreVPs in it? Also, it'd be helpful if you'd use backticks around code, especially since you are actually tagging GitHub users when you refer to VP and NP, and they are getting emails.

sbhaktha commented 10 years ago

OMG!! I tagged people haha.

sbhaktha commented 10 years ago

@schmmd Here is an example of a VPCluster with MoreVPs: Yesterday I rode the bus, went to the library, checked out some books and rode back home.

'rode the bus' would be FirstVP, 'rode back home' would be LastVP, and 'went to the library' and 'checked out some books' would be MoreVPs.
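
Given that labelling, the one-row-per-VP output asked for earlier could be sketched as follows. This is plain Python for illustration; `fan_out` and the variable names are hypothetical and not part of taggers, and the NP is assumed to be "I".

```python
# Hypothetical post-processing sketch; none of these names come from taggers.
np = "I"  # assumed subject NP of the example sentence
first_vp = "rode the bus"
more_vps = ["went to the library", "checked out some books"]
last_vp = "rode back home"


def fan_out(np, first_vp, more_vps, last_vp):
    """Pair the NP with every VP in the cluster, in sentence order."""
    return [(np, vp) for vp in [first_vp, *more_vps, last_vp]]


rows = fan_out(np, first_vp, more_vps, last_vp)
for subject, vp in rows:
    print(f"{subject} | {vp}")
```

The point of the sketch is that `more_vps` has to be a list of arbitrary length; with only a single MoreVPs capture, the middle rows could not be produced.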

Does this help? You asked for a "definition" of VPCluster that has MoreVPs; did you mean an example? Let me know if you need more info.