leondz / cavat

Automatically exported from code.google.com/p/cavat

Sentence mis-alignment on import #62

GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
This occurs on any import. For example,

 Not that long ago, <SIGNAL sid="s13">before</SIGNAL> the Chinese <EVENT eid="e12" class="OCCURRENCE">takeover</EVENT>, the <EVENT eid="e14" class="OCCURRENCE">news</EVENT> about real estate
 here was that the sky was the limit the highest prices in the world. So
 <SIGNAL sid="s15">when</SIGNAL> Wong Kwan
 <EVENT eid="e16" class="OCCURRENCE">spent</EVENT> seventy million dollars for this house, he <EVENT eid="e17" class="I_STATE">thought</EVENT> it was a
 great <EVENT eid="e18" class="OCCURRENCE">deal</EVENT>. 

lists s13 as the 8th token in its sentence, and s15 as the first word in the 
next sentence.

Original issue reported on code.google.com by leonderczynski on 21 Jun 2010 at 2:48

GoogleCodeExporter commented 9 years ago
Ended at sentence 7 word 18
|. Here's ABC's Jim Laurie.|
Ended at sentence 9 word 4
|
|
Ended at sentence 9 word 4
|
|
Ended at sentence 9 word 4
|
|
Ended at sentence 9 word 4
| Not that long ago, |
Ended at sentence 9 word 8

We in fact ended at s9 w0 on the top line; the length of s8 is mistakenly 
included in s9.
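
A minimal sketch of the intended behaviour, not CAVaT's actual import code: the word counter has to be reset whenever a sentence boundary is crossed, otherwise the previous sentence's length leaks into the next one. The function name `locate_tokens` and the punctuation-based sentence test are assumptions made for illustration only.

```python
import re

# Hypothetical, simplified sentence-boundary test: a token that ends in
# '.', '!' or '?' closes the current sentence. Real splitting is more involved.
SENTENCE_END = re.compile(r"[.!?]$")


def locate_tokens(tokens):
    """Return a (sentence, word) offset for each token, both counted from 0."""
    positions = []
    sentence = 0
    word = 0
    for token in tokens:
        positions.append((sentence, word))
        word += 1
        if SENTENCE_END.search(token):
            # Sentence boundary: advance the sentence index and reset the
            # word counter so the old sentence's length is not carried over.
            sentence += 1
            word = 0
    return positions


tokens = "Here's ABC's Jim Laurie. Not that long ago, before the Chinese takeover".split()
positions = locate_tokens(tokens)
print(positions[tokens.index("before")])
```

Without the `word = 0` reset, "before" would be reported at word 8 (the four words of the previous sentence plus the four words that precede it in its own sentence), which matches the symptom described above.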

Original comment by leonderczynski on 21 Jun 2010 at 2:59

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r85.

Original comment by leonderczynski on 21 Jun 2010 at 3:06

GoogleCodeExporter commented 9 years ago
Now reports correctly; sentence and word offsets are counted from 0.

Original comment by leonderczynski on 21 Jun 2010 at 3:06

GoogleCodeExporter commented 9 years ago
This issue was closed by revision 7e18e09145ce.

Original comment by leonderczynski on 19 Jun 2011 at 6:05