mariarahat / bungeni-editor

Automatically exported from code.google.com/p/bungeni-editor

Extract track changes #60

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
The current method of extracting track changes through the UNO API is slow and
expensive. Scalability needs to be tested against a large number of documents.

To Do --

investigate alternatives -- 

1) XSLT
2) Using ODFDOM
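
As a rough illustration of what bypassing UNO could look like, the sketch below reads `content.xml` straight from the ODF zip package and summarises each `text:changed-region`. It is written in Python with the standard library only, not with XSLT or ODFDOM, so it shows the idea rather than either listed alternative. The helper name `list_changed_regions` and the returned dictionary layout are assumptions made for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OpenDocument specification.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}
TEXT_ID = "{%s}id" % NS["text"]
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def list_changed_regions(odt):
    """Hypothetical helper: summarise each text:changed-region in content.xml.

    `odt` is a path or a binary file-like object holding an ODF package.
    """
    with zipfile.ZipFile(odt) as package:
        root = ET.fromstring(package.read("content.xml"))

    regions = []
    for region in root.iter("{%s}changed-region" % NS["text"]):
        if len(region) == 0:
            continue  # empty region: nothing to describe
        # The first child names the kind of change
        # (text:insertion, text:deletion or text:format-change).
        change = region[0]
        info = change.find("office:change-info", NS)
        regions.append({
            # Fall back to xml:id if text:id is absent.
            "id": region.get(TEXT_ID) or region.get(XML_ID),
            "kind": change.tag.rpartition("}")[2],
            "creator": info.findtext("dc:creator", namespaces=NS) if info is not None else None,
            "date": info.findtext("dc:date", namespaces=NS) if info is not None else None,
        })
    return regions
```

Because only the zip entry is parsed, no office process has to be started per document, which is the cost the UNO route pays. Whether this is fast enough for a large batch would still need measuring.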

Original issue reported on code.google.com by ashok.ha...@gmail.com on 8 Feb 2010 at 12:37

GoogleCodeExporter commented 9 years ago
ODFDOM 0.7.5 has an issue loading the DOM:
<http://odftoolkit.org/forums/ODFDOM/topics/91-Error-loading-document>

Original comment by ashok.ha...@gmail.com on 10 Feb 2010 at 6:41

GoogleCodeExporter commented 9 years ago
Re: comment #1 - attempting with trunk build of odfdom

Original comment by ashok.ha...@gmail.com on 10 Feb 2010 at 6:46

GoogleCodeExporter commented 9 years ago
The trunk build of ODFDOM applies ODF 1.2-style checking of prefix and namespace.
Fixed how Bungeni Editor sets the prefix and namespace (see Issue 60).

Original comment by ashok.ha...@gmail.com on 10 Feb 2010 at 2:42

GoogleCodeExporter commented 9 years ago
Switch to ODFDOM trunk from ODFDOM 0.7.5.

The trunk version (rev 34) supports accessing the metadata as a DOM via
getMetaDom(), instead of more complex external parsing.

TO DO:

Extract track changes via ODFDOM

Original comment by ashok.ha...@gmail.com on 11 Feb 2010 at 1:28

GoogleCodeExporter commented 9 years ago
How to get the inserted text from change markings?

Insert change markings use separate start and end marker elements linked by an id:

<text:change-start text:change-id="xyz" />
<text:change-end text:change-id="xyz" />

These are placed arbitrarily in the ODF hierarchy, depending on where the change
occurred.
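
Since the two markers can sit in different paragraphs, one way to pair them is to walk the tree in document order and collect every piece of text seen while a change id is "open". The following is a minimal Python sketch of that idea using only the standard library. The function names `events` and `inserted_text` are made up for this example, and it assumes the linking attribute is `text:change-id` as in the ODF text namespace. Format changes are also marked with start/end pairs, so a real implementation would still need to check the change type for each id.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
START = "{%s}change-start" % TEXT_NS
END = "{%s}change-end" % TEXT_NS
CHANGE_ID = "{%s}change-id" % TEXT_NS


def events(elem):
    """Yield ("elem", element) and ("text", string) pairs in document order."""
    yield ("elem", elem)
    if elem.text:
        yield ("text", elem.text)
    for child in elem:
        yield from events(child)
        # Text that follows the child's closing tag belongs here in document order.
        if child.tail:
            yield ("text", child.tail)


def inserted_text(root):
    """Map each change id to the text found between its start and end markers."""
    open_ids = set()
    collected = {}
    for kind, value in events(root):
        if kind == "elem":
            if value.tag == START:
                change_id = value.get(CHANGE_ID)
                open_ids.add(change_id)
                collected.setdefault(change_id, [])
            elif value.tag == END:
                open_ids.discard(value.get(CHANGE_ID))
        else:
            # A text node counts towards every change that is currently open.
            for change_id in open_ids:
                collected[change_id].append(value)
    return {change_id: "".join(parts) for change_id, parts in collected.items()}
```

The walk visits each node once, so all change ids in a document are handled in a single pass regardless of where the markers fall in the hierarchy.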

Original comment by ashok.ha...@gmail.com on 17 Feb 2010 at 2:26

GoogleCodeExporter commented 9 years ago
The following appears to match the inserted text parts, at least for a couple of
test case documents:

//text:change-start[@text:change-id='ct472232592']/following::*[@text:change-id='ct472232592'][1]/following::text()

to do :

test with more document examples

Original comment by ashok.ha...@gmail.com on 18 Feb 2010 at 7:28

GoogleCodeExporter commented 9 years ago
After testing, this is the correct expression:

//text:change-start[@text:change-id='ct-1413048760']/following::text() except
//text:change-start[@text:change-id='ct-1413048760']/following::*[@text:change-id='ct-1413048760'][1]/following::text()

Original comment by ashok.ha...@gmail.com on 18 Feb 2010 at 8:34

GoogleCodeExporter commented 9 years ago
The solution in comment #7 uses XPath 2 syntax. In XPath 1, the following works:

//text:change-start[@text:change-id='ct716683728']/following::text()[not(preceding::text:change-end[@text:change-id='ct716683728'])]
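
To see why the XPath 1 predicate selects the same nodes as the XPath 2 `except` form, it can help to model the document as a flat list of tokens in document order. This is a simplification that holds here because the start and end markers are empty elements. The Python sketch below is illustrative only: the token layout and the function names are assumptions for the example, not part of any XPath library.

```python
def following_text(tokens, index):
    """Positions of text tokens after `index` (the following::text() axis)."""
    return {i for i in range(index + 1, len(tokens)) if tokens[i][0] == "text"}


def via_except(tokens, change_id):
    """XPath 2 form: text following the start, except text following the end."""
    start = tokens.index(("start", change_id))
    end = tokens.index(("end", change_id))
    return sorted(following_text(tokens, start) - following_text(tokens, end))


def via_not_preceding(tokens, change_id):
    """XPath 1 form: text following the start with no matching end before it."""
    start = tokens.index(("start", change_id))
    keep = []
    for i in sorted(following_text(tokens, start)):
        # Mirrors not(preceding::text:change-end[@text:change-id=...]).
        if ("end", change_id) not in tokens[:i]:
            keep.append(i)
    return keep


# Document order for: Hello [start]brave new[end] world
tokens = [
    ("text", "Hello "),
    ("start", "ct1"),
    ("text", "brave "),
    ("text", "new"),
    ("end", "ct1"),
    ("text", " world"),
]
```

Both functions pick the text tokens at positions 2 and 3, that is, the text between the two markers. The set difference removes everything after the end marker, and the predicate rejects exactly the same nodes because each of them has the end marker somewhere before it.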

Original comment by ashok.ha...@gmail.com on 18 Feb 2010 at 1:32

GoogleCodeExporter commented 9 years ago

Original comment by ashok.ha...@gmail.com on 25 Feb 2010 at 9:27