pombreda / txt2tags

Automatically exported from code.google.com/p/txt2tags
GNU General Public License v2.0

v3: Consider using xml.etree.ElementTree #101

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
http://docs.python.org/library/xml.etree.elementtree.html
http://effbot.org/zone/element.htm

This is a nice library, new in Python 2.5, for storing data in a tree, similar
to the DOM.

It would be nice if the txt2tags processing evolved from the current flow:

- read source line by line
- for each line, match blocks
- if matched:
  - inside the block lines, match inline marks
  - escape everything
  - dump tagged block
- repeat until EOF

to:

- read source line by line (or char by char)
- match blocks and save to etree
- inside the block lines, match inline marks and save to etree
- repeat until EOF
- close input file, we won't need it anymore
- now scan the etree from the start to do the escaping and tagging
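
The parsing half of the proposed flow could be sketched roughly as follows. This is only an illustration, not txt2tags code: the two regexes cover just level-1 titles and bold marks, and the element names (`document`, `title`, `para`, `bold`) and the helpers `add_inline` and `parse` are made up for the example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified patterns; real txt2tags markup has many more rules.
TITLE_RE = re.compile(r"^=\s+([^=]+?)\s+=$")  # level-1 title only: "= Title ="
BOLD_RE = re.compile(r"\*\*(.+?)\*\*")        # inline bold: "**text**"


def add_inline(parent, text):
    """Store `text` under `parent`, turning **bold** marks into <bold> children."""
    parent.text = ""
    last = None  # most recently added inline child, if any
    pos = 0
    for m in BOLD_RE.finditer(text):
        plain = text[pos:m.start()]
        if last is None:
            parent.text += plain  # text before the first child
        else:
            last.tail = plain     # text between two children
        last = ET.SubElement(parent, "bold")
        last.text = m.group(1)
        pos = m.end()
    rest = text[pos:]
    if last is None:
        parent.text += rest
    else:
        last.tail = rest


def parse(lines):
    """Read the source line by line and build the tree. No escaping, no tagging."""
    root = ET.Element("document")
    para = []  # lines of the paragraph currently being collected

    def flush():
        # Close the pending paragraph block, if there is one.
        if para:
            add_inline(ET.SubElement(root, "para"), " ".join(para))
            del para[:]

    for line in lines:
        line = line.rstrip("\n")
        m = TITLE_RE.match(line)
        if m:
            flush()
            add_inline(ET.SubElement(root, "title"), m.group(1))
        elif not line.strip():
            flush()  # a blank line ends the paragraph
        else:
            para.append(line.strip())
    flush()
    return root


tree = parse(["= Hello =", "", "Some **bold** text", "and <more>."])
```

Note that the `<more>` in the source is stored untouched in the tree; escaping is left entirely to the later scan.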

In other words, separate the two processes that are mixed together today:
- read, parse and close input file
- convert data to target (escape and add tags)

Then, between the two steps, we can dump an XML representation of the etree to
spot bugs and see exactly how txt2tags parsed the source file. Each target could
be a class that knows only about the etree, not about txt2tags markup.
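
A rough sketch of what the XML dump and a target class might look like, assuming Python 3 (for `encoding="unicode"`). The `HtmlTarget` class, its tag mapping and the element names are hypothetical; the tree is built by hand here to stand in for the parser's output.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape  # escapes &, < and >


class HtmlTarget:
    """Hypothetical target: knows the tree's element names, not txt2tags markup."""

    tags = {"title": "h1", "para": "p", "bold": "b"}

    def render(self, node):
        # Escape text while walking, so the parser never has to care about it.
        inner = escape(node.text or "")
        for child in node:
            inner += self.render(child) + escape(child.tail or "")
        tag = self.tags.get(node.tag)
        if tag is None:  # container such as <document>: emit the children only
            return inner
        return "<%s>%s</%s>" % (tag, inner, tag)


# A tree as the parser might have produced it (element names are assumptions).
doc = ET.Element("document")
ET.SubElement(doc, "title").text = "Hello"
para = ET.SubElement(doc, "para")
para.text = "1 < 2 is "
bold = ET.SubElement(para, "bold")
bold.text = "true"
bold.tail = "."

# Debug view: shows exactly how the source was parsed.
debug_xml = ET.tostring(doc, encoding="unicode")

# Conversion to one target.
html = HtmlTarget().render(doc)
```

Another target would be another small class with its own mapping and escaping rules, reusing the same tree.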

Ideas and suggestions are very welcome.

Original issue reported on code.google.com by aureliojargas@gmail.com on 10 Dec 2010 at 11:16

GoogleCodeExporter commented 9 years ago
Jason mentioned python-markdown as a Python app similar to ours that uses etree.

http://www.freewisdom.org/projects/python-markdown/

There's also a fork/rewrite that claims to be cleaner:
http://code.google.com/p/python-markdown2/

Original comment by aureliojargas@gmail.com on 15 Dec 2010 at 4:29