Open gabriel-weaver opened 12 years ago
Lightweight Structure In Text.
Pattern matching is heavily used for searching, filtering, and transforming text, but existing pattern languages offer few opportunities for reuse. Lightweight structure is a new approach that solves the reuse problem. Lightweight structure has three parts: a model of text structure as contiguous segments of text, or regions; an extensible library of structure abstractions (e.g., HTML elements, Java expressions, or English sentences) that can be implemented by any kind of pattern or parser; and a region algebra for composing and reusing structure abstractions. Lightweight structure does for text pattern matching what procedure abstraction does for programming, enabling construction of a reusable library.
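To make the three parts concrete, here is a minimal Python sketch of the idea. It is not LAPIS code or LAPIS pattern syntax. The names `Region`, `from_regex`, `number`, `inside`, `containing`, and `extract` are hypothetical and chosen only for illustration. A region is modeled as a pair of character offsets, a structure abstraction is any function from text to a list of regions (one below is a regular expression, the other a hand-written scanner), and two algebra operators compose abstractions without caring how each was implemented.

```python
import re
from typing import Callable, List, NamedTuple


class Region(NamedTuple):
    """A contiguous segment of text, as [start, end) character offsets."""
    start: int
    end: int


# A structure abstraction: any function that maps text to the regions it finds.
Abstraction = Callable[[str], List[Region]]


def from_regex(pattern: str) -> Abstraction:
    """Build an abstraction backed by a regular expression."""
    compiled = re.compile(pattern)

    def find(text: str) -> List[Region]:
        return [Region(m.start(), m.end()) for m in compiled.finditer(text)]

    return find


def number(text: str) -> List[Region]:
    """An abstraction backed by a hand-written scanner instead of a regex."""
    regions, i = [], 0
    while i < len(text):
        if text[i].isdigit():
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            regions.append(Region(i, j))
            i = j
        else:
            i += 1
    return regions


def inside(inner: Abstraction, outer: Abstraction) -> Abstraction:
    """Regions of `inner` that lie within some region of `outer`."""
    def find(text: str) -> List[Region]:
        outers = outer(text)
        return [r for r in inner(text)
                if any(o.start <= r.start and r.end <= o.end for o in outers)]
    return find


def containing(outer: Abstraction, inner: Abstraction) -> Abstraction:
    """Regions of `outer` that enclose at least one region of `inner`."""
    def find(text: str) -> List[Region]:
        inners = inner(text)
        return [o for o in outer(text)
                if any(o.start <= r.start and r.end <= o.end for r in inners)]
    return find


def extract(text: str, regions: List[Region]) -> List[str]:
    """Return the substrings covered by the given regions."""
    return [text[r.start:r.end] for r in regions]


# A rough, illustrative notion of "sentence"; real sentence detection is harder.
sentence = from_regex(r"[^.!?\s][^.!?]*[.!?]")

# Reuse: new abstractions are built from existing ones by composition.
sentence_with_number = containing(sentence, number)
number_in_sentence = inside(number, sentence)

text = "Order 66 shipped. No digits here. Call 911!"
print(extract(text, sentence_with_number(text)))
print(extract(text, number_in_sentence(text)))
```

Because `inside` and `containing` only see lists of regions, either argument could be replaced by a full parser (for HTML elements or Java expressions, say) without changing the composed definitions. That is the reuse property the abstract describes.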
Lightweight structure has been implemented in LAPIS, a web browser/text editor that demonstrates several novel techniques.
Coccinelle
Coccinelle: A program matching and transformation tool for systems code, 2011. Retrieved November 11, 2011 from http://coccinelle.lip6.fr/.
TXR: a Pattern Matching Language (Not Just) for Convenient Text Extraction
Suggested by Kaz Kylheku on Slashdot.