bskinn / pent

pent Extracts Numerical Text -- Mini-language driven parser for structured numerical data in text
MIT License

Brainstorming OneOrMore and ZeroOrMore tokens #27

Closed by bskinn 6 years ago

bskinn commented 6 years ago

Initially implement with required whitespace between the regex repeats. Whitespace-only separation definitely makes sense for numerics. It may be less valuable for strings, where, e.g., headers like ---===---===---=== or some crazy thing might be in use. An 'any' capture would probably suffice for most instances of this, though... e.g., @x!.---=== ~! ... since it seems unlikely that granular capture would be needed at the point where pent would be involved.
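A rough sketch of what the whitespace-separated repeat could look like at the regex level. Everything here is illustrative: NUM is a stand-in pattern, not pent's actual number regex, and one_or_more / zero_or_more are hypothetical helper names, not existing pent functions.

```python
import re

# Stand-in pattern for one integer or simple decimal (an assumption,
# NOT pent's real number regex).
NUM = r"[-+]?\d+(?:\.\d+)?"


def one_or_more(item):
    """Hypothetical helper: `item` repeated 1+ times, whitespace required between repeats."""
    return rf"{item}(?:[ \t]+{item})*"


def zero_or_more(item):
    """Hypothetical helper: same as one_or_more, but the whole run is optional."""
    return rf"(?:{one_or_more(item)})?"


line = "Energies: 1.5 -2 3.25 end"
run = re.search(one_or_more(NUM), line)
values = run.group(0).split()  # the whitespace-delimited numbers in the run

# The ZeroOrMore form also accepts an empty run.
empty_ok = re.fullmatch(zero_or_more(NUM), "") is not None
```

Here the run stops at the last number because the trailing whitespace is only consumed when another NUM follows it.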

The returned chunk must NOT have word boundaries at its start or end, though, since a preceding 'no-space' token would want to run on directly into it. Also, no-space should be a valid option for these quantity modes (OneOrMore, at least), since a run of multiple values could easily be followed immediately by some enclosure or other punctuation.
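To illustrate the word-boundary concern, a minimal sketch using plain re. As before, NUM and one_or_more are hypothetical stand-ins rather than pent internals, and the bounded flag exists only to show what goes wrong when \b is wrapped around the chunk.

```python
import re

# Stand-in pattern for one integer or simple decimal (an assumption,
# NOT pent's real number regex).
NUM = r"[-+]?\d+(?:\.\d+)?"


def one_or_more(item, bounded=False):
    """Hypothetical helper: whitespace-separated run, optionally wrapped in \\b...\\b."""
    chunk = rf"{item}(?:[ \t]+{item})*"
    return rf"\b{chunk}\b" if bounded else chunk


# A run immediately enclosed by punctuation, with no space on either side.
text = "(-1 2.5 3)"
enclosed = re.search(r"\(" + one_or_more(NUM) + r"\)", text)
# With \b at the chunk start there is no word boundary between "(" and "-",
# so the bounded variant finds nothing.
bounded_enclosed = re.search(r"\(" + one_or_more(NUM, bounded=True) + r"\)", text)

# A preceding no-space literal that runs straight into the first value.
run_on = re.search(r"x" + one_or_more(NUM), "x12 13")
# "x" and "1" are both word characters, so \b fails here as well.
bounded_run_on = re.search(r"x" + one_or_more(NUM, bounded=True), "x12 13")
```

Both bounded searches return None, while the unbounded chunk matches in each case, which is the argument for leaving boundary handling to whatever assembles the tokens.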

SEE ALSO #24