r1chardj0n3s / parse

Parse strings using a specification based on the Python format() syntax.
http://pypi.python.org/pypi/parse
MIT License
1.7k stars 100 forks

No empty {} placeholders #138

Open dejudicibus opened 2 years ago

dejudicibus commented 2 years ago

Let us suppose that I write

from parse import findall

pattern = ">{}<"
html = "Mr. <strong>George</strong> is a <em>beautiful</em> cat"
print(''.join(part[0] for part in findall(pattern, html)))

I get

George is a beautiful

To fix that I can write

pattern = ">{}<"
html = "<p>Mr. <strong>George</strong> is a <em>beautiful</em> cat</p>"
print(''.join(part[0] for part in findall(pattern, html)))

and now I get

Mr. George is a beautiful cat

However, if I write

pattern = ">{}<"
html = "<p><strong>George</strong> is a <em>beautiful</em> cat</p>"
print(''.join(part[0] for part in findall(pattern, html)))

I get

<strong>George is a beautiful cat

It looks like the {} placeholder cannot match an empty string.
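A minimal sketch of what is probably happening, using only the standard re module. It assumes that parse compiles an untyped {} into the non-greedy group (.+?), which needs at least one character; the name join_matches is made up for this illustration and is not part of parse.

```python
import re

# Assumption: parse turns an untyped {} into the non-greedy group (.+?),
# so at least one character must sit between ">" and "<".
PLACEHOLDER_RE = re.compile(r">(.+?)<", re.DOTALL)


def join_matches(html):
    """Concatenate every captured group, like the findall() loop above."""
    return "".join(PLACEHOLDER_RE.findall(html))


with_text = "<p>Mr. <strong>George</strong> is a <em>beautiful</em> cat</p>"
no_text = "<p><strong>George</strong> is a <em>beautiful</em> cat</p>"

print(join_matches(with_text))  # Mr. George is a beautiful cat
print(join_matches(no_text))    # <strong>George is a beautiful cat
```

In the second string nothing sits between <p> and <strong>, so the group cannot stop at the first "<" and keeps going until the next one, which pulls the <strong> tag into the captured text.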

sdementen commented 2 years ago

Would this achieve what you want?

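from parse import findall, with_pattern
pattern = ">{:opt}<"
html = "<p><strong>George</strong> is a <em>beautiful</em> cat</p>"
# "opt" is a custom type whose regex .*? can also match an empty string
opt = with_pattern(r'.*?')(lambda x: x)
print(''.join(part[0] for part in findall(pattern, html, extra_types=dict(opt=opt))))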

This outputs: George is a beautiful cat