Closed simonw closed 2 years ago
I ran this against columns containing HTML and got results like this:
Add a --strip-tags option that removes HTML tags before sending text to the API. It can use a simple regular expression.
    import re

    # Non-greedy match of anything between "<" and the next ">"
    tag_re = re.compile("<.*?>")

    def strip_tags(s):
        return tag_re.sub("", s)
Or I might borrow strip_tags() from Django, since it's better tested against more cases:
https://github.com/django/django/blob/f8f16b3cd85599b464cbc5c7e884387940c24e6f/django/utils/html.py#L141-L182
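For comparison, here is a minimal sketch of a parser-based approach using only the standard library's html.parser. This is not Django's code and not necessarily what this tool ends up using; the names TagStripper and strip_tags_parsed are hypothetical, chosen for illustration. Unlike the regular expression, it also decodes character references such as &amp;amp;.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Hypothetical helper: keeps text content and discards tags."""

    def __init__(self):
        # convert_charrefs=True decodes entities like &amp; before
        # the text is passed to handle_data()
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for text between tags (including script/style contents)
        self.chunks.append(data)


def strip_tags_parsed(s):
    """Return s with HTML tags removed, using a real HTML parser."""
    parser = TagStripper()
    parser.feed(s)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.chunks)


if __name__ == "__main__":
    print(strip_tags_parsed("<p>Hello <b>world</b></p>"))
```

Note that this sketch keeps the text inside script and style elements, so it is a starting point rather than a sanitizer.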