mikix closed this 1 year ago
Oh I see - I think this is just an indentation setting issue. Extended (the default) yields the result I saw. Standard/strict do not.
I guess I'll close this as user-misunderstanding. Thanks!
Just in case someone wonders how to apply the strict setting to the example above:
```python
from inscriptis import get_text
from inscriptis.css_profiles import CSS_PROFILES
from inscriptis.model.config import ParserConfig

# Use a copy of the strict CSS profile instead of the default one.
config = ParserConfig(css=CSS_PROFILES['strict'].copy())
text = get_text('fi<span>r</span>st', config)
print(text)
```
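For anyone curious why the setting changes the output at all, here is a toy model using only the standard library. It is not Inscriptis's code. It assumes, based on the behaviour described in this thread, that the extended setting adds whitespace around inline elements such as `span` while strict does not. The names `ToyTextExtractor`, `toy_get_text` and `pad_inline` are made up for this illustration.

```python
from html.parser import HTMLParser


class ToyTextExtractor(HTMLParser):
    """Toy HTML-to-text converter (hypothetical, NOT Inscriptis's implementation)."""

    def __init__(self, pad_inline):
        super().__init__()
        self.pad_inline = pad_inline  # assumption: models "extended" vs "strict"
        self.parts = []

    def _boundary(self, tag):
        # When padding is enabled, emit a space at every <span> boundary.
        if self.pad_inline and tag == 'span':
            self.parts.append(' ')

    def handle_starttag(self, tag, attrs):
        self._boundary(tag)

    def handle_endtag(self, tag):
        self._boundary(tag)

    def handle_data(self, data):
        self.parts.append(data)


def toy_get_text(html, pad_inline):
    parser = ToyTextExtractor(pad_inline)
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return ''.join(parser.parts)


print(repr(toy_get_text('fi<span>r</span>st', pad_inline=True)))   # 'fi r st'
print(repr(toy_get_text('fi<span>r</span>st', pad_inline=False)))  # 'first'
```

With padding the word is split, without it the text comes out as one word, which is what a browser shows.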
I would expect those two examples to match (and look like the `<b>` example, where it's one word, as that's what a browser shows).

Inscriptis seems to work really well though! Thanks for this software.