simonw / strip-tags

CLI tool for stripping tags from HTML
Apache License 2.0
209 stars 6 forks

Option to truncate output #2

simonw closed 1 year ago

simonw commented 1 year ago

I want to use this tool to pipe content into llm, so I'd like to be able to truncate the output to stay within the token limit.

simonw commented 1 year ago

There are three ways I could do this:

The "tokens" one would be most useful for working with LLMs, but for which tokenizer? Maybe that should be a separate tool.

simonw commented 1 year ago

On that basis, I'm not going to do this with this tool - I'll build something else I can pipe through instead.
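
A minimal sketch of what such a separate pipe-through filter could look like, not the tool that was actually built. The names `truncate` and `whitespace_tokenize` are hypothetical. The whitespace splitter is only a stand-in assumption for a real tokenizer: which tokenizer to use is exactly the open question above, so it is passed in as a parameter.

```python
import re
import sys
from typing import Callable, List


def whitespace_tokenize(text: str) -> List[str]:
    # Stand-in tokenizer (an assumption, not a real LLM tokenizer):
    # each token is one word together with the whitespace before it,
    # so joining the tokens reproduces the original spacing.
    return re.findall(r"\s*\S+", text)


def truncate(
    text: str,
    limit: int,
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
) -> str:
    # Keep at most `limit` tokens and join them back into a string.
    if limit <= 0:
        return ""
    return "".join(tokenize(text)[:limit])


# Only act as a stdin-to-stdout filter when a limit is given as the
# single command-line argument.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.stdout.write(truncate(sys.stdin.read(), int(sys.argv[1])))
```

Saved under a hypothetical name such as truncate.py, this could sit between the two tools in a pipeline, for example `strip-tags < page.html | python truncate.py 2000 | llm "summarize this"`. Word counts only approximate model token counts, so a real version would swap in the tokenizer of the target model.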