rmusser01 / tldw

tl/dw (Too Long, Didn't Watch): Your Personal Research Multi-Tool - a naive attempt at 'A Young Lady's Illustrated Primer'
Apache License 2.0

Enhancement: Add user-defined time-based chunking for summarization #33

Closed rmusser01 closed 5 months ago

rmusser01 commented 5 months ago

As a user, I would like to be able to specify time blocks, which are cut out of the transcription and summarized individually. The per-block summaries are then either strung together into one combined summary or re-summarized together, either recursively or as input to a single prompt.
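A rough sketch of the flow described above, not the project's actual implementation. It assumes the transcription is a list of segments shaped like `{"start": <seconds>, "text": <str>}` (a Whisper-style layout, which is an assumption here), and that `summarize` is whatever callable sends text to the summarization backend. The names `chunk_by_time` and `summarize_in_blocks` are hypothetical.

```python
def chunk_by_time(segments, block_seconds):
    """Group transcript segments into consecutive time blocks.

    Each segment is assigned to a block by its start time, so a segment
    that straddles a boundary stays whole in the block where it begins.
    Blocks that contain no speech are skipped.
    """
    if block_seconds <= 0:
        raise ValueError("block_seconds must be positive")
    blocks = {}
    for seg in segments:
        index = int(seg["start"] // block_seconds)
        blocks.setdefault(index, []).append(seg["text"].strip())
    return [" ".join(blocks[i]) for i in sorted(blocks)]


def summarize_in_blocks(segments, block_seconds, summarize, resummarize=False):
    """Summarize each time block, then combine the partial summaries.

    With resummarize=False the partial summaries are strung together.
    With resummarize=True the combined text is passed through
    `summarize` once more as a single prompt.
    """
    partials = [summarize(chunk) for chunk in chunk_by_time(segments, block_seconds)]
    combined = "\n\n".join(partials)
    return summarize(combined) if resummarize else combined
```

The recursive variant mentioned above would repeat the final step until the combined text fits the model's context, which this sketch leaves out.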

When using the CLI, I should be able to pass an argument so that summarization will occur based on time blocks of the transcription, and not based on the entirety of the original transcription.

- CLI arg: '--chunk-time' / '-ctime'

The resulting 'time chunks' should be user-definable and determined through the following:

'--time-count' / '-tc'  -  Time count of chunks (the length of time each chunk should cover)

If the '--chunk-time' / '-ctime' argument is passed but the '--time-count' / '-tc' argument is not, a default chunk length of X is assumed and used instead. (X is TBD; possibly a sliding scale based on the original video size, with a base time amount.)
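One way the two arguments could be wired up with `argparse`, as a sketch only. The flag names come from this issue; everything else is an assumption: `--time-count` is treated as an integer number of seconds, and because the default is still TBD, the fallback is passed in by the caller rather than hard-coded. `build_parser` and `resolve_time_count` are hypothetical names.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Time-based chunking options (sketch)")
    parser.add_argument(
        "--chunk-time", "-ctime",
        action="store_true",
        help="Summarize time-based chunks instead of the whole transcription",
    )
    parser.add_argument(
        "--time-count", "-tc",
        type=int,
        default=None,
        help="Length of each time chunk (assumed here to be in seconds)",
    )
    return parser


def resolve_time_count(args, fallback_seconds):
    """Return the chunk length to use, or None if time chunking is off.

    `fallback_seconds` stands in for the TBD default; the caller decides
    how to compute it (for example, from the video's duration).
    """
    if not args.chunk_time:
        return None
    if args.time_count is not None:
        return args.time_count
    return fallback_seconds
```

For example, parsing `-ctime -tc 120` yields a chunk length of 120, while `--chunk-time` alone falls back to whatever value the caller supplies.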