LuteOrg / lute-v3

LUTE = Learning Using Texts: learn languages through reading. Python/Flask.
MIT License

Words per page ideas: a) parse text length before import; b) target number of pages #470

Open jamesdeluk opened 1 month ago

jamesdeluk commented 1 month ago

Currently you set the number of words per page.

Two ideas for improving this:

a) Parse the number of words in the Text box or file before having to create the book. This overcomes the annoyance of making a two-page book with 250 words on page 1 and 1 word on page 2. To avoid slowing the application, perhaps have it as an optional button on the New Book page.

b) The ability to set a target number of pages instead of words per page. If I have a 1500-word document and I say I want 3 pages, Lute would automatically calculate 1500/3 = 500 words per page.
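
As a rough illustration of idea b), the calculation could look like the sketch below. The function name `words_per_page` is hypothetical and not part of Lute; it rounds up so the text never spills onto an extra page.

```python
def words_per_page(total_words: int, target_pages: int) -> int:
    """Hypothetical helper: per-page word limit that fits the text into target_pages."""
    if total_words < 0:
        raise ValueError("total_words must not be negative")
    if target_pages < 1:
        raise ValueError("target_pages must be at least 1")
    # Integer ceiling division: round up so no words are left for an extra page.
    return -(-total_words // target_pages)


print(words_per_page(1500, 3))  # 500, matching the 1500/3 example above
```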

jzohrab commented 1 month ago

Hi, thanks for the note.

An aside: Lute actually splits pages a bit differently: it splits pages by sentence, getting as close to the token count (word count) as possible. You'll never have a page with just one word, but you might have a page with just one sentence. So your question still stands.
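
To make the sentence-based splitting concrete, here is a minimal sketch of the idea. It is not Lute's actual code: the name `split_into_pages`, the regex-based sentence split, and the whitespace word count are all simplifying assumptions (Lute has its own per-language parsing).

```python
import re
from typing import List


def split_into_pages(text: str, target_words: int) -> List[str]:
    """Hypothetical sketch: group whole sentences into pages near target_words."""
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    pages: List[str] = []
    current: List[str] = []
    count = 0
    for sentence in sentences:
        n = len(sentence.split())
        # Close the page if adding this sentence would land further from
        # the target than stopping here. Sentences are never cut in half.
        if current and abs(count + n - target_words) > abs(count - target_words):
            pages.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        pages.append(" ".join(current))
    return pages


sample = "One two three. Four five six. Seven eight nine. Ten."
for page in split_into_pages(sample, target_words=6):
    print(page)
```

With this approach the smallest possible page is one sentence, never a stray word cut off from its sentence.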

There's an existing issue, https://github.com/LuteOrg/lute-v3/issues/356, which would address issue a) well, I think. Let me know if that works for you.

Target number of pages -- I think we should skip this idea because it's too easy to get wrong -- e.g. I import a 100 page book and I say I want it to be 10 pages long. That could get nasty, kill performance etc.

@jamesdeluk LMK if #356 covers your need, and if you're ok with closing this issue. Cheers!

jamesdeluk commented 1 month ago

> Lute actually splits pages a bit differently: it splits pages by sentence, getting as close to the token count (word count) as possible. You'll never have a page with just one word, but you might have a page with just one sentence. So your question still stands.

That makes sense! I probably should have realised that already.

> There's an existing issue, https://github.com/LuteOrg/lute-v3/issues/356, which would address issue a) well, I think. Let me know if that works for you.

I feel that's slightly different; my understanding is that it prevents short final pages, so if I have the max at 200 words, the last page could be up to 280 words (200 + 40%).
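
If that reading of #356 is right, the behaviour might look roughly like the sketch below. This is only a guess at the idea, not what #356 specifies: the function name `merge_short_last_page` and the `tolerance_pct` parameter are hypothetical, and the 40% figure is taken from my example above.

```python
from typing import List


def merge_short_last_page(page_word_counts: List[int], max_words: int,
                          tolerance_pct: int = 40) -> List[int]:
    """Hypothetical sketch: fold a short final page into the previous one."""
    counts = list(page_word_counts)
    # Largest size the final page may grow to, e.g. 200 + 40% = 280.
    limit = max_words * (100 + tolerance_pct) // 100
    if len(counts) >= 2 and counts[-2] + counts[-1] <= limit:
        counts[-2:] = [counts[-2] + counts[-1]]
    return counts


print(merge_short_last_page([200, 200, 80], max_words=200))  # [200, 280]
```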

My idea is:

  1. Fill the Text box, or attach a document
  2. [Optional] Click a 'word count' button, which tells the word count
  3. Set words per page or total pages

For example, if the word count is 700, I could set 350 words per page (or 2 pages). If it was 800, I could choose 200 (or 4 pages).

There could still be an upper limit on words per page when using total pages. For example, if the text were 20,000 words (e.g. a book) and the user asked for 1 page, you could alert the user with something like "minimum number of pages for this text is 10" (i.e. 2000 words/page), to avoid the performance issues.
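
A sketch of that check, assuming a cap of 2000 words per page as in the example. The names `minimum_pages` and `validate_target_pages` are hypothetical, and the cap is an illustrative number, not a real Lute setting.

```python
def minimum_pages(total_words: int, max_words_per_page: int = 2000) -> int:
    """Hypothetical helper: fewest pages allowed under the per-page cap."""
    # Ceiling division, with a floor of one page for empty or tiny texts.
    return max(1, -(-total_words // max_words_per_page))


def validate_target_pages(total_words: int, requested_pages: int,
                          max_words_per_page: int = 2000) -> int:
    """Return requested_pages if allowed, otherwise raise with the alert text."""
    minimum = minimum_pages(total_words, max_words_per_page)
    if requested_pages < minimum:
        raise ValueError(f"Minimum number of pages for this text is {minimum}")
    return requested_pages


try:
    validate_target_pages(20_000, requested_pages=1)
except ValueError as err:
    print(err)  # Minimum number of pages for this text is 10
```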

Personally, I'd prefer the 'word count' function more than 'total pages', because I can manually calculate the latter from the former pretty easily, and it avoids the concerns you have about one-page novels.

Here's a userscript I (i.e. an LLM) built that adds the functionality: https://greasyfork.org/en/scripts/505086-lute-word-count

However, it only works for the textarea, not a file.