bountonw / translate


PDFs for Lao books #430

Open mattleff opened 1 day ago

mattleff commented 1 day ago

We need to find a PDF generation solution for Lao books. This approach uses the Paged.js library to generate a PDF using web technologies. We need to work out a number of issues, most importantly how to handle line breaking consistently and automatically.

@bountonw Can you take a look at this PDF and let me know if:

  1. There are lines broken that should not be broken;
  2. There are lines that should be breaking that are not; or
  3. There are other critical content issues (incorrect characters, etc.).

GC01_lo.pdf

Obviously we need to do a lot more work to bring this PDF up to par with the existing Thai draft PDFs, but I want to validate that we can get the content flowing correctly before investing a lot of time in the other features.

bountonw commented 17 hours ago

@mattleff Wow. Thank you!!! I'm in Laos. Just finished my translating goal for the week. Tomorrow morning (Friday) I will document the issues in detail.

bountonw commented 4 hours ago

GC01_lo_commented.pdf

@mattleff Thank you so much!

I've commented on the first 6 pages. Many of my comments are not related to the question at hand, which is viability. The last word of many pages is split. These are simple words like ของ, แต่, กล่าว, and คน. One split fell in a complicated word, but that could be fixed with a hyphen if hyphenation is an option. This is a showstopper. Fatal. I have included the Thai equivalents of the words in question so that you can get an idea of what is happening. Hopefully this issue can be fixed; if not, we will need a different solution. The rest of the points are moot until this one is fixed.
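One possible workaround for the split words, sketched here in Python as a preprocessing step, is to glue the characters of each known word together with U+2060 WORD JOINER so that a renderer following the Unicode line-breaking rules has no break opportunity inside the word. This is only a sketch under assumptions: the function names `glue_word` and `protect_words` are hypothetical, the word list would have to come from a Lao dictionary or a segmenter, and whether the Paged.js pipeline honours the joiner at page boundaries has not been tested here.

```python
import unicodedata

WORD_JOINER = "\u2060"  # U+2060: forbids a line break at its position


def glue_word(word: str) -> str:
    """Join the characters of a word with WORD JOINER.

    No joiner is placed directly before a combining mark, so tone marks and
    above/below vowels stay attached to their base consonant.
    """
    out = [word[0]] if word else []
    for ch in word[1:]:
        if unicodedata.category(ch) not in ("Mn", "Mc", "Me"):
            out.append(WORD_JOINER)
        out.append(ch)
    return "".join(out)


def protect_words(text: str, words: list[str]) -> str:
    """Replace every occurrence of each listed word with its glued form."""
    # Longest words first, so a short word does not pre-empt a longer one.
    for word in sorted(set(words), key=len, reverse=True):
        text = text.replace(word, glue_word(word))
    return text


if __name__ == "__main__":
    # The Thai example words quoted in this thread, used as stand-ins.
    sample = "ของ แต่ กล่าว คน"
    protected = protect_words(sample, ["ของ", "แต่", "กล่าว", "คน"])
    print([hex(ord(c)) for c in protected])
```

Stripping the joiners recovers the original text, so the step is reversible. A CSS alternative with the same intent would be to wrap each word in an element styled with `white-space: nowrap`.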

There are ragged edges. They could be fixed with a combination of hyphenation and more frequent phrasal spaces. This should also help with the very large mid-line spaces.

Thessalonians is split on the last page. As I didn't go into detail after the first 6 pages, there may be other places where a word is split that is not the last word on the page. Thessalonians is not a standard Lao word, which may also be why it was split. It was split at a proper syllable boundary, unlike the problem mentioned above.
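If hyphenation turns out to be an option, long non-standard words such as Thessalonians could be given explicit break points with U+00AD SOFT HYPHEN, which browsers render as a hyphen only when a line actually breaks there. The sketch below is an assumption-laden illustration, not part of the existing pipeline: `add_soft_hyphens` is a hypothetical helper, the syllable lists would have to be supplied by a translator, and the Latin spelling stands in for the Lao spelling, which does not appear in this thread.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD: invisible unless a line breaks at this point


def add_soft_hyphens(text: str, syllables_by_word: dict[str, list[str]]) -> str:
    """Insert soft hyphens at the hand-supplied syllable boundaries of each word."""
    for word, syllables in syllables_by_word.items():
        # Guard against typos in the hand-written syllable lists.
        if "".join(syllables) != word:
            raise ValueError(f"syllables do not spell {word!r}")
        text = text.replace(word, SOFT_HYPHEN.join(syllables))
    return text


if __name__ == "__main__":
    # Illustrative syllable split; the real list would use the Lao spelling.
    breaks = {"Thessalonians": ["Thes", "sa", "lo", "ni", "ans"]}
    print(repr(add_soft_hyphens("1 Thessalonians 4", breaks)))
```

Because the soft hyphens are invisible when unused, the same marked-up source could be shared between the PDF build and any other output.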