tfbf / Bible-Punjabi-Pavitr-Bible-1945


Why do some books have so few paragraph markers? #101

Open DavidHaslam opened 7 years ago

DavidHaslam commented 7 years ago

I have observed that some books (e.g. 1 Corinthians) still have almost no paragraphs, even after my recent TextPipe processing, which was meant to ensure that every non-final verse ending with a double danda is followed by the tag \p.
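For readers without TextPipe, the rule above can be approximated in a few lines of Python. This is only a sketch, not the actual TextPipe filter: it assumes one verse per line, that the double danda is U+0965, and that "non-final" means another verse follows directly on the next line. The function name `add_paragraph_markers` is made up for illustration.

```python
DOUBLE_DANDA = "\u0965"  # assumption: the files may use a different code point


def add_paragraph_markers(usfm: str) -> str:
    """Insert a \\p line after each verse line that ends with a double danda,
    but only when the very next line is another verse line."""
    lines = usfm.splitlines()
    out = []
    for i, line in enumerate(lines):
        out.append(line)
        if not line.startswith("\\v "):
            continue
        if not line.rstrip().endswith(DOUBLE_DANDA):
            continue
        nxt = lines[i + 1] if i + 1 < len(lines) else None
        # A verse followed by \c, \p, or end of file is left alone.
        if nxt is not None and nxt.startswith("\\v "):
            out.append("\\p")
    return "\n".join(out)
```

Because a verse that is already followed by \p is skipped, running the function twice gives the same result as running it once.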

Why might this be?

Could it be that when the text was digitised, the transcribers paid too little attention to the end-of-verse punctuation marks? Did they sometimes key a vertical line where the PDF file has a double danda?

Did different volunteers work on separate books? Were some individual scribes more likely than others to make this kind of simple mistake?

Or do the PDF files of the 1945 Bible contain some books that don't feature paragraphs?
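One way to test the transcription hypothesis would be to tally the last character of every verse in a book and compare books. The sketch below assumes one verse per line with the verse text running to the end of the line; `verse_end_tally` is a hypothetical helper name, and a verse that ends with a closing quote or other mark after the danda would be counted under that mark.

```python
from collections import Counter


def verse_end_tally(usfm: str) -> Counter:
    """Count how often each character appears as the last character of a verse line."""
    tally = Counter()
    for line in usfm.splitlines():
        if line.startswith("\\v "):
            text = line.rstrip()
            tally[text[-1]] += 1
    return tally
```

A book in which nearly every verse ends in a single danda (U+0964) or an ASCII vertical bar, while a comparable book shows a healthy share of double dandas, would point towards keying errors rather than a real difference in the printed text.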

DavidHaslam commented 7 years ago

In contrast, 2 Corinthians does contain paragraphs.

This makes my questions all the more pressing.

DavidHaslam commented 7 years ago

I should tabulate the verse count and paragraph count for all 66 USFM files.
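A tabulation like this could also be scripted. The following is a minimal sketch, assuming that only plain \p markers count as paragraphs (not \m, \q, \pi and so on) and that every verse starts with \v followed by a number. The names `count_markers` and `ratio_table` are illustrative only.

```python
import re

VERSE_RE = re.compile(r"\\v\s+\d")
# \p followed by whitespace or end of line, so \pi, \pc etc. are not counted
PARA_RE = re.compile(r"\\p(?=\s|$)", re.MULTILINE)


def count_markers(usfm: str) -> tuple[int, int]:
    """Return (paragraph_count, verse_count) for one USFM text."""
    return len(PARA_RE.findall(usfm)), len(VERSE_RE.findall(usfm))


def ratio_table(books: dict[str, str]) -> list[tuple[str, int, int, float]]:
    """Return (book, paragraphs, verses, ratio) rows, lowest ratio first."""
    rows = []
    for code, text in books.items():
        p, v = count_markers(text)
        ratio = p / v if v else 0.0
        rows.append((code, p, v, ratio))
    rows.sort(key=lambda row: row[3])
    return rows
```

Here `books` maps a book code such as 1CO to the full text of its USFM file; reading the 66 files from disk is left out.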

DavidHaslam commented 7 years ago

Done! Here are the results in an Excel worksheet.

Paragraphs & Verses Per Book.xlsx

The ratio of paragraphs to verses ranges from 2.39% (JHN) to 42.47% (JOL).

I have added a chart based on these results.

1CO is actually the second lowest ranked in this respect. Only JHN has a lower ratio of paragraphs.

DavidHaslam commented 7 years ago

The screenshot shows the P/V ratio per book, with the books sorted in ratio order.

[Image: screenshot 2017-01-20 19 03 55]