Sefaria / Sefaria-Export

Structured Jewish texts and metadata exported from Sefaria's database.

ERROR FOUND in Malachi, Lamentations, Ecclesiastes.json texts #12

Open TorahBibleCodes opened 5 years ago

TorahBibleCodes commented 5 years ago

Hi,

We are developing TorahBibleCodes, Equidistant Letter Sequences (ELS) search software based on the Tanach texts that you provide: https://github.com/TorahBibleCodes/TorahBibleCodes

We have forked your texts and found an error: the last line of the Hebrew JSON text of Malachi contains unnecessary HTML tags. The file is here:

https://github.com/Sefaria/Sefaria-Export/blob/master/json/Tanakh/Prophets/Malachi/Hebrew/Tanach%20with%20Text%20Only.json

Here is a copy of the line: "והשיב לב־אבות על־בנים ולב בנים על־אבותם פן־אבוא והכיתי את־הארץ חרם<br><small>[הנה אנכי שלח לכם את אליה הנביא לפני בוא יום יהוה הגדול והנורא]</small>"

We will remove these tags from the local copy of your texts that ships with our program. However, we are concerned that anyone who forks or clones your texts directly will hit this bug when running our program against them, until you are able to correct the files.

Thank you.

TorahBibleCodes.com https://github.com/TorahBibleCodes

TorahBibleCodes commented 5 years ago

Upon checking the text, we see that you added the HTML tags to mark up the verse that is repeated in small letters. Perhaps there is a way to do this that keeps HTML out of the pure JSON, such as adding the repeated verse as a separate final verse (even though it is not officially the last verse). We will have to remove this verse from our copy of the text, since ELS search needs a text that contains only the official verses.
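For anyone who hits the same problem, here is a minimal sketch of this kind of local cleanup. It is an illustration, not the code used in our program. It assumes the repeated verse is always wrapped as `<br><small>[...]</small>` the way it is in the Malachi line quoted above; the names `SMALL_BLOCK`, `ANY_TAG` and `clean_verse` are hypothetical.

```python
import re

# Assumption: the small-print repeated verse is wrapped in <small>...</small>,
# optionally preceded by <br>, as in the Malachi line quoted in this issue.
SMALL_BLOCK = re.compile(r"(?:<br\s*/?>\s*)?<small>.*?</small>", re.DOTALL)

# Anything else that looks like an HTML tag.
ANY_TAG = re.compile(r"<[^>]+>")


def clean_verse(verse: str) -> str:
    """Drop the small-print repeated verse, then strip any leftover tags."""
    without_repeat = SMALL_BLOCK.sub("", verse)
    return ANY_TAG.sub("", without_repeat).strip()
```

Removing the whole `<small>` block before stripping the remaining tags matters here: stripping tags alone would leave the repeated verse's letters in the text and shift every ELS skip count that runs past that point.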

TorahBibleCodes commented 5 years ago

The same issue occurs in the Book of Lamentations and the Book of Ecclesiastes but, strangely, not in the Book of Isaiah, which also repeats its penultimate verse in small letters.
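To check which files are affected, something like the following sketch could be run over a local clone. It makes no assumption about the exact JSON layout: it walks every nested list and object and reports any string that contains something resembling an HTML tag. The function names and the idea of scanning a `json/Tanakh` directory are our own assumptions, and the simple pattern can also flag angle brackets that are not real HTML.

```python
import json
import re
from pathlib import Path

# Loose pattern for anything that looks like an HTML tag.
TAG = re.compile(r"<[^>]+>")


def iter_strings(node):
    """Yield every string found anywhere inside a parsed JSON value."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)


def find_tagged_strings(data):
    """Return the strings in a parsed JSON value that contain a tag."""
    return [s for s in iter_strings(data) if TAG.search(s)]


def scan_export(root):
    """Map each .json file under root to its strings that contain tags."""
    report = {}
    for path in sorted(Path(root).rglob("*.json")):
        with open(path, encoding="utf-8") as fh:
            hits = find_tagged_strings(json.load(fh))
        if hits:
            report[str(path)] = hits
    return report
```

For example, `scan_export("json/Tanakh")` on a local clone would return a dictionary keyed by file path, with an entry for each file that still contains tagged strings.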