Sefaria / Sefaria-Export

Structured Jewish texts and metadata exported from Sefaria's database.

How to map references in links to specific text fragments? #38

Open rudolfovic opened 1 year ago

rudolfovic commented 1 year ago

The link files contain citations in the following format:

"A New Israeli Commentary on Pirkei Avot 1:10:13" -> "Sanhedrin 4a"

The table also contains book names and these are easy to identify and find in the repo.

However, it is not clear how to easily find the fragments referred to in the citation columns without writing custom parsers.
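To make "custom parser" concrete, here is a minimal sketch of splitting a citation into a book title and an address. The helper name `split_citation` and the regular expression are hypothetical, not part of the export or of Sefaria's code. The sketch assumes the address is the last whitespace-separated token and consists of colon-separated parts, each either a number or a Talmud folio such as `4a`. Ranges (for example `1:1-3`) are deliberately not handled.

```python
import re

# Hypothetical pattern: a title, whitespace, then a colon-separated address
# whose parts are either plain numbers ("10") or Talmud folios ("4a").
REF_PATTERN = re.compile(r"^(?P<title>.+?)\s+(?P<address>\d+[ab]?(?::\d+[ab]?)*)$")


def split_citation(citation):
    """Split a citation string into (title, list of address parts)."""
    cleaned = citation.strip()
    match = REF_PATTERN.match(cleaned)
    if match is None:
        # No trailing address recognised: treat the whole string as a title.
        return cleaned, []
    return match.group("title"), match.group("address").split(":")


print(split_citation("A New Israeli Commentary on Pirkei Avot 1:10:13"))
print(split_citation("Sanhedrin 4a"))
```

A parser like this only recovers the address parts; it says nothing about where those parts live inside the text file, which is the actual question here.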

Take Sanhedrin 4a, for example: nothing in the structure of the Sanhedrin JSON file (an index, say) indicates where that text extract lives, and the same applies to the other formats.
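One possible reading, sketched below, is that the address is purely positional: each part is an index into nested lists of text. This is an assumption, not something confirmed in this thread. The sketch assumes the text is held as nested lists, that numeric parts are 1-based, and that Talmud folios map as 1a -> 0, 1b -> 1, 2a -> 2, and so on. The helper names `folio_to_index` and `lookup` and the toy data are hypothetical.

```python
def folio_to_index(folio):
    """Convert a folio such as "4a" to a 0-based index (assumed mapping)."""
    number, side = int(folio[:-1]), folio[-1]
    # Two sides per folio: "a" comes first, "b" second.
    return (number - 1) * 2 + (1 if side == "b" else 0)


def lookup(text, sections):
    """Walk nested lists using address parts such as ["1", "10"] or ["4a"]."""
    node = text
    for part in sections:
        if part[-1] in "ab":
            index = folio_to_index(part)
        else:
            index = int(part) - 1  # assumed 1-based numbering in citations
        node = node[index]
    return node


# Toy data standing in for a nested text array; not real export content.
toy_text = [["first segment", "second segment"], ["third segment"]]
print(lookup(toy_text, ["1", "2"]))
print(folio_to_index("4a"))
```

If the assumption holds, a citation without a final segment number (such as a bare folio) would simply resolve to the whole list at that level rather than to a single string.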

Moreover, even if I were to write custom parsers for these references, they only point to the beginning of the extract and not the end.

On the other hand, the Sefaria application (and, obviously, the website) successfully maps all references. What am I missing?