Sefaria / Sefaria-Export

Structured Jewish texts and metadata exported from Sefaria's database.

Path collision: Case-sensitive files on a case-insensitive filesystem (APFS) #15

Closed (meangrape closed this issue 4 years ago)

meangrape commented 4 years ago

There are a handful of JSON files that have two versions: one with an uppercase letter and one without. Is this intentional?

The macOS APFS filesystem, while capable of case sensitivity, is almost always configured to be case-insensitive. "sefer" vs. "Sefer" is a popular collision.


```
warning: the following paths have collided (e.g. case-sensitive paths
on a case-insensitive filesystem) and only one from the same
colliding group is in the working tree:

  'cltk-flat/Halakhah/Sefer HaChinukh/Hebrew/*Sefer* HaChinukh -- Torat Emet.json'
  'cltk-flat/Halakhah/Sefer HaChinukh/Hebrew/*sefer* HaChinukh -- Torat Emet.json'
  'cltk-flat/Talmud/Bavli/Commentary/Chidushei Halachot/Seder Moed/Chidushei Halachot on Taanit/Hebrew/Vilna Edition.json'
  'cltk-flat/Talmud/Bavli/Commentary/Chidushei Halachot/Seder Moed/Chidushei Halachot on Taanit/Hebrew/Vilna edition.json'
  'cltk-flat/Tanakh/Commentary/Malbim/Prophets/Malbim on Jonah/Hebrew/Malbim On Jonah--Wikisource.json'
  'cltk-flat/Tanakh/Commentary/Malbim/Prophets/Malbim on Jonah/Hebrew/Malbim on Jonah--Wikisource.json'
  'cltk-full/Halakhah/Sefer HaChinukh/Hebrew/Sefer HaChinukh -- Torat Emet.json'
  'cltk-full/Halakhah/Sefer HaChinukh/Hebrew/sefer HaChinukh -- Torat Emet.json'
  'cltk-full/Talmud/Bavli/Commentary/Chidushei Halachot/Seder Moed/Chidushei Halachot on Taanit/Hebrew/Vilna Edition.json'
  'cltk-full/Talmud/Bavli/Commentary/Chidushei Halachot/Seder Moed/Chidushei Halachot on Taanit/Hebrew/Vilna edition.json'
  'cltk-full/Tanakh/Commentary/Malbim/Prophets/Malbim on Jonah/Hebrew/Malbim On Jonah--Wikisource.json'
  'cltk-full/Tanakh/Commentary/Malbim/Prophets/Malbim on Jonah/Hebrew/Malbim on Jonah--Wikisource.json'
  'json/Halakhah/Sefer HaChinukh/Hebrew/Sefer HaChinukh -- Torat Emet.json'
  'json/Halakhah/Sefer HaChinukh/Hebrew/sefer HaChinukh -- Torat Emet.json'
  'json/Talmud/Bavli/Commentary/Chidushei Halachot/Seder Moed/Chidushei Halachot on Taanit/Hebrew/Vilna Edition.json'
  'json/Talmud/Bavli/Commentary/Chidushei Halachot/Seder Moed/Chidushei Halachot on Taanit/Hebrew/Vilna edition.json'
  'json/Tanakh/Commentary/Malbim/Prophets/Malbim on Jonah/Hebrew/Malbim On Jonah--Wikisource.json'
  'json/Tanakh/Commentary/Malbim/Prophets/Malbim on Jonah/Hebrew/Malbim on Jonah--Wikisource.json'
  'txt/Halakhah/Sefer HaChinukh/Hebrew/Sefer HaChinukh -- Torat Emet.txt'
  'txt/Halakhah/Sefer HaChinukh/Hebrew/sefer HaChinukh -- Torat Emet.txt'
  'txt/Talmud/Bavli/Commentary/Chidushei Halachot/Seder Moed/Chidushei Halachot on Taanit/Hebrew/Vilna Edition.txt'
  'txt/Talmud/Bavli/Commentary/Chidushei Halachot/Seder Moed/Chidushei Halachot on Taanit/Hebrew/Vilna edition.txt'
  'txt/Tanakh/Commentary/Malbim/Prophets/Malbim on Jonah/Hebrew/Malbim On Jonah--Wikisource.txt'
  'txt/Tanakh/Commentary/Malbim/Prophets/Malbim on Jonah/Hebrew/Malbim on Jonah--Wikisource.txt'
```

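For anyone who wants to find these collisions without relying on git's warning at checkout time, here is a minimal sketch in Python. The function names `find_case_collisions` and `tracked_paths` are made up for illustration and are not part of any Sefaria tooling. It assumes that comparing case-folded paths is a good enough approximation of what a case-insensitive filesystem does; APFS also applies Unicode normalization, which this ignores.

```python
import subprocess
from collections import defaultdict


def find_case_collisions(paths):
    """Group paths that differ only by letter case.

    Returns a list of groups (each a sorted list of paths); only groups
    with more than one distinct spelling are included.
    """
    groups = defaultdict(set)
    for path in paths:
        # casefold() approximates case-insensitive comparison (assumption:
        # good enough for these file names; Unicode normalization is ignored).
        groups[path.casefold()].add(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]


def tracked_paths(repo_dir="."):
    """Return every path tracked by git in repo_dir (hypothetical helper)."""
    # -z gives NUL-separated, unquoted output, so spaces and non-ASCII
    # characters in file names come through untouched.
    result = subprocess.run(
        ["git", "-C", repo_dir, "ls-files", "-z"],
        check=True,
        capture_output=True,
        text=True,
    )
    return [p for p in result.stdout.split("\0") if p]
```

Running `find_case_collisions(tracked_paths("."))` inside a clone should report the same groups as the warning above, since it reads the index rather than the working tree and so works even where only one file of each pair was checked out.
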
meangrape commented 4 years ago

I'm assuming the one with about 4 lines of JSON from 3 years ago (the lowercase file) would be the one to go, rather than the 2 MB file from 5 months ago. If I have time tonight, I'll generate PRs for these.

'cltk-flat/Halakhah/Sefer HaChinukh/Hebrew/*Sefer* HaChinukh -- Torat Emet.json' 5 months old

'cltk-flat/Halakhah/Sefer HaChinukh/Hebrew/*sefer* HaChinukh -- Torat Emet.json' 3 years old
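
The "small and old is probably the stale one" reasoning above could be checked for every colliding group with something like the following. This is a rough sketch, not a tested tool: `BlobInfo`, `blob_info`, and `likely_stale` are hypothetical names, and it reads sizes and commit dates from git itself because both files cannot coexist in a case-insensitive working tree.

```python
import subprocess
from typing import NamedTuple, Optional, Sequence


class BlobInfo(NamedTuple):
    path: str
    size: int         # blob size in bytes
    last_commit: int  # Unix timestamp of the last commit touching the path


def _git(repo_dir, *args):
    """Run a git command in repo_dir and return its trimmed stdout."""
    result = subprocess.run(
        ["git", "-C", repo_dir, *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def blob_info(repo_dir, path, rev="HEAD"):
    """Look up size and last-commit time for one tracked path."""
    size = int(_git(repo_dir, "cat-file", "-s", f"{rev}:{path}"))
    last_commit = int(_git(repo_dir, "log", "-1", "--format=%ct", rev, "--", path))
    return BlobInfo(path, size, last_commit)


def likely_stale(infos: Sequence[BlobInfo]) -> Optional[BlobInfo]:
    """Return the entry that is both the smallest and the oldest.

    Returns None when no single entry is both, so a person can decide.
    """
    smallest = min(infos, key=lambda info: info.size)
    oldest = min(infos, key=lambda info: info.last_commit)
    return smallest if smallest == oldest else None
```

Returning `None` when size and age disagree is deliberate: the heuristic is only a guess about which file is the leftover, so ambiguous groups are better left for manual review.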

nsantacruz commented 4 years ago

Thank you for pointing this out. This repo is auto-generated by a script in the Sefaria-Project repo, so a pull request here wouldn't make sense. These seem to be issues with our data, not our code. I will pass this along to the head of content.