nix-community / all-cabal-json

A repository containing all of the cabal files for all public Haskell packages converted to json [maintainer=@DavHau]
MIT License

The `hackage` branch doesn't have a "stable" narHash due to case conflicts #4

Open NobbZ opened 1 year ago

NobbZ commented 1 year ago

On a regular Linux-based system, `find` reports 536063 entries (files and directories) in this repository.

On a Mac or Windows machine, whose file systems usually "ignore" casing, the worktree contains only 535855 entries.

I ran the following on an x86_64-linux NixOS system to confirm my suspicion, after I had problems using D2N (dream2nix) on a Mac due to a narHash mismatch.

$ find . | awk '{print tolower($0)}' | sort | wc -l
536063
$ find . | awk '{print tolower($0)}' | sort -u | wc -l
535855
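
The same check can be sketched in Python, which also reports which spellings collide. This is only an illustration of the pipeline above, not something used in the repository: the function names `case_collisions`, `count_entries` and `walk` are made up here, and `str.lower()` may fold non-ASCII characters differently from awk's `tolower`.

```python
from collections import defaultdict
from pathlib import Path


def case_collisions(paths):
    """Group paths by their lowercased form and keep only groups
    that contain more than one distinct spelling."""
    groups = defaultdict(set)
    for p in paths:
        groups[p.lower()].add(p)
    return {folded: sorted(v) for folded, v in groups.items() if len(v) > 1}


def count_entries(paths):
    """Return (total, unique_after_lowercasing), mirroring
    `sort | wc -l` versus `sort -u | wc -l` on lowercased names."""
    paths = list(paths)
    return len(paths), len({p.lower() for p in paths})


def walk(root):
    """Yield every entry below root as a './'-prefixed relative path,
    roughly what `find .` prints (hypothetical helper)."""
    root = Path(root)
    yield "."
    for p in root.rglob("*"):
        yield "./" + p.relative_to(root).as_posix()
```

Calling `count_entries(walk("."))` inside a checkout on a case-sensitive file system should give the two numbers to compare, and `case_collisions(walk("."))` the groups of conflicting names.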

This leaves us in a situation where we can't use fetchFromGitHub without programmatically switching between hashes depending on the platform. It is also hard to obtain all required hashes when one doesn't have access to both Linux and macOS.

We cannot use this as a flake input at all if the flake is supposed to support both Linux and Mac, e.g. in dream2nix.

I have not checked the other branches. I also have not checked which files actually get lost, or whether the lost information is valuable or insignificant.

NobbZ commented 1 year ago

Addendum:

The following list shows the normalized names of conflicting candidates:

$ find . | awk '{print tolower($0)}' | sort | uniq -D | uniq
./basic
./basic/0.1.0.0
./basic/0.1.0.0/basic.cabal
./basic/0.1.0.0/basic.hashes.json
./basic/0.1.0.0/basic.json
./berkeleydb
./buster
./cabal
./cassava
./cassava/0.5.1.0
./cassava/0.5.1.0/cassava.cabal
./cassava/0.5.1.0/cassava.hashes.json
./cassava/0.5.1.0/cassava.json
./checked
./cli
./command
./compactable
./compactable/0.1.0.0
./compactable/0.1.0.0/compactable.cabal
./compactable/0.1.0.0/compactable.hashes.json
./compactable/0.1.0.0/compactable.json
./compactable/0.1.0.1
./compactable/0.1.0.1/compactable.cabal
./compactable/0.1.0.1/compactable.hashes.json
./compactable/0.1.0.1/compactable.json
./compactable/0.1.0.2
./compactable/0.1.0.2/compactable.cabal
./compactable/0.1.0.2/compactable.hashes.json
./compactable/0.1.0.2/compactable.json
./condor
./condor/0.3
./condor/0.3/condor.cabal
./condor/0.3/condor.hashes.json
./condor/0.3/condor.json
./dao
./data-rope
./dbus
./diff
./digit
./doctest
./empty
./eq
./extra
./facts
./filemanip
./filepather
./fin
./focus
./focus/0.1.1
./focus/0.1.1/focus.cabal
./focus/0.1.1/focus.hashes.json
./focus/0.1.1/focus.json
./focus/0.1.2
./focus/0.1.2/focus.cabal
./focus/0.1.2/focus.hashes.json
./focus/0.1.2/focus.json
./geodetic
./gist
./githud
./hangman
./hdbc-postgresql-hstore
./hermes
./hlist
./hlogger
./hmm
./hmm/0.2.1
./hmm/0.2.1/hmm.cabal
./hmm/0.2.1/hmm.hashes.json
./hmm/0.2.1/hmm.json
./hpath
./hricket
./hset
./hset/0.0.1
./hset/0.0.1/hset.cabal
./hset/0.0.1/hset.hashes.json
./hset/0.0.1/hset.json
./hydrogen
./indentparser
./indentparser/0.1
./indentparser/0.1/indentparser.cabal
./indentparser/0.1/indentparser.hashes.json
./indentparser/0.1/indentparser.json
./interpolation
./interpolation/0.1
./interpolation/0.1/interpolation.cabal
./interpolation/0.1/interpolation.hashes.json
./interpolation/0.1/interpolation.json
./irc
./jackminimix
./jackminimix/0.1
./jackminimix/0.1/jackminimix.cabal
./jackminimix/0.1/jackminimix.hashes.json
./jackminimix/0.1/jackminimix.json
./javasf
./javav
./kalman
./kyotocabinet
./kyotocabinet/0.1
./kyotocabinet/0.1/kyotocabinet.cabal
./kyotocabinet/0.1/kyotocabinet.hashes.json
./kyotocabinet/0.1/kyotocabinet.json
./lattices
./mecha
./mechs
./mechs/0.0.0.0
./mechs/0.0.0.0/mechs.cabal
./mechs/0.0.0.0/mechs.hashes.json
./mechs/0.0.0.0/mechs.json
./metrics
./modulo
./moe
./naperian
./noise
./nomyx-core
./nomyx-language
./nomyx-web
./numbers
./omega
./only
./peano
./perfecthash
./plural
./plural/0.0.1
./plural/0.0.1/plural.cabal
./plural/0.0.1/plural.hashes.json
./plural/0.0.1/plural.json
./plural/0.0.2
./plural/0.0.2/plural.cabal
./plural/0.0.2/plural.hashes.json
./plural/0.0.2/plural.json
./quickson
./range
./range/0.1.0.0
./range/0.1.0.0/range.cabal
./range/0.1.0.0/range.hashes.json
./range/0.1.0.0/range.json
./ref
./ref/0.1.0.0
./ref/0.1.0.0/ref.cabal
./ref/0.1.0.0/ref.hashes.json
./ref/0.1.0.0/ref.json
./rfc1751
./safe
./scalendar
./scalendar/1.0.0
./scalendar/1.0.0/scalendar.cabal
./scalendar/1.0.0/scalendar.hashes.json
./scalendar/1.0.0/scalendar.json
./scalendar/1.1.0
./scalendar/1.1.0/scalendar.cabal
./scalendar/1.1.0/scalendar.hashes.json
./scalendar/1.1.0/scalendar.json
./sdl2-ttf
./seqalign
./smtlib
./stack
./stream
./svg2q
./tables
./tensor
./thrift
./thrift/0.6.0
./thrift/0.6.0/thrift.cabal
./thrift/0.6.0/thrift.hashes.json
./thrift/0.6.0/thrift.json
./tic-tac-toe
./top
./unique
./validation
./vec
./vulkan
./vulkan/0.1.0.0
./vulkan/0.1.0.0/vulkan.cabal
./vulkan/0.1.0.0/vulkan.hashes.json
./vulkan/0.1.0.0/vulkan.json
./wave
./wave/0.1.1
./wave/0.1.1/wave.cabal
./wave/0.1.1/wave.hashes.json
./wave/0.1.1/wave.json
./wave/0.1.2
./wave/0.1.2/wave.cabal
./wave/0.1.2/wave.hashes.json
./wave/0.1.2/wave.json
./wave/0.1.3
./wave/0.1.3/wave.cabal
./wave/0.1.3/wave.hashes.json
./wave/0.1.3/wave.json
./wave/0.1.4
./wave/0.1.4/wave.cabal
./wave/0.1.4/wave.hashes.json
./wave/0.1.4/wave.json
./wave/0.1.5
./wave/0.1.5/wave.cabal
./wave/0.1.5/wave.hashes.json
./wave/0.1.5/wave.json
./wavefront
./wavefront/0.1.0.1
./wavefront/0.1.0.1/wavefront.cabal
./wavefront/0.1.0.1/wavefront.hashes.json
./wavefront/0.1.0.1/wavefront.json
./wavefront/0.1.0.2
./wavefront/0.1.0.2/wavefront.cabal
./wavefront/0.1.0.2/wavefront.hashes.json
./wavefront/0.1.0.2/wavefront.json
./xattr
./xml
./yocto

DavHau commented 1 year ago

We could suffix each directory name with a few digits of the sha256 hash of its name, like this:

{
  "Base-7b47": {},
  "base-cae6": {},
   ...
}

It is still nicely readable and searchable and at the same time prevents collisions during extraction.
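
A minimal sketch of that naming scheme, under some assumptions the comment does not spell out: the name is hashed as UTF-8, the suffix is the first four hex digits of the digest, and `-` is the separator. The helper names `suffixed` and `clashes_after_suffix` are hypothetical, and the `7b47` / `cae6` values in the example above are not verified to be real digests.

```python
import hashlib


def suffixed(name, digits=4):
    """Append the first `digits` hex characters of sha256(name) to name.
    Encoding (UTF-8), separator ('-') and digit count are assumptions."""
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    return f"{name}-{digest[:digits]}"


def clashes_after_suffix(names, digits=4):
    """Return pairs of distinct names whose suffixed forms would still
    collide on a case-insensitive file system."""
    seen = {}
    clashes = []
    for name in names:
        key = suffixed(name, digits).lower()
        if key in seen and seen[key] != name:
            clashes.append((seen[key], name))
        seen.setdefault(key, name)
    return clashes
```

With four hex digits, two names that differ only in case still end up with the same suffix about once in 65536 pairs, so running something like `clashes_after_suffix` over the full package list would be a cheap safeguard before settling on the suffix length.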

NobbZ commented 1 year ago

That should work, and the nice thing is: on case-insensitive but case-preserving file systems (like macOS's HFS+ by default), the difference between base-* and Base-* would be preserved visually!