NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License
17.75k stars 13.86k forks source link

okular pulls in mbrola worth > 600 mb #207204

Open asdf8dfafjk opened 1 year ago

asdf8dfafjk commented 1 year ago

This should be an optional dependcy

hmenke commented 1 year ago
$ nix-instantiate --eval -A okular.outPath '<nixpkgs>'
"/nix/store/6c0h3b3a3p8qnhjipmxjfygsdqr027i8-okular-22.08.3"
$ nix-store -qR /nix/store/6c0h3b3a3p8qnhjipmxjfygsdqr027i8-okular-22.08.3 | grep mbrola
/nix/store/cfzmf36asbcjiq5h4pn4fznzk6rqc4gn-mbrola-3.3
$ nix why-depends /nix/store/6c0h3b3a3p8qnhjipmxjfygsdqr027i8-okular-22.08.3 /nix/store/cfzmf36asbcjiq5h4pn4fznzk6rqc4gn-mbrola-3.3
/nix/store/6c0h3b3a3p8qnhjipmxjfygsdqr027i8-okular-22.08.3
└───/nix/store/fazbdx73b4w17sd0mvphdbpm7zci38nw-qtspeech-5.15.7
    └───/nix/store/cl38aj8cxhfjjraic8r4dks29gq0yqig-speech-dispatcher-0.11.2
        └───/nix/store/cfzmf36asbcjiq5h4pn4fznzk6rqc4gn-mbrola-3.3

Would have to be made an optional dependency here:

https://github.com/NixOS/nixpkgs/blob/cc1bb678f4205ad1c4db34c20b4327cb2cc89a93/pkgs/development/libraries/speechd/default.nix#L42-L48

gbtb commented 1 year ago

I can suggest you to add an overlay disabling all heavy speech synthesizers backends, as I did for my config:

arcanPackages.espeak = super.arcanPackages.espeak.override { mbrolaSupport = false; pcaudiolibSupport = false; sonicSupport = false; };
nixos-discourse commented 1 year ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/switch-cache-nixos-org-to-zstd-to-fix-slow-nixos-updates-nix-downloads/23961/44

vcunat commented 1 year ago

It's also pulled by GNOME parts, including the ISO. Or at least it was, recently: https://github.com/NixOS/nixpkgs/issues/159612

I agree that such huge speech synthesizers should not be contained in defaults (of DEs, ISOs, and other packages except those very focused on speech synthesis)

PedroHLC commented 1 year ago

(Sorry, miss click)

asdf8dfafjk commented 1 year ago

Same with kmail. mbrola needs to be explicitly enabled, it's too big :-(

SuperSandro2000 commented 1 year ago

I am not sure if this is actually worth it unless we also drop mbrola from chromium. I had an overlay for this locally before but there is not point in having it when I need to compile chromium.

Atemu commented 1 year ago

I think it may be time to just disable it by default. That will suck for users of speech synthesis but we can't have everyone suffer >600MB in their closure for a select few.

A few questions before we do that:

Does our speech synth even work?
Why does the data need to be compiled into the respective apps?
Could they not look for data via env vars instead?

Atemu commented 1 year ago

Alright, I had a quick look at what we're doing at it seems we hard-code the output path of the voices into espeak-ng.

What we should instead be doing IMO is:

  1. Ask upstream to provide a better mechanism for providing mbrola voice data to espeak-ng than a hard-coded path (i.e. env var)
  2. Patch the espeak-ng discovery algorithm to look for data in /run/current-system/sw/share (NixOS compat) and then in /usr/share (non-NixOS compat) instead of hard-coding mbrola-data
  3. Properly split mbrola outputs (current drv is just weird, it already has two separate drvs but they're symlinkJoined? Why?)
  4. Provide a mechanism in NixOS to optionally install mbrola-data for espeak system-wide, disabled by default.
asdf8dfafjk commented 1 year ago

https://github.com/espeak-ng/espeak-ng/issues/1779

dotlambda commented 1 year ago

Why don't we patch espeak-ng to use /run/current-system/sw/share in https://github.com/espeak-ng/espeak-ng/blob/master/src/libespeak-ng/synth_mbrola.c#L93 ?

Atemu commented 1 year ago

@dotlambda we're currently patching it to use /nix/store/...-mbrola/share.

Doing that instead (and perhaps also /usr for non-NixOS people) is what I meant with step 1.

chuangzhu commented 1 year ago
  1. Patch the espeak-ng discovery algorithm to look for data in /run/current-system/sw/share (NixOS compat) and then in /usr/share (non-NixOS compat) instead of hard-coding mbrola-data

$XDG_DATA_DIRS and $NIX_PROFILES/share may be better choices. That's exactly what we are doing with aspell dictionaries.