Closed kiryph closed 8 months ago
Thanks for the report, I was not expecting Apple to be less compatible than windows (+Gow) on this point, and I was a bit afraid to use < as it is shell-dependent, but it seems like Windows also supports this syntax so it should be good to use it. So I fixed it in master (cf. 3993798cc86481a6deb0f735aa56d8af7322c223).
But I’m not so sure that it is a good idea to use -P 0, as it will start all the commands at the same time. I ran a test with 200 pictures, and all of them were compiled at once:
$ ps aux | rg pdflatex | wc -l
197
(while the same command with the current setting would give me something around 16). While one might think that it is faster this way, since LaTeX compilation is a CPU-intensive operation, this has many drawbacks:
-P 16. The reason (I think) is that a lot of time is lost switching between the 200 processes.
The 16 I chose might seem arbitrary, but I chose it since nowadays it is quite common to have between 4 and 16 CPU threads (my machine has 8, for instance), and running more threads than the actual number of CPU threads is not a problem (it might actually be faster to a certain degree, or slower if you use way too many… but here it should not be that bad in either case). I could have tried to compute the number of threads at runtime, but it is quite hard to do in an OS-independent way without installing new stuff, so I decided to keep 16 by default. And anyway, it's super simple to change with compile in parallel with xargs=N, and GNU parallel already adapts to the number of CPU if needed.
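For readers curious what a runtime lookup could look like, here is a minimal sketch for POSIX-like shells only. It does not solve the OS-independence problem raised above (it does nothing for Windows cmd), and the helper name detect_jobs is hypothetical, not something robust-externalize provides. It tries getconf, nproc, and sysctl in turn and falls back to the default of 16 when none of them yields a number.

```shell
#!/bin/sh
# Hypothetical helper (not part of robust-externalize): print a job count
# suitable for xargs -P, falling back to 16 when no CPU count is found.
detect_jobs() {
    # Try the common tools in turn; none of them is guaranteed to exist.
    if n=$(getconf _NPROCESSORS_ONLN 2>/dev/null); then :   # Linux, macOS
    elif n=$(nproc 2>/dev/null); then :                     # GNU coreutils
    elif n=$(sysctl -n hw.ncpu 2>/dev/null); then :         # macOS, BSDs
    else n=""
    fi
    # Keep the default of 16 unless we got a plain positive integer.
    case "$n" in
        ''|*[!0-9]*) n=16 ;;
    esac
    [ "$n" -gt 0 ] || n=16
    echo "$n"
}

# Usage: one echo per input line, at most $(detect_jobs) running at a time.
# With -P, the order of the output lines may vary.
printf '%s\n' a b c | xargs -P "$(detect_jobs)" -n 1 echo compiled
```

The -P option itself is not in POSIX, but both GNU and BSD xargs accept it.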
I hope this makes sense. I will close this issue for now, but could you please check whether the latest version solves your problem, and reopen the issue if not?
Thanks for your detailed answer
I was not expecting Apple to be less compatible than windows (+Gow) on this point
I have already run into several cases where the BSD programs that ship with macOS lack features of the equivalent GNU programs.
A general macOS issue is that (newer) GNU programs can have incompatible licenses. An example: macOS comes with the following shells preinstalled:
❯ ls /bin/*sh
/bin/bash /bin/csh /bin/dash /bin/ksh /bin/sh /bin/tcsh /bin/zsh
The preinstalled bash is actually GNU Bash, but it is the outdated version 3.2 from 2006. Apple does not ship newer versions of GNU Bash because of a license change in GNU Bash 4.0. https://apple.stackexchange.com/questions/193411/update-bash-to-version-4-0-on-osx
In contrast, the preinstalled zsh is the most recent release (version 5.9, from 2022).
I was a bit afraid to use < as it is shell-dependent
GNU Bash 3.2 also understands it (/bin/sh is /bin/bash under macOS):
sh-3.2$ <TODO.md xargs -I '{}' echo '{}'
# prints the content of the file TODO.md (possibly in random order)
So do dash, csh, ksh, and tcsh. So I would assume the syntax is actually widely accepted by shells. (Just out of curiosity, if you happen to know one where it is not valid, let me know.)
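A quick way to check this in any sh-like shell is to compare the leading-redirection form with the more common trailing form. This is only a sketch with a throwaway file created by mktemp; the file contents are made up for the test.

```shell
#!/bin/sh
# Throwaway input file (name chosen by mktemp, contents made up).
tmp=$(mktemp)
printf '%s\n' one two three > "$tmp"

# Redirection written before the command, as discussed in this thread.
before=$( <"$tmp" xargs -I '{}' echo '{}' )
# Redirection written after the command, the more common spelling.
after=$( xargs -I '{}' echo '{}' <"$tmp" )

# Both forms feed the same stdin to xargs, so the outputs should match.
if [ "$before" = "$after" ]; then
    echo "same output"
else
    echo "outputs differ"
fi
rm -f "$tmp"
```

Running the script under each shell of interest (for example `dash script.sh`) shows whether that shell accepts the leading form.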
first, the whole system gets really laggy as it gets 200 CPU-intensive tasks to run at once, leaving little time for other processes to run,
True, but on my system even 16 processes take all cores and I hear the fan kicking in. One can use nice to lower the priority of the compilation, so that other user processes get higher priority and the system does not get laggy.
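As a sketch of that idea (not what robust-externalize actually runs): prefixing xargs with nice lowers the priority of xargs itself, and the processes it starts inherit that value. The job names fig1 to fig3 and the echo are placeholders for the real compilation commands.

```shell
#!/bin/sh
# nice -n 10 raises the niceness (lowers the priority) of xargs; every
# child process started by xargs inherits that niceness.
# "fig1 fig2 fig3" and the echo are placeholders for real pdflatex jobs.
out=$(printf '%s\n' fig1 fig2 fig3 |
    nice -n 10 xargs -P 16 -n 1 sh -c 'echo "compiling $1"' _)

# With -P, the order of the lines may vary from run to run.
echo "$out"
```

The niceness value 10 is an arbitrary choice for illustration; any positive increment lowers the priority further.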
I hope this makes sense.
Yes, it makes sense to me. I agree that 16 might be a good value for current personal computers.
And anyway, it's super simple to change with compile in parallel with xargs=N, and GNU parallel already adapts to the number of CPU if needed.
Yes, and I understand your motivation and I do not see a real problem with the default of 16.
If it bothers me that my 6 cores are taken up by the compilation of a document, I can set it to a lower value myself. People on 48+ core systems might pick a higher value. So it will always be a personal choice.
For a shell that is not compatible with the < syntax, you have nushell for instance, which would use a pipe instead, like open foo.txt | yourprogram (see http://www.nushell.sh/book/loading_data.html). But if I use a pipe, then I need a different command for Windows, since cat is not available by default…
Oh I see, you want less than the number of cores… I do expect it to take all cores by default, at least it is what I would prefer to do by default. What I’d like to avoid is to crash the system. Is your whole OS laggy when it compiles? Anyway, since I guess it's very much user dependent, I guess it's better to let the user change the setting if they don't like it. I could set 8 instead of 16, but then people running > 8 threads might compile slower…
For a shell that is not compatible with the < syntax, you have nushell for instance
Thanks for the pointer. However, I think nushell is not, and for the foreseeable future probably will not become, a default shell in any OS. Adding support for programs installed by the user would turn into a never-ending story.
Oh I see, you want less than the number of cores… I do expect it to take all cores by default,
No, I actually would prefer to take all cores by default. But I could imagine that this could be a reason someone wants to change it to ensure that other tasks can get a full core (or several) not shared with a compilation process (no process switching in the cpu, ...).
What I’d like to avoid is to crash the system. Is your whole OS laggy when it compiles?
No, I did not encounter a laggy OS when compiling. Maybe I have to create a document with 200+ environments to see if this could make the system laggy.
However, right now it works very well (with the value of 16), so I see no reason to change it myself.
Ok great. Yeah, nushell is unlikely to become a default shell in any OS… But do you know if pdflatex always picks sh/cmd, or if it picks the default shell of the user?
Ok, perfect then!
Running version v2.1 with
on macOS Ventura, I get the following error:
One can remove the "illegal option -- a" as follows:
The modified robust-externalize.sty would look like:
IMHO, the default number of processes should not be limited (i.e. the value 0 should be used instead of 16). The current value is arbitrary. For my current machine, the limit is higher than the number of available cores. If someone needs to limit it, that person should choose a suitable value for their machine/environment.