brucemiller / LaTeXML

LaTeXML: a TeX and LaTeX to XML/HTML/ePub/MathML translator.
http://dlmf.nist.gov/LaTeXML/

Support and document installation of LaTeXML via homebrew on Mac OS X #929

Closed asmaier closed 6 years ago

asmaier commented 6 years ago

Many people use Homebrew as their package manager on Mac OS X (see https://brew.sh/analytics/os-version/). Unfortunately, LaTeXML doesn't seem to officially support installation via Homebrew, and the available package has some issues: https://github.com/Homebrew/homebrew-core/issues/22903 . In particular, it is missing the dependency on ImageMagick, so after installation it is not possible to convert LaTeX files that include images. I did manage to install LaTeXML using Homebrew together with ImageMagick, and want to document the steps here. Maybe this is helpful for someone, or it can even be added to the official documentation:

  1. Install LaTeXML via homebrew
    $ brew install latexml
  2. Download the latest version of ImageMagick (the Homebrew formula for ImageMagick doesn't work for LaTeXML)
    $ wget https://www.imagemagick.org/download/ImageMagick.tar.gz
    $ tar zxvf ImageMagick.tar.gz
    $ cd ImageMagick
  3. Build and compile ImageMagick from source
    $ ./configure --with-perl=/usr/bin/perl
    $ make
    $ make install
    $ make clean
  4. Open a new terminal and check if you can use the perl package
    $ /usr/bin/perl -le 'use Image::Magick; print Image::Magick->QuantumDepth'
    16

    After doing this the latexmlpost command should find the Image::Magick perl module and be able to convert images.
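
A quick way to see which perl can actually load the module is to ask each interpreter directly. The following is only a sketch and not part of the original steps; the helper name perl_has_module is made up for illustration, and it assumes nothing beyond a perl being installed.

```shell
#!/bin/sh
# perl_has_module PERL MODULE  (hypothetical helper name)
# Succeeds (exit status 0) if the interpreter PERL can load MODULE.
perl_has_module() {
    "$1" -M"$2" -e 1 >/dev/null 2>&1
}

# Compare the system perl with whichever perl comes first on PATH.
# If both are the same interpreter, the same line is simply printed twice.
for p in /usr/bin/perl "$(command -v perl)"; do
    [ -x "$p" ] || continue
    if perl_has_module "$p" Image::Magick; then
        echo "$p: Image::Magick found"
    else
        echo "$p: Image::Magick missing"
    fi
done
```

If the two interpreters disagree, the module was probably installed for a different perl than the one the LaTeXML scripts run under.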

matteosecli commented 6 years ago

Hi @asmaier, I'm able to install and use LaTeXML with Homebrew (on High Sierra 10.13.2) via:

$ brew install imagemagick --with-perl
$ brew install cpanm
$ cpanm Archive::Zip DB_File File::Which Getopt::Long Image::Size IO::String JSON::XS LWP MIME::Base64 Parse::RecDescent Pod::Parser Text::Unidecode Test::More URI XML::LibXML XML::LibXSLT UUID::Tiny
$ brew install latexml --HEAD

which installs the latest GitHub version with all the required and optional dependencies.

I agree, anyway, that the formula should be updated and that a mention in the documentation would be nice! 😉

asmaier commented 6 years ago

@matteosecli Nice work. Your solution seems to work, because the HEAD version of latexml on github has fixed the shebang in the perl scripts from #!/usr/bin/perl -w to #!/usr/bin/env perl already. So the next release of LaTeXML will work better with perl installed via homebrew.
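
To make the shebang difference concrete, here is a small sketch that prints the shebang line of a script and shows which perl "env perl" would pick up. The helper name shebang_of and the two throwaway files are made up for illustration; they are not LaTeXML files.

```shell
#!/bin/sh
# shebang_of FILE  (hypothetical helper name)
# Print the first line of FILE if it is a shebang line, else return 1.
shebang_of() {
    first=
    IFS= read -r first < "$1" || true
    case $first in
        '#!'*) printf '%s\n' "$first" ;;
        *) return 1 ;;
    esac
}

# Demo on two throwaway scripts that mimic the old and new style.
tmp=$(mktemp -d)
printf '#!/usr/bin/perl -w\nprint "hi\\n";\n' > "$tmp/old.pl"
printf '#!/usr/bin/env perl\nprint "hi\\n";\n' > "$tmp/new.pl"
shebang_of "$tmp/old.pl"   # always runs the system perl in /usr/bin
shebang_of "$tmp/new.pl"   # runs whichever perl comes first on PATH
rm -rf "$tmp"

# The interpreter that "#!/usr/bin/env perl" would resolve to right now:
echo "env perl resolves to: $(command -v perl || echo 'not found')"
```

With the old shebang, modules installed for Homebrew's perl are invisible to the scripts; with the new one, the scripts follow PATH.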

dginev commented 6 years ago

Excellent, thank you for all the information!

matteosecli commented 6 years ago

@asmaier I'll try to figure out how to fiddle around with the current LaTeXML Homebrew formula to include these fixes, since it appears from your bug report https://github.com/Homebrew/homebrew-core/issues/22903 that they only want pull requests.

asmaier commented 6 years ago

I had a weird problem. The installation of the HEAD version of LaTeXML following the steps of @matteosecli failed on my machine. The two perl modules XML::LibXML and XML::LibXSLT couldn't be built and installed. It turned out that I had to comment out the following line in my .bash_profile file:

# export PATH="/Users/andi/anaconda/bin:$PATH"

So somehow my version of Anaconda (https://www.anaconda.com/download/#macos) interfered with the build process of the Perl modules XML::LibXML XML::LibXSLT. Just in case you encounter similar problems, have a look at your .bash_profile.
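
Instead of editing .bash_profile, one could also drop the offending directory from PATH for a single command. This is only a sketch under that assumption; the helper name path_without is made up, and the Anaconda path is the one quoted above.

```shell
#!/bin/sh
# path_without DIR [PATHSTRING]  (hypothetical helper name)
# Print PATHSTRING (default: $PATH) with every entry equal to DIR removed.
# Note: empty PATH entries are dropped as a side effect.
path_without() {
    result=
    old_ifs=$IFS
    IFS=:
    for entry in ${2-$PATH}; do
        [ "$entry" = "$1" ] && continue
        result=${result:+$result:}$entry
    done
    IFS=$old_ifs
    printf '%s\n' "$result"
}

# Show which build tools currently come first on PATH.
for tool in perl xml2-config xslt-config; do
    printf '%-12s %s\n' "$tool" "$(command -v "$tool" || echo 'not found')"
done

# Example use (not executed here): build the modules without Anaconda on PATH.
#   PATH=$(path_without /Users/andi/anaconda/bin) cpanm XML::LibXML XML::LibXSLT
```

If xml2-config or xslt-config resolve to a directory inside Anaconda, the Perl modules are likely being built against Anaconda's libraries.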

dginev commented 6 years ago

... could you give us a list of files inside that /bin directory? Any chance there is some custom perl or libxml bundled in it?

matteosecli commented 6 years ago

@asmaier I've sent a PR to the Homebrew folks with an updated recipe for LaTeXML. It should solve many of the issues here, including your problems with the XML::LibXML and XML::LibXSLT modules; the Homebrewed LaTeXML plus all the necessary modules should live in its own folder and will use Homebrew's Perl, so there is less chance of interference.

I've tested the recipe in a clean High Sierra install and it seems to work; could you please give it a try?

Btw I've tried to track down all the non-core Perl modules (and their dependencies) that LaTeXML depends on, but there are chances I've missed some of them. If you get errors about missing modules let me know! 😉


Edit: @asmaier I apologize, if you are on Sierra the new recipe in the PR won't work. If you are still willing to try it, I've fixed it here; now it should work!

asmaier commented 6 years ago

@dginev Don't worry too much about the problem with Anaconda. It turned out that my version of Anaconda was very old, so I decided to simply delete it. It also seems to be a known problem that Anaconda often interferes with Homebrew, see https://hashrocket.com/blog/posts/keep-anaconda-from-constricting-your-homebrew-installs . So this is not a problem unique to LaTeXML.

asmaier commented 6 years ago

@matteosecli I did install your new homebrew formula with

brew install https://raw.githubusercontent.com/matteosecli/homebrew-core/patch-1/Formula/latexml.rb

on Mac OS X 10.12.6 and it worked flawlessly. Imagemagick was installed and I was able to convert my latex document with images included. Thank you very much for this new homebrew formula for LaTeXML!

matteosecli commented 6 years ago

@asmaier Happy to hear that it works! 😄

brucemiller commented 6 years ago

I'm a little confused by some of this. If there already is a Homebrew formula, the best approach would be to fix it (or get it fixed). It looks like @matteosecli submitted a PR that wasn't accepted. While I don't know the formula language, I took a look at it and was a bit confused. It seemed to have all sorts of dependencies (like HTTP::Cookie and lots of others) that aren't dependencies of LaTeXML. Perhaps this threw the Homebrew folks off as well? The point being, I'm not sure how to distill this discussion into a "do this and that".

All the better: I've just spent some time reorganizing and fleshing out the wiki pages to hopefully make them more welcoming :> Could I ask you guys, @matteosecli and @asmaier, to look at the "Installation Guide" wiki page and add something there? Thanks!!!

matteosecli commented 6 years ago

@brucemiller you're perfectly right, the formula has to pull lots of dependencies.

My understanding is that Homebrew packages are installed in a way that limits to the lowest possible level the interference with the OS. In a certain sense, they look like small "containers" that are designed to communicate only with other Homebrew "containers" or with system libraries. So, they have to include all the necessary pieces in order to work independently from other non-Homebrew software.

Then, in a Homebrew formula you can have two kinds of dependencies:

  1. software that is already shipped as a Homebrew formula, for which you can simply use something like depends_on "perl";
  2. software that is not shipped as a Homebrew formula, like language-specific dependencies (in this case, Perl modules).

In the second case, the official guidelines say that you have to pull these dependencies as resources and install them locally (not globally) as part of the "container" you are building. Take a look, for example, at this formula (which is in the official repo and it's not dissimilar to the formula I wrote). So, if you need e.g. Perl modules, you are not allowed to call cpanm from the installer and install the modules system-wide; you have to provide them locally only to LaTeXML. Of course, LaTeXML has lots of dependencies — and each of them has its dependencies; for example, HTTP::Cookies is a non-core dependency of LWP (which instead is a LaTeXML dependency) and so you have to install it if you want to install LWP.

That being said, I've recently tried to simplify the formula a lot (see https://github.com/Homebrew/homebrew-core/pull/24254) by temporarily using cpanm to fetch all the necessary Perl modules and install them in LaTeXML's "container", instead of manually taking care of all the sub-dependencies. The result is basically the same as before, but the formula is much more readable.

It turns out that the problem is not the readability or the complexity of the formula; they just don't like the fact that you have to pull many dependencies (despite the existence of formulas like the previous rex.rb that I linked). The only solution, according to them (see the comments in the last PR), is for you to ship LaTeXML tarballs with all the non-core modules included, which are built together with LaTeXML by a single Makefile; then the formula would just pull LaTeXML, make it, and that's done. That would mean, considering the sub-dependencies, that you would have to ship 40+ modules together with LaTeXML.

In addition, it appears that you cannot specify Homebrew dependencies with specific options. In this case, LaTeXML depends on ImageMagick; however, Homebrew does not ship ImageMagick with its own Perl module by default. One could force a recompilation of ImageMagick with its Perl module by using depends_on "imagemagick" => "with-perl" in the formula, but this does not comply with the guidelines. Again the solution, according to them, is to ship ImageMagick with LaTeXML itself and build it locally (just for LaTeXML) with the --with-perl option.

So at this point, unless you really want to ship LaTeXML with all the modules and ImageMagick itself, I would say that there are three possibilities for the "Installation Guide" wiki:

  1. Advise people against installing LaTeXML with Homebrew, since the package is basically broken (LaTeXML is unusable if installed with Homebrew in a clean MacOS box) and people at Homebrew don't want to fix it unless there are heavy changes to the tarballs themselves.
  2. Provide LaTeXML users with a Homebrew formula which is maintained here and not in the Homebrew-core repo. Then the installation would just be brew install https://address-of/latexml.rb. However, I'd advise against it because in this way it would not be possible to ship pre-compiled packages and (more importantly) provide updates, unless one sets up a specific Homebrew repository (called a tap).
  3. Fill the gaps and write proper instructions for Homebrew users. That would include:
    • Install (or reinstall) ImageMagick with PerlMagick, via brew install imagemagick --with-perl;
    • Install the required Perl modules via cpanm: brew install cpanm && cpanm Archive::Zip DB_File File::Which Getopt::Long Image::Size IO::String JSON::XS LWP MIME::Base64 Parse::RecDescent Pod::Parser Text::Unidecode Test::More URI XML::LibXML XML::LibXSLT UUID::Tiny
    • Install LaTeXML itself via brew install latexml. This would just pull a precompiled package of LaTeXML and install it, without caring about perl dependencies.

From my point of view, the third option is a hassle anyway since it requires different pre-installation steps; I would prefer just to pull the tarball and compile it myself. But it would have the benefit that, once the dependencies have been installed for the first time, all subsequent updates would be automatic.

So, @brucemiller @asmaier what do you think? 🤔💭 Waiting for your thoughts!


PS: Sorry for the philippic! 😅

brucemiller commented 6 years ago

Sorry, I missed your response. Thanks for the detailed explanation of Homebrew formulas. Homebrew's model (as I understand it) isn't actually much different from the other installation/repo systems I'm familiar with (rpm, debian, macports...), with the exception of the apparent lack of formulas for components. Are there really no pre-existing formulas for perl's Archive::Zip, DB_File, LWP, etc? I tried searching, but maybe I did it wrong. Those "common" modules, I'd think, really deserve separate formulas of their own. Recursively disentangling their dependencies and embedding the resulting mess in any formula that needs them utterly violates any sensible modularity and maintainability. The resource mechanism seems to be a really useful last resort, but it hardly seems the right approach for most use cases.

But I'm confused about how introducing Image::Magick somehow cascaded into having to explicitly import all the LWP dependencies. How did it work before? There are some comments in the other issues regarding which perl is used, but I didn't quite follow them. There's been a change since 0.8.2 in the way perl gets invoked (it now uses #!/usr/bin/env perl), so maybe that has an impact? Perhaps upcoming releases of LaTeXML will work as intended, and we should give up on mangling 0.8.2 to work around whichever limitations?

Not sure how to proceed for now, though. I would suppose it's possible to write a formula that uses the development version of LaTeXML, fetching a zip from github? That might answer some of these questions.

brucemiller commented 6 years ago

The best way I could think of to summarize this discussion was to put a link on the Wiki page to this discussion. Hope that helps somebody in the future.

nennigb commented 4 years ago

Hi,

I am working on a project (amc2moodle) that depends on latexml. Some users are using macOS but I never use it. In order to test the project, I would like to install latexml on the macOS Catalina 10.15.4 provided by github actions. Just using

brew install latexml

doesn't work. I have tried several approaches proposed here, but I'm still stuck with an XSLT problem:

--> Working on XML::LibXSLT
Fetching http://www.cpan.org/authors/id/S/SH/SHLOMIF/XML-LibXSLT-1.99.tar.gz ... OK
Configuring XML-LibXSLT-1.99 ... OK
! Installing XML::LibXSLT failed. See /Users/runner/.cpanm/work/1588080424.2291/build.log for details. Retry with --force to force install it.
Building and testing XML-LibXSLT-1.99 ... FAIL

obtained with

# install brew package
brew cask install basictex
brew install imagemagick
brew install libxml2   # fix XML install problem
brew install libxslt   # doesn't help
#export PATH=/usr/local/opt/libxslt/bin:$PATH # try  with and without
brew install cpanm
brew update
cpanm Archive::Zip DB_File File::Which Getopt::Long Image::Size IO::String JSON::XS LWP MIME::Base64 Parse::RecDescent Pod::Parser Text::Unidecode Test::More URI XML::LibXML XML::LibXSLT UUID::Tiny
brew install latexml --HEAD

Do you have any idea how to solve this problem? Thank you very much,

Benoit

matteosecli commented 4 years ago

Hi @nennigb, I use LaTeXML on Mojave but I don't currently have a Catalina box to test on, so I cannot help directly with that. However, I have similar issues on Mojave; I can only suggest trying the following things, based on the trials I did today on my machine.

  1. Take a look at the log file to try to identify the error.
  2. If the error is "minor", try to install XML::LibXSLT with the --force option, as suggested by the error message itself.
  3. A quick Google search suggests that there are indeed problems with XML::LibXSLT on Catalina, so it doesn't seem to be directly related to LaTeXML. I've found the following two resources (but there are many others describing the issue as well):

    If I've understood correctly the Japanese post, the problem is due to a reshuffle of system headers in Catalina (& Mojave as well); according to the post, if you've installed libxml2 via brew, it should be possible to fix the problem by installing XML::LibXSLT in this way:

    env PKG_CONFIG_PATH="/usr/local/opt/libxml2/lib/pkgconfig" cpanm XML::LibXSLT --force

To be honest, the solution in the Japanese post didn't work for me; so I've tried to manually build XML::LibXSLT until it (kinda) worked. I'll outline the steps for you here (or at least, what worked for me after many trials).

  1. First of all, update Xcode and/or the command line tools to the latest version available for your system, to be sure to avoid undefined symbols and strange stuff from Homebrew.
  2. Run brew doctor and make sure everything looks fine.
  3. Then install the stuff you need from Homebrew: brew install imagemagick --with-perl; brew install cpanm (I also use Homebrew's perl, but I guess it should work with the system version as well).
  4. At this point, we will install XML::LibXSLT. It seems that the header files are no longer symlinked to /usr/include/, but are instead kept in some obscure Xcode folder. Fortunately, there is a way to automatically get this path and all the necessary linking flags for the compiler via xslt-config; we also have to add an extra flag to link against libexslt, which for some reason is not added automatically (otherwise you get undefined symbols). These options are passed as additional arguments to the perl Makefile.PL run by cpanm, and can be provided directly to cpanm itself via the --configure-args option. In short, you can run the following command:
    cpanm --configure-args="INC=\"$(xslt-config --cflags)\" LIBS=\"$(xslt-config --libs) -lexslt\"" XML::LibXSLT --force

    A few caveats:

    • The versions of libxml and libxslt used to compile are the ones provided by xslt-config, which by default is the system one (in /usr/bin/).
    • I still have to use the --force option because some tests still fail and I didn't manage to fix them:
      Test Summary Report
      -------------------
      t/03input.t             (Wstat: 512 Tests: 11 Failed: 2)
        Failed tests:  8, 11
        Non-zero exit status: 2
        Parse errors: Bad plan.  You planned 28 tests but ran 11.
      t/14security.t          (Wstat: 512 Tests: 26 Failed: 2)
        Failed tests:  9, 22
        Non-zero exit status: 2
      Files=22, Tests=227,  2 wallclock secs ( 0.09 usr  0.04 sys +  2.10 cusr  0.35 csys =  2.58 CPU)
      Result: FAIL
      Failed 2/22 test programs. 4/227 subtests failed.

      So there is definitely room for improvement here, and LaTeXML might still not work properly.

    • You could override MacOS-provided libxml2 and libxslt with the ones provided by Homebrew. In this case, you first need to install them via brew install libxml2 libxslt and then use the commands (the first two lines are given by brew info libxml2 and brew info libxslt)
      export PATH="/usr/local/opt/libxml2/bin:$PATH"
      export PATH="/usr/local/opt/libxslt/bin:$PATH"
      cpanm --configure-args="INC=\"$(xml2-config --cflags) $(xslt-config --cflags)\" LIBS=\"$(xml2-config --libs) $(xslt-config --libs) -lexslt\"" XML::LibXSLT

      Notice that in this case I needed to add the xml2-config part, because the libxml2 flags were not automatically added by Homebrew's xslt-config, but I didn't need to use the --force flag because all the tests succeeded:

      All tests successful.
      Files=22, Tests=244,  3 wallclock secs ( 0.09 usr  0.05 sys +  2.04 cusr  0.36 csys =  2.54 CPU)
      Result: PASS

      I didn't test LaTeXML in either case, so I cannot say whether it works better with the system libraries or with the ones provided by Homebrew; if I manage to do some tests I'll report back, or if you are willing to run some tests and report back yourself, you're more than welcome. So, for now, it's up to you to choose either the system version (which still fails some tests, though) or Homebrew's version (with the caveat that you have additional dependencies and things to install in general).

  5. Install all the dependencies via cpanm:
    cpanm Archive::Zip DB_File File::Which Getopt::Long Image::Size IO::String JSON::XS LWP MIME::Base64 Parse::RecDescent Pod::Parser Text::Unidecode Test::More URI XML::LibXML XML::LibXSLT UUID::Tiny
  6. Install LaTeXML via Homebrew:
    brew install latexml

Now, hopefully, you have a working LaTeXML version! 😃
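
Since the quoting in the cpanm command of step 4 is easy to get wrong, here is a sketch that assembles the --configure-args value and prints the resulting command instead of running it. The helper name libxslt_configure_args is made up for illustration. Unlike the system-library variant above, this sketch always adds the xml2-config flags when that tool is found, which is an assumption on my side and should be harmless.

```shell
#!/bin/sh
# libxslt_configure_args CFLAGS LIBS  (hypothetical helper name)
# Print the value to pass to cpanm's --configure-args option,
# with -lexslt appended to the linker flags.
libxslt_configure_args() {
    printf 'INC="%s" LIBS="%s -lexslt"\n' "$1" "$2"
}

# Collect flags from whichever xslt-config / xml2-config come first on PATH.
if command -v xslt-config >/dev/null 2>&1; then
    cflags=$(xslt-config --cflags)
    libs=$(xslt-config --libs)
    if command -v xml2-config >/dev/null 2>&1; then
        cflags="$(xml2-config --cflags) $cflags"
        libs="$(xml2-config --libs) $libs"
    fi
    args=$(libxslt_configure_args "$cflags" "$libs")
    # Dry run: show the command so the flags can be inspected first.
    echo "cpanm --configure-args='$args' XML::LibXSLT"
else
    echo "xslt-config not found on PATH" >&2
fi
```

Printing the command first makes it easy to see whether the flags point at the system libraries or at Homebrew's ones before anything is built.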

matteosecli commented 4 years ago

I've just tested the head version of LaTeXML with XML::LibXSLT built on top of Homebrew's custom libraries. I've used

git clone https://github.com/brucemiller/LaTeXML.git && cd LaTeXML/
perl Makefile.PL 
make test

and I got:

t/002_unit_findfile.t ..... ok    
t/00_unittest.t ........... ok    
t/05_tokenize.t ........... ok     
t/10_expansion.t .......... ok     
t/12_grouping.t ........... ok   
t/170_grammar_coverage.t .. skipped: Only checked in continuous integration.
t/20_digestion.t .......... ok     
t/22_fonts.t .............. ok     
t/30_encoding.t ........... ok     
t/32_keyval.t ............. ok   
t/33_keyval_options.t ..... ok     
t/40_math.t ............... ok     
t/50_structure.t .......... ok     
t/52_namespace.t .......... ok   
t/53_alignment.t .......... ok     
t/55_theorem.t ............ ok   
t/56_ams.t ................ ok   
t/65_graphics.t ........... ok   
t/70_parse.t .............. ok     
t/80_complex.t ............ ok     
t/81_babel.t .............. ok   
t/82_moderncv.t ........... ok   
t/83_expl3.t .............. 1/3 # Skip: Minimal texlive 2018 requirement not met for t/expl3/tilde_tricks
t/83_expl3.t .............. 2/3 # Skip: Minimal texlive 2018 requirement not met for t/expl3/xparse
t/83_expl3.t .............. ok   
t/90_latexmlpost.t ........ ok   
t/91_latexmlc_api.t ....... ok   
t/92_profiles.t ........... ok     
t/931_epub.t .............. ok   
t/93_formats.t ............ ok     
t/94_runtimes.t ........... ok   
t/95_complex_config.t ..... ok   
t/96_fatal.t .............. ok   
t/97_manifest.t ........... skipped: Only checked in continuous integration.
All tests successful.
Files=32, Tests=408, 400 wallclock secs ( 0.15 usr  0.07 sys + 386.29 cusr  9.45 csys = 395.96 CPU)
Result: PASS

so I guess it's working correctly, at least with Homebrew's libraries.

I've also tried converting a random paper from arXiv to epub and it seems to be working fine.

nennigb commented 4 years ago

Hi, thank you for your help.

My workflow now contains your steps. The following code installs latexml on the GitHub Actions Catalina runner.

        # install brew package
        brew cask install basictex   # basictex doesn't contain tikz
        brew install libxml2
        brew install libxslt
        brew install imagemagick
        brew install cpanm

        # perl package
        export PATH="/usr/local/opt/libxml2/bin:$PATH"
        export PATH="/usr/local/opt/libxslt/bin:$PATH"
        cpanm --configure-args="INC=\"$(xml2-config --cflags) $(xslt-config --cflags)\" LIBS=\"$(xml2-config --libs) $(xslt-config --libs) -lexslt\"" XML::LibXSLT
        cpanm Archive::Zip DB_File File::Which Getopt::Long Image::Size IO::String JSON::XS LWP MIME::Base64 Parse::RecDescent Pod::Parser Text::Unidecode Test::More URI XML::LibXML UUID::Tiny # XML::LibXSLT
        brew install latexml 

To fully test my application, I still need tikz, but it's not your problem ;-) Thanks a lot,

Benoit

matteosecli commented 4 years ago

That's great, glad it worked! 😄

nennigb commented 4 years ago

Hi,

LaTeXML does not detect tikz (and keyval) and raises

Warning:missing_file:tikz Can't find package tikz
    at tikz.sty.ltxml; line 24
    Anticipate undefined macros or environments
    In Core::Gullet[@0x7fc4169eb0e0] /usr/local/Cellar/latexml/0.8.4/libexec/lib/perl5/LaTeXML/Package/tikz.sty.ltxml; line 24
 0.00 sec)

Before installing LaTeXML, I have run

brew cask install basictex   # basictex doesn't contain tikz  
export PATH="/usr/local/texlive/2020basic/bin/x86_64-darwin:$PATH"  
sudo tlmgr install pgf latex-graphics amsmath

I assume that I still have an environment variable problem. Any idea?
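
One way to narrow this down is to check whether the TeX installation itself can see the packages from the shell that runs LaTeXML. The sketch below assumes that LaTeXML locates installed TeX files through kpsewhich, so a missing kpsewhich on PATH would explain the warning; the helper name tex_file_visible is made up for illustration.

```shell
#!/bin/sh
# tex_file_visible NAME  (hypothetical helper name)
# Exit status 0: kpsewhich found NAME (its path is printed)
#             1: kpsewhich ran but did not find NAME
#             2: kpsewhich itself is not on PATH
tex_file_visible() {
    if ! command -v kpsewhich >/dev/null 2>&1; then
        echo "kpsewhich is not on PATH" >&2
        return 2
    fi
    kpsewhich "$1"
}

for f in tikz.sty keyval.sty; do
    tex_file_visible "$f" || echo "$f: not visible (status $?)"
done
```

Status 2 points at PATH, while status 1 suggests that the package is not installed in the TeX tree.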

Thanks a lot, Benoit

nennigb commented 4 years ago

After further investigation, the problem came from my misunderstanding of how environment variables (like PATH here) behave in GitHub Actions. These variables are not preserved between steps (each step runs in its own subprocess). Once that was fixed, everything seems to work properly. An example workflow for Ubuntu and macOS can be found here. Thanks for your help, Benoit
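
For readers who hit the same issue: one way to keep a PATH entry across steps is the GITHUB_PATH file that GitHub Actions provides. This is a sketch and not necessarily how the linked workflow does it; the helper name persist_path is made up, and the TeX Live directory is the one from the earlier comment.

```shell
#!/bin/sh
# persist_path DIR  (hypothetical helper name)
# Prepend DIR to PATH for the current step and, when running under
# GitHub Actions (GITHUB_PATH is set), record it for the following steps.
persist_path() {
    PATH="$1:$PATH"
    export PATH
    if [ -n "${GITHUB_PATH:-}" ]; then
        printf '%s\n' "$1" >> "$GITHUB_PATH"
    fi
}

persist_path /usr/local/texlive/2020basic/bin/x86_64-darwin
```

Entries written to GITHUB_PATH only take effect in later steps, which is why the function also exports PATH for the step it runs in. Outside GitHub Actions it just behaves like a plain export.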