PreTeXtBook / pretext

PreTeXt: an authoring and publishing system for scholarly documents
https://pretextbook.org

Replace pdfcrop dependency with pip-installable pdf-crop-margins (works in Windows) #1329

Closed StevenClontz closed 3 years ago

StevenClontz commented 3 years ago

Closes #1327. End users will need to pip install pdfCropMargins, unless they have version >=0.1rc0 of PreTeXt-CLI (since I'll be adding it as a dependency shortly).

rbeezer commented 3 years ago

Code looks good. I'd like to do one test before we proceed with this.

Have @mitchkeller send a pre-cropped PDF of his gnarliest "overpic" PDF, perhaps with additions outside the normal margins, and see how it behaves when cropped? I think this was a big part of the motivation on #328.

StevenClontz commented 3 years ago

We should also do one test on Windows -- I know the subject says "works in Windows" but that's referring to using the pdf-crop-margins CLI directly; I only tested python pretext.py on Linux (CoCalc) with this change.

sean-fitzpatrick commented 3 years ago

I can run a test on Windows, but maybe not until Monday

mitchkeller commented 3 years ago

It's unlikely I can touch this until Tuesday afternoon. It would be helpful to know more about what exactly is wanted: just the block of PTX code and the associated image that overpaid is working on, or a pointer into my code, or what? If someone wants to explore before then, they can go to my applied-combinatorics repo, grep the source for overpic, and see what results with the new script. The "correct" version of that image will be in the posted version of my book.

rbeezer commented 3 years ago

Tried this with my installation, after adding pdfCropMargins to a virtual environment left over from previous testing of the CLI, and using pretext/pretext. No cropping apparent.

I tested various things independently at the command line, and I think an in-place operation does not succeed. I had success with a different output filename, but when the input and output filenames were identical, I got:

Error in pdfCropMargins: The input file is the same as the output file.

Also tested, but inconclusive: the list of "words" in a command line for subprocess.call() can be a bit touchy. Spaces get handled strangely. It could need to be something like this "no-space" version:

pcm_cmd = [pcm_executable, latex_image_pdf, "-o", latex_image_pdf, "-p","0"]

I'm away for the day, or I'd do a bit more digging, but wanted to report this before heading out.

(I really need to put some work into logging/capturing output like errors and warnings....)
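
On capturing output, here is a minimal sketch, assuming the helper program is started from Python with a list of words. The function name run_and_log and the logger name are hypothetical and not part of pretext.py. It folds stderr into stdout and logs the text rather than discarding it, so a message like the "input file is the same as the output file" error would be visible.

```python
import logging
import subprocess
import sys

# Hypothetical logger name; pretext.py may organize its logging differently.
log = logging.getLogger("crop-demo")


def run_and_log(cmd):
    """Run cmd (a list of words), log its output, return (exit code, output)."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the captured stdout
        universal_newlines=True,   # decode the captured bytes to str
    )
    if result.returncode != 0:
        log.error("%s exited with %d:\n%s", cmd[0], result.returncode, result.stdout)
    elif result.stdout:
        log.debug("%s output:\n%s", cmd[0], result.stdout)
    return result.returncode, result.stdout


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    # Stand-in command so the sketch runs without pdf-crop-margins installed.
    print(run_and_log([sys.executable, "-c", "print('cropped')"]))
```

One caveat: a tool that prints an error but still exits with status 0 would only show up at the debug level here, so the log level matters when chasing a silent failure.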

rbeezer commented 3 years ago

overpaid is working on

I know things are bad, but ... ;-)

A PDF that makes non-trivial use of overpic, and especially if the additions force the margins to be farther out would be very helpful. Perhaps you can just add one extreme PDF here? (Rather than us guessing what might be the best test case?)

mitchkeller commented 3 years ago

I can find something tomorrow afternoon, but I've got a slate of meetings I'm dealing with before then. It's going to take me some work to figure out which overpic images (Didn't notice my Mac autocorrecting that in the last message :).) were problematic.

sean-fitzpatrick commented 3 years ago

Running pretext.py against APEX using this branch, on Windows. I'll let you know what happens. Still one minor Windows annoyance I haven't figured out: if you try to use Git Bash, running the pretext script gets you: /usr/bin/env: 'python3': No such file or directory

This is annoying for two reasons:

  1. In the .bashrc for Git Bash I have an alias pointing python3 to python, and doing python3 --version returns 3.7.1. So this has to be fixed somewhere other than Git Bash. (I'll look into this.)
  2. In Anaconda you have to type out the full command every time, with absolute paths. (And those are absolute Windows paths, which are abominable.)

sean-fitzpatrick commented 3 years ago

OK. I've made several attempts, and the script seems to be failing when it calls pdfCropMargins. Output:

(base) C:\Users\sean.fitzpatrick>pip install --user pdfCropMargins --upgrade
Collecting pdfCropMargins
Collecting PyPDF2 (from pdfCropMargins)
Requirement already satisfied, skipping upgrade: wheel in c:\programdata\anaconda3\lib\site-packages (from pdfCropMargins) (0.33.1)
Collecting pillow>=6.2.2 (from pdfCropMargins)
  Using cached https://files.pythonhosted.org/packages/36/fd/f83806d04175c0a58332578143ee7a9c5702e6e0f134e157684c737ae55b/Pillow-7.2.0-cp37-cp37m-win_amd64.whl
Installing collected packages: PyPDF2, pillow, pdfCropMargins
  The script pdf-crop-margins.exe is installed in 'C:\Users\sean.fitzpatrick\AppData\Roaming\Python\Python37\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed PyPDF2-1.26.0 pdfCropMargins-0.2.8 pillow-7.2.0

(base) C:\Users\sean.fitzpatrick>python C:\PreTeXt\mathbook\pretext\pretext -vv -c latex-image -f svg -d C:\Users\sean.fitzpatrick\Documents\GitHub\APEXCalculusPTX\output\images C:\Users\sean.fitzpatrick\Documents\GitHub\APEXCalculusPTX\ptx\index.ptx
PTX:DEBUG: Parsed CLI args {'verbose': 2, 'component': 'latex-image', 'format': 'svg', 'publisher_file': None, 'stringparams': [], 'xmlid': '', 'server': None, 'data_dir': None, 'out': None, 'dir': 'C:\\Users\\sean.fitzpatrick\\Documents\\GitHub\\APEXCalculusPTX\\output\\images', 'abort': False, 'xml_file': 'C:\\Users\\sean.fitzpatrick\\Documents\\GitHub\\APEXCalculusPTX\\ptx\\index.ptx'}
PTX:DEBUG: Python version: 3.7 (expecting 3.4 or newer)
PTX:DEBUG: discovered distribution and xsl directories: C:\PreTeXt\mathbook, C:\PreTeXt\mathbook\xsl
PTX:DEBUG: executables in configuration file: {'xslt': 'xsltproc', 'tex': 'xelatex', 'pdfsvg': 'pdf2svg', 'asy': 'asy', 'sage': 'sage', 'pdfpng': 'c:/PreTeXt/ImageMagick-7.0.8-Q16/convert.exe', 'pdfeps': 'pdftops', 'pcm': 'pdf-crop-margins', 'pageres': 'pageres', 'mjpage': '/home/user-name-here/node_modules/mathjax-node-page/bin/mjpage', 'liblouis': 'file2brl', 'pdfcrop': 'pdfcrop'}
PTX: verifying and expanding input directory: C:\Users\sean.fitzpatrick\Documents\GitHub\APEXCalculusPTX\output\images
PTX: input directory expanded to absolute path: C:\Users\sean.fitzpatrick\Documents\GitHub\APEXCalculusPTX\output\images
PTX: verifying and expanding input file: C:\Users\sean.fitzpatrick\Documents\GitHub\APEXCalculusPTX\ptx\index.ptx
PTX: input file expanded to absolute path: C:\Users\sean.fitzpatrick\Documents\GitHub\APEXCalculusPTX\ptx\index.ptx
PTX: Done examining environment and initializing setup info
PTX: converting latex-image pictures from C:\Users\sean.fitzpatrick\Documents\GitHub\APEXCalculusPTX\ptx\index.ptx to svg graphics for placement in C:\Users\sean.fitzpatrick\Documents\GitHub\APEXCalculusPTX\output\images
PTX: string parameters passed to extraction stylesheet: {}
PTX:DEBUG: temporary directory for latex-image conversion: C:\Users\SEAN~1.FIT\AppData\Local\Temp\tmpki2d3wow
PTX: extracting latex-image pictures from C:\Users\sean.fitzpatrick\Documents\GitHub\APEXCalculusPTX\ptx\index.ptx
PTX: XSL conversion of C:\Users\sean.fitzpatrick\Documents\GitHub\APEXCalculusPTX\ptx\index.ptx by C:\PreTeXt\mathbook\xsl\extract-latex-image.xsl
PTX:DEBUG: XSL conversion via C:\PreTeXt\mathbook\xsl\extract-latex-image.xsl of C:\Users\sean.fitzpatrick\Documents\GitHub\APEXCalculusPTX\ptx\index.ptx to None and/or into directory C:\Users\SEAN~1.FIT\AppData\Local\Temp\tmpki2d3wow with parameters {}
PTX:DEBUG: locating "tex" in [executables] section of configuration file
PTX:WARNING: executable existence-checking was not performed (e.g. on Windows)
PTX:DEBUG: tex executable: xelatex
PTX:DEBUG: tex executable: xelatex
PTX: converting cd-gen-chain-derivative.tex to cd-gen-chain-derivative.pdf
PTX:DEBUG: locating "pcm" in [executables] section of configuration file
PTX:WARNING: executable existence-checking was not performed (e.g. on Windows)
PTX:DEBUG: pcm executable: pdf-crop-margins
PTX:DEBUG: pdf-crop-margins executable: pdf-crop-margins
PTX: cropping cd-gen-chain-derivative.pdf to cd-gen-chain-derivative.pdf
Traceback (most recent call last):
  File "C:\PreTeXt\mathbook\pretext\pretext", line 290, in <module>
    main()
  File "C:\PreTeXt\mathbook\pretext\pretext", line 231, in main
    ptx.latex_image_conversion(xml_source, stringparams, args.xmlid, data_dir, dest_dir, 'svg')
  File "C:\PreTeXt\mathbook\pretext\pretext.py", line 306, in latex_image_conversion
    subprocess.call(pcm_cmd, stdout=devnull, stderr=subprocess.STDOUT)
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 323, in call
    with Popen(*popenargs, **kwargs) as p:
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 775, in __init__
    restore_signals, start_new_session)
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 1178, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

(base) C:\Users\sean.fitzpatrick>

Notes: Ghostscript is installed, and the path is set in Windows system environment variables. I also added the path to pdfCropMargins as indicated with the pip install.

Checking the temp folder tells me that the .tex file was created, and successfully compiled to pdf. The resulting pdf is not cropped, so it is dying at that step.
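
A FileNotFoundError like the one in the traceback is what subprocess raises when the named program cannot be found. A minimal sketch of checking up front, using the standard library's shutil.which, follows. The function name locate_executable and the wording of the message are hypothetical and not taken from pretext.py.

```python
import shutil


def locate_executable(name):
    """Return the full path for name, or raise with a hint if it is not found."""
    # shutil.which searches PATH (and honors PATHEXT on Windows, so
    # "pdf-crop-margins" can match "pdf-crop-margins.exe").
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(
            "cannot find '{}' on PATH; add its directory to PATH "
            "or give a full path in the configuration file".format(name)
        )
    return path
```

Calling locate_executable("pdf-crop-margins") before building the command would turn the bare WinError 2 into a message that names the missing program.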

sean-fitzpatrick commented 3 years ago

Update: rebooting the computer (of course, it's Windows) is apparently required for the pip install to take effect. So the script now runs without errors.

BUT: it still doesn't crop anything. :-(

StevenClontz commented 3 years ago

The lack of actual cropping is on me -- it didn't return any errors, but I was a dummy and didn't bother to check to make sure things actually got cropped.

My guess is that this suggestion by @rbeezer will fix things:

It could need to be something like this "no-space" version:

sean-fitzpatrick commented 3 years ago

I tried Rob's version of the command but got the same results.

StevenClontz commented 3 years ago

I think 530a0b1706f08d02e67beb4ec6a9f9a2c0fe51f8 works now, but rather than doing the conversion twice, the LaTeX should build to something like tmp-{filename} so pdf-crop-margins can convert it to the correct filename without overwriting itself.
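
The tmp-{filename} idea could be sketched roughly as below. This is an illustration under stated assumptions, not the code in the commit: the function names are hypothetical, and it assumes LaTeX has already written the uncropped file as tmp-<name> in the current directory. Only the -o and -p options mentioned in this thread are used.

```python
import os
import subprocess


def build_crop_command(executable, src_pdf, dest_pdf):
    """Return the list of words for cropping src_pdf into dest_pdf."""
    # pdfCropMargins refuses to run when input and output are the same file,
    # so fail early with a clearer message.
    if os.path.abspath(src_pdf) == os.path.abspath(dest_pdf):
        raise ValueError("input and output PDF must be different files")
    # Each word is its own list item; "-p", "0" asks for a 0% margin.
    return [executable, src_pdf, "-o", dest_pdf, "-p", "0"]


def crop_latex_image(executable, pdf_name):
    """Crop tmp-<pdf_name> (assumed already built) into <pdf_name>."""
    cmd = build_crop_command(executable, "tmp-" + pdf_name, pdf_name)
    return subprocess.call(cmd)
```

For example, build_crop_command("pdf-crop-margins", "tmp-image.pdf", "image.pdf") gives a command whose output name differs from its input, which avoids converting twice.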

sean-fitzpatrick commented 3 years ago

It's working now for me. At least, for the first 3 images that came off the assembly line. I'm not going to verify all the images because Windows is apparently incapable of providing a preview for SVGs and I'm not keen on manually opening each one in Inkscape. (One reason among many that today is the first day I've bothered to turn on my Windows computer since May, when I was testing something for Rob.)

rbeezer commented 3 years ago

Test w/ an image with overpic from Mitch Keller. Note "natural gas" in cropped version.

sample.pdf sample-crop.pdf

rbeezer commented 3 years ago

-p 0 -a -1 does a 0% margin and adds a single "big point" (a negative decrease). Seems to fix the descender on the "g".

sample-crop-1-point.pdf

rbeezer commented 3 years ago

This worked great for me in my shiny new virtual environment. On the other hand, Ubuntu 18.04's pip implementation is a bit brain dead. Short version, and for the record, I needed to set the pretext.cfg file with

pdfcrop = /home/rob/.local/bin/pdf-crop-margins

I only found the executable with an application of find.

Merging just now.

StevenClontz commented 3 years ago

We have folks add their pip commands to PATH in the CLI documentation to be released (includes Windows instructions):

https://github.com/PreTeXtBook/pretext-cli/blob/44b00814298ce68764c2a53579f9efc42b2ede96/source/main.ptx#L83

rbeezer commented 3 years ago

Combined the commits and edited the commit message. Otherwise as-is. Great work. Thanks to those who contributed to this one.

rbeezer commented 3 years ago

add their pip commands to PATH

Right. ;-) I'm conservative about this and tend not to mess with whatever my system thinks is best. So, as a consequence, I need to do things like put a full path in my pretext.cfg.

StevenClontz commented 3 years ago

For anyone else avoiding modifying their PATH, the magic

echo `python -m site --user-base`/bin/pdf-crop-margins

is useful for figuring out what to put in pretext.cfg.
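
The same lookup can be done from Python. This is a small sketch, assuming the POSIX layout where user-installed scripts land in a bin directory under the user base. The function name user_script_path is hypothetical. On Windows the scripts directory is laid out differently (see the pip warning earlier in this thread), so the result would need adjusting there.

```python
import os
import site


def user_script_path(script_name="pdf-crop-margins"):
    """Return the expected path of a pip --user script (POSIX layout assumed)."""
    # site.getuserbase() is the value printed by `python -m site --user-base`.
    return os.path.join(site.getuserbase(), "bin", script_name)


if __name__ == "__main__":
    print(user_script_path())
```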