Closed eliselavy closed 6 months ago
Until now it has not been possible to use the PDF generation proposed by:
because the notebooks rely on cite2c.
Idea: use citeproc-py to transform the citations in the notebook, then run nbconvert.
Based on the method used for the De Gruyter pdf generation: https://github.com/C2DH/journal-of-digital-history-backend/blob/develop/jdhseo/utils.py#L53
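The substitution step of this idea can be sketched as follows. This is only an illustration, not the code in utils.py: it assumes cite2c citations appear in the Markdown as `<cite data-cite="KEY">...</cite>` tags, and it takes a hypothetical `rendered` mapping of already formatted citation strings (in the real flow these would come from citeproc-py).

```python
import re

# Assumption: cite2c marks citations as <cite data-cite="KEY">...</cite>.
CITE_TAG = re.compile(r'<cite\s+data-cite="([^"]+)"\s*>.*?</cite>', re.DOTALL)


def replace_cite_tags(markdown, rendered):
    """Replace each cite tag with its pre-rendered citation text.

    `rendered` is a hypothetical dict mapping citation keys to formatted
    strings. Tags whose key is missing from it are left untouched.
    """
    def _substitute(match):
        return rendered.get(match.group(1), match.group(0))

    return CITE_TAG.sub(_substitute, markdown)


if __name__ == "__main__":
    cell = 'As shown in <cite data-cite="u/K1"></cite>, the method works.'
    print(replace_cite_tags(cell, {"u/K1": "(Doe 2020)"}))
```

Leaving unknown keys untouched keeps the notebook readable when a reference is missing from the bibliography, instead of silently dropping the citation.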
The notebook with the cite2c citations rendered as Markdown is now generated by a Celery task. The PDF generation still needs to be integrated:
jupyter nbconvert --to pdf MyNotebook.ipynb --TagRemovePreprocessor.remove_input_tags remove_input
With cells tagged remove_input, no input cell is rendered:
The hermeneutics paragraph still needs to be made visible.
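The tagging used by the nbconvert command above could be applied programmatically before conversion. A minimal sketch, assuming the notebook is loaded as plain JSON (nbformat 4 layout) and that every code cell should have its input hidden; the function name is hypothetical.

```python
import json


def tag_code_cells(notebook, tag="remove_input"):
    """Add `tag` to the metadata of every code cell (in place)."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        tags = cell.setdefault("metadata", {}).setdefault("tags", [])
        if tag not in tags:
            tags.append(tag)
    return notebook


if __name__ == "__main__":
    demo = {
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": ["## Introduction"]},
            {"cell_type": "code", "metadata": {}, "source": ["print(1)"], "outputs": []},
        ]
    }
    print(json.dumps(tag_code_cells(demo), indent=1))
```

Markdown cells are left untagged on purpose, so paragraphs such as the hermeneutics one are not affected by this step.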
Deployment problem in the development environment:
celery_1 | logger.error("Command output:\n", e.output)
celery_1 | [2023-12-05 10:18:22,888: WARNING/ForkPoolWorker-2] Message: 'Command output:\n'
celery_1 | Arguments: ('[NbConvertApp] Converting notebook notebook_with_ref.ipynb to pdf\n[NbConvertApp] ERROR | Error while converting \'notebook_with_ref.ipynb\'\nTraceback (most recent call last):\n File "/usr/local/lib/python3.8/site-packages/nbconvert/nbconvertapp.py", line 435, in export_single_notebook\n output, resources = self.exporter.from_filename(notebook_filename, resources=resources)\n File "/usr/local/lib/python3.8/site-packages/nbconvert/exporters/exporter.py", line 190, in from_filename\n return self.from_file(f, resources=resources, **kw)\n File "/usr/local/lib/python3.8/site-packages/nbconvert/exporters/exporter.py", line 208, in from_file\n return self.from_notebook_node(nbformat.read(file_stream, as_version=4), resources=resources, **kw)\n File "/usr/local/lib/python3.8/site-packages/nbconvert/exporters/pdf.py", line 168, in from_notebook_node\n latex, resources = super().from_notebook_node(\n File "/usr/local/lib/python3.8/site-packages/nbconvert/exporters/latex.py", line 72, in from_notebook_node\n return super().from_notebook_node(nb, resources, **kw)\n File "/usr/local/lib/python3.8/site-packages/nbconvert/exporters/templateexporter.py", line 392, in from_notebook_node\n output = self.template.render(nb=nb_copy, resources=resources)\n File "/usr/local/lib/python3.8/site-packages/jinja2/environment.py", line 1291, in render\n self.environment.handle_exception()\n File "/usr/local/lib/python3.8/site-packages/jinja2/environment.py", line 925, in handle_exception\n raise rewrite_traceback_stack(source=source)\n File "/usr/local/share/jupyter/nbconvert/templates/latex/index.tex.j2", line 8, in top-level template code\n ((* extends cell_style *))\n File "/usr/local/share/jupyter/nbconvert/templates/latex/style_jupyter.tex.j2", line 176, in top-level template code\n \\prompt{(((prompt)))}{(((prompt_color)))}{(((execution_count)))}{(((extra_space)))}\n File "/usr/local/share/jupyter/nbconvert/templates/latex/base.tex.j2", line 7, in top-level 
template code\n ((*- extends \'document_contents.tex.j2\' -*))\n File "/usr/local/share/jupyter/nbconvert/templates/latex/document_contents.tex.j2", line 51, in top-level template code\n ((*- block figure scoped -*))\n File "/usr/local/share/jupyter/nbconvert/templates/latex/display_priority.j2", line 5, in top-level template code\n ((*- extends \'null.j2\' -*))\n File "/usr/local/share/jupyter/nbconvert/templates/latex/null.j2", line 30, in top-level template code\n ((*- block body -*))\n File "/usr/local/share/jupyter/nbconvert/templates/latex/base.tex.j2", line 215, in block \'body\'\n ((( super() )))\n File "/usr/local/share/jupyter/nbconvert/templates/latex/null.j2", line 32, in block \'body\'\n ((*- block any_cell scoped -*))\n File "/usr/local/share/jupyter/nbconvert/templates/latex/null.j2", line 85, in block \'any_cell\'\n ((*- block markdowncell scoped-*)) ((*- endblock markdowncell -*))\n File "/usr/local/share/jupyter/nbconvert/templates/latex/document_contents.tex.j2", line 68, in block \'markdowncell\'\n ((( cell.source | citation2latex | strip_files_prefix | convert_pandoc(\'markdown+tex_math_double_backslash\', \'json\',extra_args=[]) | resolve_references | convert_pandoc(\'json\',\'latex\'))))\n File "/usr/local/lib/python3.8/site-packages/nbconvert/filters/pandoc.py", line 24, in convert_pandoc\n return pandoc(source, from_format, to_format, extra_args=extra_args)\n File "/usr/local/lib/python3.8/site-packages/nbconvert/utils/pandoc.py", line 52, in pandoc\n check_pandoc_version()\n File "/usr/local/lib/python3.8/site-packages/nbconvert/utils/pandoc.py", line 100, in check_pandoc_version\n v = get_pandoc_version()\n File "/usr/local/lib/python3.8/site-packages/nbconvert/utils/pandoc.py", line 77, in get_pandoc_version\n raise PandocMissing()\nnbconvert.utils.pandoc.PandocMissing: Pandoc wasn\'t found.\nPlease check that pandoc is installed:\nhttps://pandoc.org/installing.html\n',)
Problem installing pandoc: https://github.com/C2DH/journal-of-digital-history-backend/actions/runs/7099856408/job/19324737062
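To fail early with a clearer message than the PandocMissing traceback above, the task could check for pandoc before calling nbconvert. A minimal sketch: the function name is hypothetical, the command line reuses the flags from this issue as-is, and the `which` parameter exists only so the lookup can be replaced in tests.

```python
import shutil


def nbconvert_pdf_command(notebook_path, which=shutil.which):
    """Build the nbconvert command, refusing to continue without pandoc."""
    if which("pandoc") is None:
        raise RuntimeError(
            "pandoc not found on PATH; see https://pandoc.org/installing.html"
        )
    return [
        "jupyter", "nbconvert", "--to", "pdf", notebook_path,
        "--TagRemovePreprocessor.remove_input_tags", "remove_input",
    ]


if __name__ == "__main__":
    try:
        print(nbconvert_pdf_command("notebook_with_ref.ipynb"))
    except RuntimeError as error:
        print(error)
```

The returned list can be passed to `subprocess.run(..., check=True)` inside the Celery task.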
In the notebook:
But in the pdf:
Same look and feel:
Works:
{
"cell_type": "markdown",
"metadata": {},
"source": [
"hermeneutics\n",
"\n",
"## Introduction\n",
"\n",
"end hermeneutics"
]
},
Doesn't work:
"source": [
"hermeneutics remove line\n",
"## Introduction\n",
"end hermeneutics remove line"
]
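One possible reading of the two cases above is that pandoc's Markdown reader only recognises `## Introduction` as a heading when a blank line precedes it, which is its default behaviour. Under that assumption, a cell's source could be normalised before conversion. The sketch below uses a hypothetical helper name and does not special-case fenced code blocks, where a line starting with `#` would be padded too.

```python
import re

# ATX heading: one to six '#' characters followed by whitespace.
ATX_HEADING = re.compile(r"^#{1,6}\s")


def pad_headings(source_lines):
    """Return the cell source as one string with blank lines around headings."""
    lines = "".join(source_lines).split("\n")
    padded = []
    for index, line in enumerate(lines):
        if ATX_HEADING.match(line):
            # Blank line before the heading, unless one is already there.
            if padded and padded[-1].strip():
                padded.append("")
            padded.append(line)
            # Blank line after the heading, unless one is already there.
            if index + 1 < len(lines) and lines[index + 1].strip():
                padded.append("")
        else:
            padded.append(line)
    return "\n".join(padded)


if __name__ == "__main__":
    broken = [
        "hermeneutics remove line\n",
        "## Introduction\n",
        "end hermeneutics remove line",
    ]
    print(pad_headings(broken))
```

A source that already has the blank lines, like the working cell above, comes back unchanged.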
Problem: the LaTeX export uses the text/plain output, so the styled DataFrame renders as: <pandas.io.formats.style.Styler at 0x11ac2e150>
{
"data": {
"text/html": [
"<style type=\"text/css\" >\n",
"</style><table id=\"T_2b14b_\" ><caption>table 1: Some figures and their mentions in the Capuchin Annual between 1930 and 1965</caption><thead> <tr> <th class=\"col_heading level0 col0\" >HenryVIII</th> <th class=\"col_heading level0 col1\" >Victoria</th> <th class=\"col_heading level0 col2\" >WilliamOrange</th> <th class=\"col_heading level0 col3\" >FatherMathew</th> <th class=\"col_heading level0 col4\" >Parnell</th> <th class=\"col_heading level0 col5\" >WolfeTone</th> <th class=\"col_heading level0 col6\" >ElizabethI</th> <th class=\"col_heading level0 col7\" >Cromwell</th> </tr></thead><tbody>\n",
" <tr>\n",
" <td id=\"T_2b14b_row0_col0\" class=\"data row0 col0\" >19</td>\n",
" <td id=\"T_2b14b_row0_col1\" class=\"data row0 col1\" >20</td>\n",
" <td id=\"T_2b14b_row0_col2\" class=\"data row0 col2\" >25</td>\n",
" <td id=\"T_2b14b_row0_col3\" class=\"data row0 col3\" >37</td>\n",
" <td id=\"T_2b14b_row0_col4\" class=\"data row0 col4\" >38</td>\n",
" <td id=\"T_2b14b_row0_col5\" class=\"data row0 col5\" >45</td>\n",
" <td id=\"T_2b14b_row0_col6\" class=\"data row0 col6\" >49</td>\n",
" <td id=\"T_2b14b_row0_col7\" class=\"data row0 col7\" >67</td>\n",
" </tr>\n",
" </tbody></table>"
],
"text/plain": [
"<pandas.io.formats.style.Styler at 0x11ac2e150>"
]
},
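One way around this could be to add a text/latex representation to such outputs before running nbconvert, built from the HTML table that is already stored. The sketch below uses only the standard library and hypothetical helper names. It assumes the LaTeX exporter picks text/latex ahead of text/plain when both are present, which should be verified against the installed nbconvert. The caption is dropped and only a few special characters are escaped.

```python
from html.parser import HTMLParser

LATEX_SPECIALS = {"&": r"\&", "%": r"\%", "_": r"\_", "#": r"\#", "$": r"\$"}


def escape_latex(text):
    """Escape a small set of LaTeX special characters (not exhaustive)."""
    return "".join(LATEX_SPECIALS.get(char, char) for char in text)


class TableExtractor(HTMLParser):
    """Collect the text of every <th>/<td>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_latex(html):
    """Convert the rows of an HTML table into a plain tabular environment."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return ""
    columns = max(len(row) for row in parser.rows)
    lines = ["\\begin{tabular}{" + "l" * columns + "}"]
    for row in parser.rows:
        lines.append(" & ".join(escape_latex(cell) for cell in row) + " \\\\")
    lines.append("\\end{tabular}")
    return "\n".join(lines)


def add_latex_fallback(output):
    """Add text/latex to an output whose text/plain is only a Styler repr."""
    data = output.get("data", {})
    html = "".join(data.get("text/html", []))
    plain = "".join(data.get("text/plain", []))
    if "<table" in html and plain.startswith("<pandas.io.formats.style.Styler"):
        latex = html_table_to_latex(html)
        if latex:
            data["text/latex"] = latex
    return output
```

Calling `add_latex_fallback` on each entry of a code cell's `outputs` list leaves every other output untouched.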
Chinese characters are not displayed for this article: http://10.240.4.179/en/article/fwpktfFtn5jm
To support Chinese characters:
The ctex package needs to be installed: tlmgr install ctex
tlmgr: Local TeX Live (2020) is older than remote repository (2023)
pdflatex --version (jdh)
pdfTeX 3.14159265-2.6-1.40.21 (TeX Live 2020)
Chinese characters are fine in LaTeX, but the nbconvert template doesn't support them: notebook -(nbconvert)-> LaTeX -(XeLaTeX)-> PDF
Include the font and the package in base.tex.j2.
Templates are defined here: /Users/elisabeth.guerard/.pyenv/versions/anaconda3-2020.02/share/jupyter/nbconvert/templates/latex
\usepackage{ctex}
Not sure whether jupyter_nbconvert_config.py also needs to be changed:
## Shell command used to compile latex.
# Default: ['xelatex', '{filename}', '-quiet']
c.PDFExporter.latex_command = ['/usr/local/texlive/2023/bin/universal-darwin/xelatex', '{filename}', '-quiet']
Workaround for the moment:
Generate the .ipynb with the citations inside via the Celery task
Use the (base) environment, where pandoc is installed
Generate the LaTeX via nbconvert
jupyter nbconvert --to latex notebook_with_ref.ipynb
In the LaTeX file, add the following package:
\usepackage{ctex}
Outside of VSCode
xelatex notebook_with_ref.tex
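The manual edit of the .tex file in this workaround could be scripted. A minimal sketch, assuming the generated file contains a single-line `\documentclass` declaration; the helper name is hypothetical.

```python
import re


def add_ctex(tex_source):
    """Insert \\usepackage{ctex} right after the \\documentclass line."""
    if "\\usepackage{ctex}" in tex_source:
        return tex_source  # already patched, nothing to do
    match = re.search(r"^[ \t]*\\documentclass.*$", tex_source, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no \\documentclass line found")
    end = match.end()
    return tex_source[:end] + "\n\\usepackage{ctex}" + tex_source[end:]


if __name__ == "__main__":
    sample = "\\documentclass[11pt]{article}\n\\begin{document}\n中文\n\\end{document}\n"
    print(add_ctex(sample))
```

The patched source can then be written back to notebook_with_ref.tex and compiled with xelatex as above. Running the function twice does not add the package a second time.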
About the installation of pandoc:
Try to use the pandoc Docker image: https://github.com/pandoc/dockerfiles
Problem with psycopg2-binary==2.8.6:
49.14 Failed to build lxml psycopg2-binary
49.14 ERROR: Could not build wheels for lxml, psycopg2-binary, which is required to install pyproject.toml-based projects
------
See here: https://github.com/C2DH/journal-of-digital-history-backend/actions/runs/7103988496/job/19337950336
First try not to use this binary package: https://www.psycopg.org/docs/install.html
For the moment, generate the PDF on demand while the integration of the pandoc image is being worked on.