shyamd / mkdocs-bibtex

A MkDocs plugin for citation management using bibtex

Escape problem with python 3.10 and material theme (insiders version) #234

Closed reyman closed 6 months ago

reyman commented 6 months ago

Hi,

I'm trying to use mkdocs-bibtex with the latest Insiders version of the "material" theme.

My mkdocs.yml

site_name: Les cahiers d'IDEES
theme:
  name: material
  logo: assets/logo.png
  language: fr
  palette:
    primary: white
    accent: blue

markdown_extensions:
  - footnotes

plugins:
  - blog
  - tags
  - meta
  - git-revision-date-localized
  - search
  - bibtex:
      bib_file: "docs/bib/notebooks.bib"
      cite_style: "pandoc"
      bib_by_default: true

I get an escape error during the build:

$ mkdocs build -v -f mkdocs-ci.yml
DEBUG   -  Loading configuration file: <_io.BufferedReader name='mkdocs-ci.yml'>
DEBUG   -  Loaded theme configuration for 'material' from '/usr/local/lib/python3.10/site-packages/material/templates/mkdocs_theme.yml': {'language': 'en', 'direction': None, 'features': [], 'font': {'text': 'Roboto', 'code': 'Roboto Mono'}, 'icon': None, 'favicon': 'assets/images/favicon.png', 'static_templates': ['404.html']}
INFO    -  DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  File "/usr/local/lib/python3.10/site-packages/pybtex/plugin/__init__.py", line 26, in <module>
    import pkg_resources
  File "/usr/local/lib/python3.10/site-packages/pkg_resources/__init__.py", line 102, in <module>
    warnings.warn(
INFO    -  DeprecationWarning: warning_filter doesn't do anything since MkDocs 1.2 and will be removed soon. All messages on the `mkdocs` logger get counted automatically.
  File "/usr/local/lib/python3.10/site-packages/mkdocs_bibtex/utils.py", line 10, in <module>
    from mkdocs.utils import warning_filter
  File "/usr/local/lib/python3.10/site-packages/mkdocs/utils/__init__.py", line 453, in __getattr__
    warnings.warn(
WARNING -  Config value 'plugins': Plugin 'bibtex' option 'cite_style': Unrecognised configuration name: cite_style
DEBUG   -  Config value 'config_file_path' = 'mkdocs-ci.yml'
DEBUG   -  Config value 'site_name' = "Les cahiers d'IDEES"
DEBUG   -  Config value 'nav' = None
DEBUG   -  Config value 'pages' = None
DEBUG   -  Config value 'exclude_docs' = None
DEBUG   -  Config value 'not_in_nav' = None
DEBUG   -  Config value 'site_url' = None
DEBUG   -  Config value 'site_description' = None
DEBUG   -  Config value 'site_author' = None
DEBUG   -  Config value 'theme' = Theme(name='material', dirs=['/usr/local/lib/python3.10/site-packages/material/templates', '/usr/local/lib/python3.10/site-packages/mkdocs/templates'], static_templates={'sitemap.xml', '404.html'}, name='material', locale=Locale('en'), language='fr', direction=None, features=[], font={'text': 'Roboto', 'code': 'Roboto Mono'}, icon=None, favicon='assets/images/favicon.png', logo='assets/logo.png', palette={'primary': 'white', 'accent': 'blue'})
DEBUG   -  Config value 'docs_dir' = '/builds/umr-idees/formations-cahier-idees/cahier-idees-metasite/docs'
DEBUG   -  Config value 'site_dir' = '/builds/umr-idees/formations-cahier-idees/cahier-idees-metasite/site'
DEBUG   -  Config value 'copyright' = None
DEBUG   -  Config value 'google_analytics' = None
DEBUG   -  Config value 'dev_addr' = _IpAddressValue(host='127.0.0.1', port=8000)
DEBUG   -  Config value 'use_directory_urls' = True
DEBUG   -  Config value 'repo_url' = None
DEBUG   -  Config value 'repo_name' = None
DEBUG   -  Config value 'edit_uri_template' = None
DEBUG   -  Config value 'edit_uri' = None
DEBUG   -  Config value 'extra_css' = []
DEBUG   -  Config value 'extra_javascript' = []
DEBUG   -  Config value 'extra_templates' = []
DEBUG   -  Config value 'markdown_extensions' = ['toc', 'tables', 'fenced_code', 'footnotes']
DEBUG   -  Config value 'mdx_configs' = {}
DEBUG   -  Config value 'strict' = False
DEBUG   -  Config value 'remote_branch' = 'gh-pages'
DEBUG   -  Config value 'remote_name' = 'origin'
DEBUG   -  Config value 'extra' = {}
DEBUG   -  Config value 'plugins' = {'material/blog': <material.plugins.blog.plugin.BlogPlugin object at 0x7f9a3600df30>, 'material/tags': <material.plugins.tags.plugin.TagsPlugin object at 0x7f9a36040370>, 'material/meta': <material.plugins.meta.plugin.MetaPlugin object at 0x7f9a36041ba0>, 'git-revision-date-localized': <mkdocs_git_revision_date_localized_plugin.plugin.GitRevisionDateLocalizedPlugin object at 0x7f9a36042380>, 'material/search': <material.plugins.search.plugin.SearchPlugin object at 0x7f9a35ce4a00>, 'bibtex': <mkdocs_bibtex.plugin.BibTexPlugin object at 0x7f9a35b48580>}
DEBUG   -  Config value 'hooks' = {}
DEBUG   -  Config value 'watch' = []
DEBUG   -  Config value 'validation' = {'nav': {'omitted_files': 20, 'not_found': 30, 'absolute_links': 20}, 'links': {'not_found': 30, 'absolute_links': 20, 'unrecognized_links': 20}}
DEBUG   -  Running 2 `startup` events
DEBUG   -  Running 5 `config` events
WARNING:root:
                [git-revision-date-localized-plugin] Running on a GitLab runner might lead to wrong
                Git revision dates due to a shallow git fetch depth.
                Make sure to set GIT_DEPTH to 0 in your .gitlab-ci.yml file
                (see https://docs.gitlab.com/ee/user/project/pipelines/settings.html#git-shallow-clone).

DEBUG   -  Looking for translations for locale 'en'
DEBUG   -  No translations found here: '/usr/local/lib/python3.10/site-packages/mkdocs/templates/locales'
DEBUG   -  No translations found here: '/usr/local/lib/python3.10/site-packages/material/templates/locales'
DEBUG   -  Looking for translations for locale 'en'
DEBUG   -  No translations found here: '/usr/local/lib/python3.10/site-packages/mkdocs/templates/locales'
DEBUG   -  No translations found here: '/usr/local/lib/python3.10/site-packages/material/templates/locales'
DEBUG   -  Looking for translations for locale 'en'
DEBUG   -  No translations found here: '/usr/local/lib/python3.10/site-packages/mkdocs/templates/locales'
DEBUG   -  No translations found here: '/usr/local/lib/python3.10/site-packages/material/templates/locales'
DEBUG   -  Parsing bibtex file 'docs/bib/notebooks.bib'...
INFO    -  SUCCESS Parsing bibtex file 'docs/bib/notebooks.bib'
INFO    -  Cleaning site directory
INFO    -  Building documentation to directory: /builds/umr-idees/formations-cahier-idees/cahier-idees-metasite/site
DEBUG   -  Looking for translations for locale 'en'
DEBUG   -  No translations found here: '/usr/local/lib/python3.10/site-packages/mkdocs/templates/locales'
DEBUG   -  No translations found here: '/usr/local/lib/python3.10/site-packages/material/templates/locales'
DEBUG   -  Running 2 `files` events
DEBUG   -  Running 2 `nav` events
DEBUG   -  Looking for translations for locale 'en'
DEBUG   -  No translations found here: '/usr/local/lib/python3.10/site-packages/mkdocs/templates/locales'
DEBUG   -  No translations found here: '/usr/local/lib/python3.10/site-packages/material/templates/locales'
DEBUG   -  Looking for translations for locale 'en'
DEBUG   -  No translations found here: '/usr/local/lib/python3.10/site-packages/mkdocs/templates/locales'
DEBUG   -  No translations found here: '/usr/local/lib/python3.10/site-packages/material/templates/locales'
DEBUG   -  Reading markdown pages.
DEBUG   -  Reading: index.md
DEBUG   -  Running 5 `page_markdown` events
DEBUG   -  Formatting all bib entries...
INFO    -  SUCCESS Formatting all bib entries
DEBUG   -  Replacing citation keys with the generated ones...
DEBUG   -  SUCCESS Replacing citation keys with the generated ones
DEBUG   -  Running 1 `page_content` events
DEBUG   -  Reading: blog/index.md
DEBUG   -  Running 5 `page_markdown` events
WARNING:root:
                [git-revision-date-localized-plugin] Running on a GitLab runner might lead to wrong
                Git revision dates due to a shallow git fetch depth.
                Make sure to set GIT_DEPTH to 0 in your .gitlab-ci.yml file
                (see https://docs.gitlab.com/ee/user/project/pipelines/settings.html#git-shallow-clone).

DEBUG   -  Formatting all bib entries...
INFO    -  SUCCESS Formatting all bib entries
DEBUG   -  Replacing citation keys with the generated ones...
DEBUG   -  SUCCESS Replacing citation keys with the generated ones
DEBUG   -  Running 1 `page_content` events
DEBUG   -  Reading: formations/enquete-et-entretiens.md
DEBUG   -  Running 5 `page_markdown` events
WARNING:root:
                [git-revision-date-localized-plugin] Running on a GitLab runner might lead to wrong
                Git revision dates due to a shallow git fetch depth.
                Make sure to set GIT_DEPTH to 0 in your .gitlab-ci.yml file
                (see https://docs.gitlab.com/ee/user/project/pipelines/settings.html#git-shallow-clone).

DEBUG   -  Formatting all bib entries...
INFO    -  SUCCESS Formatting all bib entries
DEBUG   -  Replacing citation keys with the generated ones...
DEBUG   -  SUCCESS Replacing citation keys with the generated ones
DEBUG   -  Running 1 `page_content` events
DEBUG   -  Reading: formations/r-initiations.md
DEBUG   -  Running 5 `page_markdown` events
DEBUG   -  Formatting all bib entries...
INFO    -  SUCCESS Formatting all bib entries
DEBUG   -  Replacing citation keys with the generated ones...
DEBUG   -  SUCCESS Replacing citation keys with the generated ones
DEBUG   -  Running 1 `page_content` events
DEBUG   -  Reading: formations/r-textometrie.md
DEBUG   -  Running 5 `page_markdown` events
DEBUG   -  Formatting all bib entries...
DEBUG   -  Converting bibtex entry 'Toureille2024' without pandoc
DEBUG   -  SUCCESS Converting bibtex entry 'Toureille2024' without pandoc
INFO    -  SUCCESS Formatting all bib entries
DEBUG   -  Replacing citation keys with the generated ones...
DEBUG   -  SUCCESS Replacing citation keys with the generated ones
ERROR   -  Error reading page 'formations/r-textometrie.md': bad escape \u at position 156
DEBUG   -  Running 1 `shutdown` events
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/sre_parse.py", line 1051, in parse_template
    this = chr(ESCAPES[this][1])
KeyError: '\\u'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/bin/mkdocs", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/mkdocs/__main__.py", line 286, in build_command
    build.build(cfg, dirty=not clean)
  File "/usr/local/lib/python3.10/site-packages/mkdocs/commands/build.py", line 322, in build
    _populate_page(file.page, config, files, dirty)
  File "/usr/local/lib/python3.10/site-packages/mkdocs/commands/build.py", line 171, in _populate_page
    page.markdown = config.plugins.on_page_markdown(
  File "/usr/local/lib/python3.10/site-packages/mkdocs/plugins.py", line 575, in on_page_markdown
    return self.run_event('page_markdown', markdown, page=page, config=config, files=files)
  File "/usr/local/lib/python3.10/site-packages/mkdocs/plugins.py", line 507, in run_event
    result = method(item, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/mkdocs_bibtex/plugin.py", line 145, in on_page_markdown
    markdown = re.sub(
  File "/usr/local/lib/python3.10/re.py", line 209, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "/usr/local/lib/python3.10/re.py", line 326, in _subx
    template = _compile_repl(template, pattern)
  File "/usr/local/lib/python3.10/re.py", line 317, in _compile_repl
    return sre_parse.parse_template(repl, pattern)
  File "/usr/local/lib/python3.10/sre_parse.py", line 1054, in parse_template
    raise s.error('bad escape %s' % this, len(this))
re.error: bad escape \u at position 156

The file formations/r-textometrie.md:

---
title: La Statistique Textuelle avec R
icon:
---

# La « Statistique Textuelle » avec R 

Ce jour et demi de formation aura pour objectif de reprendre les fondamentaux de de la statistique textuelle (Lebart et Salem, 1994) à travers leur implémentation dans R via le package quanteda (Watanabe et Müller, 2023).

Nous avons fait le choix d'un format mixte :

- 1 journée de Formation et 
- 1/2 journée d'Atelier afin que les participants puissent venir tester les méthodes avec leur propre corpus.

Cette formation s’adresse à toute personne confrontée par la manipulation de données textuelles (données d’enquête, archives, chaînes de caractères scrapées, etc.) intéressée par une analyse quantitative exploratoire du contenu. L’avantage d’une implémentation dans R, réside dans la possibilité de reproduire des analyses disponibles dans des logiciels historiques de la statistique textuelle souvent payants (Alceste, module d’analyse lexical de SPAD) ou dans les possibilités de répétabilité, d’automatisation d’opérations existantes dans d’autres logiciels d’analyse textuelle (cas de TXM).

Pour une application en sciences sociales le contenu des textes sera analysé relativement à leur contexte de production (différences lexicales entre sources, auteurs, périodes, par exemple).

Des rappels théoriques et des applications sur des corpus d’exemple seront proposées sur les méthodes suivantes :

- La transformation de textes (chaînes de caractères) en tableau (tableau lexical entier : TLE),
- Donc la modélisation du tableau de données (définition des unités de contexte et des unités lexicales, sélection des termes et autres formes de « nettoyage » des données textuelles),
- L’analyse du lexique et du concordancier,
- L’analyse du vocabulaire spécifique entre des sous-corpus (construction de tableaux lexicaux agrégés – TLA et test du chi2),
- L’analyse des correspondances (AFC) appliquée à un TLA,
- L’analyse des cooccurrences et des segments répétés,
- Selon le temps disponible : méthodes de classification de texte (CAH et/ou CDH de Reinert – méthode historiquement utilisée dans Alceste et Iramutec – via le package rainette – Barnier, 2023).

# Historique des formations

- 5 et 6 février 2024 au Havre (initiation), [Etienne Toureille](https://umr-idees.fr/laboratoire/annuaire/etienne-toureille), créateur de la formation initiale

# Citation

[@Toureille2024]

With :

reyman commented 6 months ago

Some more insight: I found the problem. It is caused by the url field in the bib file.

This entry generates an error:

@article{Toureille2024,
  title={Introduction à la Statistique Textuelle avec R},
  author={Toureille Etienne},
  journal={Cahier Idées},
  year={2024},
  publisher={UMR IDEES},
  url = {\url{https://doi.org/10.21577/0103-5053.20190253}}
}

Without the url field, no error is generated:

@article{Toureille2024,
  title={Introduction à la Statistique Textuelle avec R},
  author={Toureille Etienne},
  journal={Cahier Idées},
  year={2024},
  publisher={UMR IDEES},
}

shyamd commented 6 months ago

There are two things going on here:

  1. This is definitely a bug. mkdocs-bibtex should be properly dealing with the escapes and it's not.
  2. Even if it did work, you wouldn't get what you wanted. The returned bibliography from the internal styling will properly use URLs from the bibtex, but with your bib entry it will show up as:
    [^1]: Toureille Etienne. Introduction à la statistique textuelle avec r. *Cahier Idées*, 2024. URL: [\\url\{https://doi.org/10.21577/0103\-5053.20190253\}](\url{https://doi.org/10.21577/0103-5053.20190253}).

    This won't get you the clickable URL you want.
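
To illustrate the first point, here is a minimal sketch of the failure. It assumes the formatted reference text is passed straight to `re.sub` as the replacement argument; the traceback points at a `re.sub` call in `on_page_markdown`, but the exact call is not shown in this thread. The `markdown` and `reference` strings are illustrative stand-ins, and the callable replacement is one common way to avoid template parsing, not necessarily the fix adopted in mkdocs-bibtex.

```python
import re

# Illustrative stand-ins (assumptions): a page containing a citation marker
# and a formatted reference that still carries the LaTeX \url{...} macro.
markdown = "See [@Toureille2024]."
reference = r"URL: \url{https://doi.org/10.21577/0103-5053.20190253}"
pattern = r"\[@Toureille2024\]"

# A string replacement is parsed by `re` as a template, so the `\u` in
# `\url` is rejected as an unknown escape sequence.
try:
    re.sub(pattern, reference, markdown)
    failed = False
    message = ""
except re.error as exc:
    failed = True
    message = str(exc)

# A callable replacement is inserted as-is, with no template parsing.
safe = re.sub(pattern, lambda match: reference, markdown)

print(failed, message)
print(safe)
```

The same effect can be had by doubling the backslashes in the replacement string before calling `re.sub`; the callable form just avoids having to remember that step.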

The workaround is to use something like this in your bib file:

url = {https://doi.org/10.21577/0103-5053.20190253}

This should yield properly rendered URLs in your markdown.
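
If a bib file has many entries written this way, the wrapper can be stripped in one pass. The following is a rough sketch, not part of mkdocs-bibtex: the helper name `unwrap_url_fields` and the regular expression are assumptions made for illustration, and the pattern only handles the simple `url = {\url{...}}` form shown above.

```python
import re

# Assumed pattern: a `url` field whose whole value is wrapped in \url{...}.
URL_FIELD = re.compile(r"(url\s*=\s*\{)\\url\{([^{}]*)\}(\})")


def unwrap_url_fields(bib_text: str) -> str:
    """Drop the LaTeX url-macro wrapper from url fields; leave the rest untouched."""
    # A callable replacement keeps any backslashes in the text literal.
    return URL_FIELD.sub(lambda m: m.group(1) + m.group(2) + m.group(3), bib_text)


wrapped = r"url = {\url{https://doi.org/10.21577/0103-5053.20190253}}"
plain = "url = {https://doi.org/10.21577/0103-5053.20190253}"

print(unwrap_url_fields(wrapped))
print(unwrap_url_fields(plain))
```

Fields that are already plain pass through unchanged, so running the helper twice on the same file is harmless.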

On a related note, cite_style isn't an option, which is why your log shows the "Unrecognised configuration name: cite_style" warning. If you provide a CSL style file to csl_file, mkdocs-bibtex will then use Pandoc. Otherwise it uses a very simple built-in style format.