rstudio / r-system-requirements

System requirements for R packages
MIT License
125 stars 23 forks source link

Fix Poppler C++ regex for pdftools #189

Closed glin closed 1 month ago

glin commented 1 month ago

\b word boundaries don't apply if the end of the word is a character like +. \B matches non-word boundaries, but we don't really need any boundary here.

> 'Poppler C++ API'.match(/\bPoppler C\+\+\b/i)
null

> 'Poppler C++ API'.match(/\bPoppler C\+\+\B/i)
[
  'Poppler C++',
  index: 0,
  input: 'Poppler C++ API',
  groups: undefined
]

> 'Poppler C++ API'.match(/\bPoppler C\+\+/i)
[
  'Poppler C++',
  index: 0,
  input: 'Poppler C++ API',
  groups: undefined
]
glin commented 1 month ago

cc @jeroen for review

jeroen commented 1 month ago

Thanks for debugging this, LGTM!

I personally think this regex overcomplicates it really, why not just use /poppler/i. The only edge cases I can think of is that this may install this c++ library also for R packages that uses other poppler tools, which is harmless.

glin commented 1 month ago

Yeah, that was exactly why since we recently added a rule for poppler-glib that only applied to Rpoppler. While poppler-cpp only applied to pdftools. There was no other package that only specified "poppler". https://github.com/rstudio/r-system-requirements/pull/179

jeroen commented 1 month ago

OK great, I confirm this fixed the CI of pdftools!