librariesio / bibliothecary

Libraries.io Package Manager Manifest Parsers
https://libraries.io/rubygems/bibliothecary
GNU Affero General Public License v3.0

Bibliothecary.analyze(): skip analysis of binary files so we don't try to read invalid utf8 #591

Closed tiegz closed 6 months ago

tiegz commented 6 months ago

After we started forcing UTF-8 on all files before reading them (https://github.com/librariesio/bibliothecary/pull/565), it became clear that one of our pip-compile regexps is very generic: it matches any file with "require" in the filename, which causes Bibliothecary.analyze() to blow up on binary files.
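To illustrate the kind of over-matching described here, a minimal sketch follows. Both patterns are hypothetical and chosen only for illustration; neither is the actual regexp in Bibliothecary, and the example filenames are made up.

```ruby
# Illustrative patterns only; NOT the regexps Bibliothecary actually uses.

# Matches any path that merely contains "require" somewhere.
GENERIC_PATTERN = /require/

# Matches only files whose basename starts with "requirements" and ends in ".txt".
STRICTER_PATTERN = %r{(?:\A|/)requirements[^/]*\.txt\z}

# Hypothetical helper: does the given pattern claim this filename?
def claims_file?(pattern, filename)
  pattern.match?(filename)
end

# Made-up filenames: one binary asset, two plausible manifests.
candidates = ["assets/require.min.png", "requirements.txt", "deps/requirements-dev.txt"]

candidates.each do |name|
  generic = claims_file?(GENERIC_PATTERN, name)
  strict = claims_file?(STRICTER_PATTERN, name)
  puts "#{name}: generic=#{generic} stricter=#{strict}"
end
```

With the generic pattern, the PNG is treated as a candidate manifest and gets read; with the anchored one it is not.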

This rescues the encoding error and skips analysis, although a better fix might be to detect binary files and skip reading them altogether (and to just fix that regexp so it's less generic).
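As a rough sketch of the two approaches mentioned above (skipping on invalid UTF-8 versus detecting binary files up front), something like the following could work. The helper names `probably_binary?` and `read_utf8_or_nil` are hypothetical and not part of Bibliothecary's API, and the NUL-byte check is a common heuristic rather than what this PR implements. The sketch also checks `valid_encoding?` instead of rescuing an exception.

```ruby
# Hypothetical helpers for illustration; not part of Bibliothecary.

# Heuristic: treat a file as binary if its first bytes contain a NUL byte.
# sample_size is an arbitrary assumption for this sketch.
def probably_binary?(path, sample_size: 8000)
  # File.binread can return nil for an empty file when a length is given.
  sample = File.binread(path, sample_size) || "".b
  sample.include?("\0".b)
end

# Read a file and tag it as UTF-8; return nil if the bytes are not valid UTF-8,
# so the caller can skip analysis instead of raising later.
def read_utf8_or_nil(path)
  contents = File.binread(path).force_encoding(Encoding::UTF_8)
  contents.valid_encoding? ? contents : nil
end

# Combine both checks: nil means "skip this file".
def analyzable_contents(path)
  return nil if probably_binary?(path)

  read_utf8_or_nil(path)
end
```

A caller would skip any file for which `analyzable_contents` returns `nil`. Checking for binary content first avoids reading large binaries in full, while the `valid_encoding?` check still covers files that have no NUL bytes but are not valid UTF-8.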

closes https://github.com/librariesio/bibliothecary/issues/590