tldr-pages / tldr-translation-pairs-gen

Generates a structured dataset in various formats derived from tldr-pages.
https://opus.nlpl.eu/tldr-pages/corpus/version/tldr-pages
MIT License

fix: don't crash on unexpected directory #14

Closed SethFalco closed 1 year ago

SethFalco commented 1 year ago

In the tldr repo, we accidentally merged a PR with the wrong file structure.

This caused tldr-translation-pairs-gen to crash. To prevent this from happening again, I added a verifyIntegrity function where we can perform any checks on a file and make sure we're happy to process it. Currently, it only checks whether the path is a directory; if so, it logs a warning and skips it.
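
For illustration, a check like this could look roughly as follows. This is a minimal sketch and not the code from this PR: only the name verifyIntegrity comes from the description above, while its signature, the listProcessablePages helper, the warning text, and the temporary directory layout are all assumptions.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed signature: returns true when the entry is safe to process.
// Only the name verifyIntegrity comes from the PR description.
function verifyIntegrity(entryPath: string): boolean {
  const stats = fs.statSync(entryPath);
  if (stats.isDirectory()) {
    // Warn and skip instead of crashing on a misplaced directory.
    console.warn(`Skipping unexpected directory: ${entryPath}`);
    return false;
  }
  return true;
}

// Hypothetical caller: lists the entries of one directory and keeps
// only those that pass the integrity check.
function listProcessablePages(dir: string): string[] {
  return fs
    .readdirSync(dir)
    .map((name) => path.join(dir, name))
    .filter((entryPath) => verifyIntegrity(entryPath));
}

// Demo on a throwaway directory holding one page and one stray folder.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "tldr-pages-"));
fs.writeFileSync(path.join(root, "tar.md"), "# tar\n");
fs.mkdirSync(path.join(root, "misplaced"));

const pages = listProcessablePages(root);
// The stray directory is skipped with a warning; only tar.md remains.
console.log(pages.map((p) => path.basename(p)));
```

Returning a boolean keeps the check in one place, so further validations can be added later without changing the callers.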
