When rebuilding the data for passim, it makes sense to filter the data to rebuild by language, as its text reuse detection does not work across languages.
add a CLI parameter for the desired language
filter out (via dask bag) all content items that are not in that language
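The two steps above could be sketched roughly as follows. This is only an illustration under assumptions, not the existing rebuild code: the `--language` flag name, the `lg` field on each content item, and the helper names `build_parser`, `has_language`, `filter_by_language` and `main` are all hypothetical. Only `argparse` and the dask bag calls `from_sequence`, `filter` and `compute` are real APIs.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """CLI with an optional language filter (flag name is an assumption)."""
    parser = argparse.ArgumentParser(description="Rebuild data for passim (sketch)")
    parser.add_argument(
        "--language",
        default=None,
        help="keep only content items in this language, e.g. 'fr'; "
        "if omitted, no filtering is applied",
    )
    return parser


def has_language(item: dict, language: Optional[str]) -> bool:
    """Return True if the item should be kept.

    Assumes each content item is a dict whose language code is stored
    under the key 'lg' (hypothetical field name).
    """
    return language is None or item.get("lg") == language


def filter_by_language(bag, language: Optional[str]):
    """Lazily drop all items of a dask bag that are not in `language`."""
    return bag.filter(lambda item: has_language(item, language))


def main(argv=None):
    # Imported here so the helpers above can be used without dask installed.
    import dask.bag as db

    args = build_parser().parse_args(argv)
    # Toy in-memory data standing in for the real content items.
    items = db.from_sequence(
        [
            {"id": "a", "lg": "fr", "ft": "Bonjour"},
            {"id": "b", "lg": "de", "ft": "Guten Tag"},
        ]
    )
    return filter_by_language(items, args.language).compute()
```

Calling `main(["--language", "fr"])` would keep only the items whose assumed `lg` field equals `fr`. Leaving the flag out keeps every item, so the current behaviour would stay the default.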