Allows partitioning work at the file level by checking the hash of the de-globbed file names instead of spans. Also adds an optional 'document dir' parameter for source data whose path has no 'documents' directory to replace with 'attributes'. (Previously, the attribute path generator would try to replace 'documents', find no match, and use the source directory!)
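For reviewers, here is a rough sketch of the two behaviours described above. It is not the code in this change: the function names (`assign_partition`, `files_for_partition`, `attribute_path`), the `document_dir` spelling, the default value 'documents', and the choice of MD5 are all assumptions made for illustration. The sketch also raises an error when the directory name is missing, which is one possible way to avoid silently reusing the source directory; the actual change may handle that case differently.

```python
import hashlib


def assign_partition(file_name: str, num_partitions: int) -> int:
    """Map a de-globbed file name to a partition index (hypothetical helper).

    A content-stable hash (MD5 here, an assumption) is used instead of
    Python's built-in hash(), which is randomized per process for strings.
    """
    digest = hashlib.md5(file_name.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def files_for_partition(file_names, partition_index: int, num_partitions: int):
    """Return the files owned by one worker, decided per file rather than per span."""
    return [
        name
        for name in sorted(file_names)
        if assign_partition(name, num_partitions) == partition_index
    ]


def attribute_path(source_path: str, document_dir: str = "documents") -> str:
    """Swap the first `document_dir` path component for 'attributes' (hypothetical helper).

    Splitting on "/" keeps URL-style prefixes such as "s3://" intact.
    """
    parts = source_path.split("/")
    if document_dir not in parts:
        # Assumption: fail loudly instead of writing into the source directory.
        raise ValueError(f"no '{document_dir}' component in {source_path!r}")
    index = parts.index(document_dir)
    parts[index] = "attributes"
    return "/".join(parts)
```

With these assumptions, `attribute_path("data/raw/part-0.jsonl", document_dir="raw")` gives `data/attributes/part-0.jsonl`, and every file name lands in exactly one partition regardless of which worker evaluates it.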