valentinedwv closed 6 years ago
David, I need a bit more information about the context of this pull request, like:
Looking forward to your input.
If a harvest has tens to hundreds of thousands of records, storing them all in a single folder is a large performance hit: listing the files becomes slow.
Limiting the number of files per folder to fewer than 10k solves this issue.
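For illustration, here is a minimal sketch of the idea in Python. It is not the actual LargeDataSetDirectoryAssigner code: the function names, the `part_NNNNN` folder naming, and the cap of 5,000 are all assumptions chosen for the example (any cap below 10k would fit the limit described above).

```python
from pathlib import PurePosixPath

# Assumed cap; an arbitrary value below the 10k limit mentioned above.
MAX_FILES_PER_FOLDER = 5_000


def assign_subfolder(record_index: int, max_files: int = MAX_FILES_PER_FOLDER) -> str:
    """Return a (hypothetical) subfolder name for a zero-based record index.

    Records are bucketed by integer division, so each subfolder
    receives at most `max_files` records.
    """
    if record_index < 0:
        raise ValueError("record_index must be non-negative")
    if max_files < 1:
        raise ValueError("max_files must be at least 1")
    return f"part_{record_index // max_files:05d}"


def target_path(root: str, record_index: int, file_name: str,
                max_files: int = MAX_FILES_PER_FOLDER) -> PurePosixPath:
    """Build the full output path: root / subfolder / file name."""
    return PurePosixPath(root) / assign_subfolder(record_index, max_files) / file_name
```

With these assumptions, record 12345 of a harvest written under `harvest` would land in `harvest/part_00002/`, and no folder ever holds more than 5,000 files, so listing any single folder stays fast.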
hold on, not fully working.
Sanitized file names
ready
Looks like the LargeDataSetDirectoryAssigner class is/was part of another project, authored by bozyurt on 12/1/15. Is there a reference to that project? What is the license for that code?
It is part of our processing pipeline: https://github.com/CINERGI/Foundry/blob/master/LICENSE.md
Thanks for the contribution. Would it be possible to provide it under the Apache 2.0, BSD, or MIT license? Because the product will be used by commercial users as well, the following language in the license would be a concern:
Permission to make commercial use of this software may be obtained by contacting: Technology Transfer Office 9500 Gilman Drive, Mail Code 0910 University of California La Jolla, CA 92093-0910 (858) 534-5815 invent@ucsd.edu
Thanks,
Apache 2 is fine
great, thanks!
Split into smaller folders when harvest is large