Closed molikd closed 2 years ago
This should be in working order now.
Sorry I couldn't merge this earlier. Using xargs was a good suggestion; the only issue is that the vcf headers must maintain the same order or later steps will break.
I'll close this since there's been some major refactoring. I'll be more responsive to pull requests in the future. Happy to discuss.
This pull request, when tested and ready, will use GNU parallel (already in the conda environment) to multithread the greps for the vcf files. It multithreads at the pipe level instead of at the file level: it splits each vcf file into chunks of lines and greps those chunks simultaneously. GNU Parallel also has the nice feature of not interrupting line writes, so output lines from different jobs are never mixed together mid-line.
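
To make the chunk-level idea concrete, here is a minimal Python sketch of the same pattern. It is not the code in this pull request and it does not call GNU Parallel; `chunked`, `grep_chunk`, and `parallel_grep` are hypothetical names used only for illustration. It splits the input into chunks of lines, greps the chunks concurrently, and returns the matches in input order, which also covers the concern about header order raised above.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunked(lines, size):
    """Yield successive lists of at most `size` lines."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def grep_chunk(pattern, chunk):
    """Return the lines of one chunk that match the compiled pattern."""
    return [line for line in chunk if pattern.search(line)]


def parallel_grep(lines, regex, chunk_size=1000, workers=4):
    """Grep chunks of lines concurrently and keep matches in input order."""
    pattern = re.compile(regex)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in submission order, so matches from
        # different chunks are never interleaved and header order is kept.
        results = pool.map(lambda c: grep_chunk(pattern, c),
                           chunked(lines, chunk_size))
        return [line for chunk in results for line in chunk]
```

For example, `parallel_grep(open("sample.vcf"), r"^##")` would collect the meta-information header lines in their original order (`sample.vcf` is a placeholder path). Python threads share one interpreter lock, so this shows the structure of the approach and not its speedup; the actual gain in the pipeline would come from GNU Parallel running separate grep processes.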