Open rmcolq opened 8 years ago
Good spot on possible reason
The whole thing is horribly non-robust. The right thing to do is for run_indep.. to check the results of the previous step before carrying on. And yes, it should stop if that list is empty. Usually that might be because of vcftools not being in the path or some nonsense. But you should be able to look into the per-sample VCFs and see if there are any variants in them here. Maybe we need a tracking/ directory, where each parallel process writes a file on completion saying whether, in its opinion, it completed correctly. Then the main script can look in that directory and check: do all the completion files exist, and do they all say OK?
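A rough sketch of the tracking/ directory idea, in Python for illustration only (it is not taken from the pipeline's own scripts). The `.done` suffix, the `OK` marker string, and both function names are hypothetical choices made for this sketch.

```python
from pathlib import Path

OK_MARKER = "OK"  # hypothetical status string, not something the pipeline defines


def write_completion(tracking_dir, sample, ok, detail=""):
    """Called by each parallel per-sample process when it finishes."""
    tracking_dir = Path(tracking_dir)
    tracking_dir.mkdir(parents=True, exist_ok=True)
    status = OK_MARKER if ok else "FAILED " + detail
    # One small file per sample, e.g. tracking/sample1.done
    (tracking_dir / (sample + ".done")).write_text(status.strip() + "\n")


def find_incomplete(tracking_dir, samples):
    """Return the samples whose completion file is missing or does not say OK."""
    tracking_dir = Path(tracking_dir)
    problems = []
    for sample in samples:
        marker = tracking_dir / (sample + ".done")
        if not marker.is_file() or marker.read_text().strip() != OK_MARKER:
            problems.append(sample)
    return problems
```

The main script would call `find_incomplete` after the parallel step and stop if the returned list is not empty, rather than carrying on to combine the VCFs.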
If you are concerned about being able to do this, transfer ownership of this issue to me, and I can do it tomorrow.
Yes please
Hi guys, remember that I was able to solve this by altering the code around lines 75/76 (where parallel is implemented, I believe); have a look at my fork. It works like a charm on Ubuntu Linux.
-Henk
THANKS and sorry Henk. Will look.
You're welcome! Happy to contribute. Have been making the change in the script every time a new version came out. Tell me if you need more info.
So I think the problem is that the list_all_raw_vcfs file ends up empty. Because the file exists, combine_vcfs.pl throws no error and hangs instead. I believe list_all_raw_vcfs is created in combine_vcfs.pl as part of a subroutine 'checklist'. So do we need to add something to make it work (why was it empty when there should have been /vcfs/_wkraw created), or make it break properly if the file created is empty?
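For the "break properly" option, a minimal sketch of the kind of check meant here, again in Python for illustration only and not the actual combine_vcfs.pl code. The function name `read_vcf_list` is hypothetical, and the sketch assumes the list file holds one VCF path per line.

```python
import os


def read_vcf_list(list_path):
    """Return the VCF paths listed in list_path, raising if there are none."""
    if not os.path.isfile(list_path):
        raise FileNotFoundError(f"VCF list not found: {list_path}")
    with open(list_path) as handle:
        # Skip blank lines so a file containing only whitespace counts as empty.
        paths = [line.strip() for line in handle if line.strip()]
    if not paths:
        # The file exists but lists nothing: fail loudly instead of hanging.
        raise ValueError(f"VCF list is empty: {list_path}")
    return paths
```

A caller would catch the exception, print the message and exit with a non-zero status, so an empty list stops the run with a clear error.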