import_sumstats: Works fine for hundreds of GWAS, then encounters this error and quickly iterates through all remaining GWAS IDs without actually processing them (and, strangely, appends their log files to that of the GWAS that first encountered the error!).
This takes a very long time to reproduce (multiple days of running continuously). And it's not as if the GWAS being analyzed at the time of the error was particularly large ("only" 11M SNPs).
Possible explanations
Multiple users on our private cloud are accidentally trying to use the same threads at the same time, and BiocParallel can't handle this gracefully?
The virtual machine becomes temporarily disconnected from its dedicated resources. Perhaps a question for @eduff
data.table is trying to run in parallel within each loop of read_vcf_parallel (which is also being run in parallel), causing a conflict with the same cores being requested for different tasks at once. Though I don't know why this wouldn't happen far earlier when processing 100s of GWAS.
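One way to test the nested-parallelism explanation would be to pin data.table to a single thread while the outer parallel loop runs. A minimal sketch, assuming data.table is installed; `with_single_dt_thread` is a hypothetical helper and not part of MungeSumstats:

```r
# Hypothetical helper: evaluate `expr` with data.table limited to one thread,
# then restore whatever thread count was configured before.
with_single_dt_thread <- function(expr) {
  old <- data.table::getDTthreads()
  data.table::setDTthreads(1L)
  on.exit(data.table::setDTthreads(old), add = TRUE)
  force(expr)  # `expr` is lazy, so it is evaluated only after the limit is set
}

# Check that the limit is in effect while the expression runs.
threads_inside <- with_single_dt_thread(data.table::getDTthreads())
```

Note that with socket-based workers each worker is a separate R session, so the limit would have to be set inside the function that runs on each worker rather than only in the main session.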
read_vcf_parallel:
The error seems to occur in read_vcf_parallel. This function seems to be rather finicky, as it also doesn't like it when I specify >30 threads, though I suspect that's for a different reason (splitting a VCF across too many threads means that if some genome tiles are empty, the whole loop breaks, perhaps at the final re-merging step).
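If empty genome tiles really are what breaks the re-merging step, filtering them out before binding would be one way to check. This is only a sketch of that idea in base R; the tile data and `merge_tiles` are made up for illustration and do not reflect how read_vcf_parallel is actually implemented:

```r
# Hypothetical per-tile results: some tiles contain no variants
# (either NULL or a zero-row data frame).
tiles <- list(
  data.frame(SNP = c("rs1", "rs2"), P = c(0.01, 0.5)),
  NULL,
  data.frame(SNP = character(0), P = numeric(0)),
  data.frame(SNP = "rs3", P = 0.2)
)

# Keep only tiles that actually contain rows, then bind them together.
merge_tiles <- function(tiles) {
  non_empty <- Filter(function(x) !is.null(x) && nrow(x) > 0L, tiles)
  if (length(non_empty) == 0L) return(NULL)
  do.call(rbind, non_empty)
}

merged <- merge_tiles(tiles)
```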
Related Issues
BiocParallel: Also, not sure if I'm the only one, but BiocParallel can be a bit tricky to use successfully.
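The console output points to ?SnowParam for manually specifying the port. A minimal sketch of what that could look like, assuming BiocParallel is installed; the worker count and port number are arbitrary placeholders, and I don't know whether import_sumstats exposes a way to pass such an object through:

```r
# Assumption: BiocParallel is available. The port below is illustrative only;
# choose one that is known to be free on the machine.
if (requireNamespace("BiocParallel", quietly = TRUE)) {
  param <- BiocParallel::SnowParam(
    workers = 4L,          # placeholder worker count
    manager.port = 12345L  # fixed port instead of a randomly chosen one
  )
}
```

Setting the `R_PARALLEL_PORT` environment variable before the workers are started should be an alternative way to fix the port without touching the calling code.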
Console output
Using local VCF.
File already tabix-indexed.
Finding empty VCF columns based on first 10,000 rows.
Dropping 1 duplicate columns.
1 sample detected: ubm-a-129
Constructing ScanVcfParam object.
VCF contains: 11,734,353 variant(s) x 1 sample(s)
Reading VCF file: multi-threaded (30 threads)
failed to open the port 11221, trying a new port...
failed to open the port 11596, trying a new port...
failed to open the port 11982, trying a new port...
failed to open the port 11329, trying a new port...
failed to open the port 11700, trying a new port...
cannot find an open port. For manually specifying the port, see ?SnowParamUsing previously downloaded VCF.
Formatted summary statistics will be saved to ==> /shared/bms20/projects/MAGMA_Files_Public/data/GWAS_sumstats/ubm-a-81/ubm-a-81.tsv.gz
Log data to be saved to ==> /shared/bms20/projects/MAGMA_Files_Public/data/GWAS_sumstats/ubm-a-81/logs
Saving output messages to:
/shared/bms20/projects/MAGMA_Files_Public/data/GWAS_sumstats/ubm-a-81/logs/MungeSumstats_log_msg.txt
Any runtime errors will be saved to:
/shared/bms20/projects/MAGMA_Files_Public/data/GWAS_sumstats/ubm-a-81/logs/MungeSumstats_log_output.txt
Messages will not be printed to terminal.
all connections are in useUsing previously downloaded VCF.
Formatted summary statistics will be saved to ==> /shared/bms20/projects/MAGMA_Files_Public/data/GWAS_sumstats/ubm-a-93/ubm-a-93.tsv.gz
Log data to be saved to ==> /shared/bms20/projects/MAGMA_Files_Public/data/GWAS_sumstats/ubm-a-93/logs
Saving output messages to:
/shared/bms20/projects/MAGMA_Files_Public/data/GWAS_sumstats/ubm-a-93/logs/MungeSumstats_log_msg.txt
Any runtime errors will be saved to:
/shared/bms20/projects/MAGMA_Files_Public/data/GWAS_sumstats/ubm-a-93/logs/MungeSumstats_log_output.txt
Messages will not be printed to terminal.
...
...
...
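The "all connections are in use" message is base R's error for hitting its limit on simultaneously open connections, and the logs being appended to the wrong file would be consistent with a message sink left open after the first failure. That is a guess, not a confirmed diagnosis. A sketch of a loop that resets sinks after a failure; `reset_sinks` and `process_one` are hypothetical stand-ins, not MungeSumstats functions:

```r
# Hypothetical helper: undo any output/message sinks left open by a failed run.
reset_sinks <- function() {
  # Unwind the stack of output diversions.
  while (sink.number() > 0L) sink()
  # Connection 2 is stderr; any other value means messages are diverted.
  if (sink.number(type = "message") != 2L) sink(type = "message")
  invisible(NULL)
}

# Hypothetical stand-in for the per-GWAS import step; fails for one id.
process_one <- function(id) {
  if (id == "bad-id") stop("simulated failure")
  paste("processed", id)
}

ids <- c("id-1", "bad-id", "id-2")
results <- lapply(ids, function(id) {
  tryCatch(
    process_one(id),
    error = function(e) {
      reset_sinks()  # keep later ids from logging into this id's file
      structure(conditionMessage(e), class = "failed_import")
    }
  )
})

# Number of user connections still open after the loop.
n_open <- nrow(showConnections())
```

Logging `n_open` after each GWAS during a long run would show whether connections accumulate over time.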
Full logs file: ubm-a-129_log_msg.txt
Expected behaviour
Process all sumstats.
2. Reproducible example
Code
3. Session info