zhengxwen / SeqArray

Data management of large-scale whole-genome sequence variant calls (Development version only)
http://www.bioconductor.org/packages/SeqArray
43 stars 12 forks

seqParallel with BiocParallel::BatchtoolsParam object ? #66

Open ldcato opened 3 years ago

ldcato commented 3 years ago

Hi, amazing resource of tools built here!

I have a question regarding the seqParallel function. I can see that the cl argument accepts a BiocParallelParam object. I tried passing it a BatchtoolsParam object from the BiocParallel package. I don't get any error messages, but it doesn't scale: it just runs 1 chunk on one node.

Submitting 1 jobs in 1 chunks using cluster functions 'SGE' ...
Waiting (Q:0 R:1 D:0 E:0 ?:0) [--------------------------------]   0% eta:  ?s

Are there any ambitions to include BatchtoolsParam support? Am I missing something that would allow this to work in the first place?
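For anyone trying to reproduce this, one possible sanity check is to run the same kind of param object through BiocParallel directly, outside SeqArray, and see whether the work is spread across workers there. The sketch below is only an illustration under stated assumptions: the "socket" backend and the worker count of 2 are stand-ins chosen so that it runs on a single machine, not the SGE setup from the report, and the squaring task is a placeholder.

```r
library(BiocParallel)

# Assumption: a local "socket" backend with 2 workers stands in for the
# SGE cluster used in the report; requires the batchtools package.
param <- BatchtoolsParam(workers = 2, cluster = "socket")

# Placeholder workload: four independent tasks dispatched via BiocParallel.
res <- bplapply(1:4, function(i) i^2, BPPARAM = param)

# Number of workers the param object reports, and the collected results.
print(bpnworkers(param))
print(unlist(res))
```

If a check like this distributes the tasks as expected, the single-chunk behaviour may be specific to how seqParallel handles this param class rather than to the cluster configuration, though that is a guess and not something confirmed in this thread.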


This all came about because I wanted to use the seqVCF2GDS() function with large VCF files, with the parallel argument set to my BatchtoolsParam object (named param):

    seqVCF2GDS(vcf.fn='an_examplefile.vcf', out.fn='test.gds', header=NULL,
        storage.option="LZMA_RA", info.import=NULL, fmt.import=NULL,
        genotype.var.name="GT", ignore.chr.prefix="chr", scenario="general",
        reference='GRCh38', start=1L, count=-1L, optimize=TRUE,
        raise.error=TRUE, digest=TRUE, parallel=param, verbose=TRUE)

In case it's important:


Thank you for any help! Liam

davidroberson commented 4 months ago

Nudging this almost four years later... I see it is still open.

zhengxwen commented 4 months ago

Sorry! I forgot about this issue during the pandemic. It is on my radar now.

Xiuwen