brentp / smoove

structural variant calling and genotyping with existing tools, but, smoothly.
Apache License 2.0

Splitting the samples across nodes for parallel runs? #159

Open Navin-techi opened 2 years ago

Navin-techi commented 2 years ago

@brentp I tried splitting the samples across nodes using Python's mpi4py, but whether I run on a single node or across nodes, only one sample runs. Any help sorting this out would be appreciated.

Command used for the population:

    import os
    from mpi4py import MPI

    x = ['s1', 's2', 's3']

    def wper(i):
        # Run smoove on the sample assigned to this rank.
        os.system("time smoove call --outdir /gpfs/data/user/n_smoove/output --exclude /gpfs/data/user/n_smoove/exclude_region/exclude.cnvnator_101bp.GRCh38.20170403.bed --name " + x[i] + " --fasta /gpfs/data/user/n_smoove/ref-gen/GRCh38_full_analysis_set_plus_decoy_hla.fa -p 1 --genotype /gpfs/data/user/n_smoove/cram/" + x[i] + ".cram")

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()  # mpi4py spells this Get_rank, not Getrank
    if rank < len(x):  # was "rank < 4", which overruns the 3-item list at rank 3
        wper(rank)

brentp commented 2 years ago

Hi, you can certainly parallelize by sample, that's how smoove is designed. I don't understand enough about MPI to help you do that. You can also use smoove-nf to handle this parallelization for you.

Navin-techi commented 2 years ago

Is there a way to use GNU parallel for this?

brentp commented 2 years ago

Absolutely. You can use any tool you like to do the parallelization.
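
For anyone landing here later, one way to run one `smoove call` per sample in parallel on a single node is with the Python standard library alone. This is only a rough sketch, not something from the smoove docs: the flags and paths are copied from the command posted above, while the helper names `build_command` and `run_all` and the `max_workers` value are made up for illustration. It does not spread work across nodes; for that you would still need a scheduler, MPI, or smoove-nf.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Paths copied from the command earlier in this thread; adjust for your system.
BASE = "/gpfs/data/user/n_smoove"
SAMPLES = ["s1", "s2", "s3"]


def build_command(sample):
    """Return the argv list for one single-sample `smoove call` run."""
    return [
        "smoove", "call",
        "--outdir", f"{BASE}/output",
        "--exclude", f"{BASE}/exclude_region/exclude.cnvnator_101bp.GRCh38.20170403.bed",
        "--name", sample,
        "--fasta", f"{BASE}/ref-gen/GRCh38_full_analysis_set_plus_decoy_hla.fa",
        "-p", "1",
        "--genotype", f"{BASE}/cram/{sample}.cram",
    ]


def run_all(samples, max_workers=3, runner=subprocess.run):
    """Run one command per sample, at most `max_workers` at a time.

    Threads are enough here because each worker only waits on an
    external process. Returns a dict of sample name -> return code.
    """
    def run_one(sample):
        result = runner(build_command(sample))
        return sample, result.returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_one, samples))


if __name__ == "__main__":
    for sample, code in run_all(SAMPLES).items():
        print(sample, "ok" if code == 0 else f"failed ({code})")
```

Passing the arguments as a list avoids shell quoting problems with sample names, and the `runner` parameter exists only so the scheduling logic can be tried out without smoove installed.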