precimed / mixer

Causal Mixture Model for GWAS summary statistics

GNU General Public License v3.0


some problems for generating 1000G.EUR.QC.prune_maf0p05_rand2M_r2p8.repNN.snps in DATA PREPARATION #50

Open showers14 opened 3 years ago

showers14 commented 3 years ago

Hi,

I've tried to use MiXeR on Linux (CentOS 7.9).

Q1: In the code provided by the tutorial,

--bim-file LDSR/1000G_EUR_Phase3_plink/1000G.EUR.QC.@.bim \
--ld-file LDSR/1000G_EUR_Phase3_plink/1000G.EUR.QC.@.run4.ld \

I don't know what the "@" means.

Q2: The tutorial uses ${SLURM_ARRAY_TASK_ID}, but our Linux system doesn't have SLURM.

Q3: Why does the seed need to be set here with "--seed ${SLURM_ARRAY_TASK_ID}"?

So I changed the tutorial code, and I hope you could help me check the following:

#!/bin/sh

for i in {1..20}
do
  python3 /home/ychen/Downloads/mixer/precimed/mixer.py snps \
    --lib /home/ychen/Downloads/mixer/src/build/lib/libbgmg.so \
    --bim-file /home/ychen/Downloads/1000G_EUR_Phase3_plink/1000G.EUR.QC.$i \
    --ld-file /home/ychen/Downloads/1000G_EUR_Phase3_plink/1000G.EUR.QC.$i.run4.ld \
    --out LDSR/1000G_EUR_Phase3_plink/1000G.EUR.QC.prune_maf0p05_rand2M_r2p8.rep$i.snps \
    --maf 0.05 --subset 2000000 --r2 0.8 --seed $i
done

Thank you in advance for your help!

ofrei commented 3 years ago

Q1: as explained in the README file:

Note that in the code above the @ symbol does NOT need to be replaced with an actual chromosome. It should stay as @ in your command.

Your command needs to be corrected: in --bim-file and --ld-file, use the @ symbol, not $i.
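To make the meaning of the placeholder concrete, here is a minimal Python sketch of what a per-chromosome template stands for. It is only an illustration: the helper name `expand_chr_template` is hypothetical, the choice of chromosomes 1 to 22 is an assumption, and this is not MiXeR's own code. The tool does this substitution internally, which is why the `@` stays literal on the command line.

```python
def expand_chr_template(template, chromosomes=range(1, 23)):
    """Return one path per chromosome by substituting '@' with its label.

    Hypothetical helper for illustration only; chromosomes 1..22 is an
    assumption, not something stated in the thread.
    """
    if "@" not in template:
        raise ValueError("template must contain the '@' placeholder")
    return [template.replace("@", str(chrom)) for chrom in chromosomes]


bim_files = expand_chr_template("LDSR/1000G_EUR_Phase3_plink/1000G.EUR.QC.@.bim")
print(bim_files[0])    # first per-chromosome file the template refers to
print(len(bim_files))  # number of files covered by a single template
```

So a single `--bim-file ...QC.@.bim` argument refers to the whole set of per-chromosome files, whereas `...QC.$i` in a loop over repetitions points at one file per repetition, which is not what is intended.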

Q2: Your change is fine, but the for loop will run for a very long time; each of the 20 iterations might take a day or so to finish. If you have another cluster scheduler (not SLURM, but e.g. SGE), it should be possible to adapt the code (e.g. submit 20 scripts, each with its own $i).
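If no scheduler is available at all, one option is to build the 20 commands in Python and run a few of them side by side on one machine. The sketch below reuses the paths and flags from the post above, keeps `@` literal, and uses the repetition number only for `--out` and `--seed`. The helper names `build_snps_command` and `run_all`, and the default of 4 parallel workers, are assumptions for illustration; how many repetitions fit in memory at once depends on the machine.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Paths copied from the post above; adjust to your own installation.
MIXER_DIR = "/home/ychen/Downloads/mixer"
REF_PREFIX = "/home/ychen/Downloads/1000G_EUR_Phase3_plink/1000G.EUR.QC"
OUT_PREFIX = "LDSR/1000G_EUR_Phase3_plink/1000G.EUR.QC.prune_maf0p05_rand2M_r2p8"


def build_snps_command(rep):
    """Build the argument list for one repetition (hypothetical helper)."""
    return [
        "python3", f"{MIXER_DIR}/precimed/mixer.py", "snps",
        "--lib", f"{MIXER_DIR}/src/build/lib/libbgmg.so",
        "--bim-file", f"{REF_PREFIX}.@.bim",      # '@' stays literal
        "--ld-file", f"{REF_PREFIX}.@.run4.ld",   # '@' stays literal
        "--out", f"{OUT_PREFIX}.rep{rep}.snps",
        "--maf", "0.05", "--subset", "2000000", "--r2", "0.8",
        "--seed", str(rep),                       # one seed per repetition
    ]


def run_all(commands, max_workers=4):
    """Run the commands, at most max_workers at a time (assumed value)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # check=True raises if any repetition exits with an error.
        list(pool.map(lambda cmd: subprocess.run(cmd, check=True), commands))


commands = [build_snps_command(rep) for rep in range(1, 21)]
print(" ".join(commands[0]))  # inspect the first command before running
```

Calling `run_all(commands)` would then launch the repetitions; printing the commands first is a cheap way to check the paths before committing to a run that may take days.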

Q3: The seed is set so that each iteration follows its own sequence of random numbers. There is a stochastic element in the MiXeR optimization procedure, and it's good to capture the (small) variation in parameter estimates that arises due to this stochasticity.
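The effect of a per-repetition seed can be illustrated with Python's standard `random` module. This is a toy sketch, not MiXeR's random number generator: the helper `random_subset` is hypothetical, and it assumes the seed drives a random choice of SNPs in the way the `--subset` option suggests.

```python
import random


def random_subset(n_snps, subset_size, seed):
    """Pick subset_size SNP indices out of n_snps using a seeded generator.

    Toy illustration only; MiXeR's own sampling is not shown here.
    """
    rng = random.Random(seed)  # independent generator, fully determined by seed
    return sorted(rng.sample(range(n_snps), subset_size))


rep1 = random_subset(1000, 10, seed=1)
rep1_again = random_subset(1000, 10, seed=1)
rep2 = random_subset(1000, 10, seed=2)

print(rep1 == rep1_again)  # same seed: the subset is reproducible
print(rep1 == rep2)        # different seed: a different subset is drawn
```

Using the repetition number as the seed therefore gives 20 runs that differ from one another but can each be reproduced exactly.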

showers14 commented 3 years ago

Thanks a lot. I have another problem, in the "Build from source - Linux" part:

If you work in HPC environment with modules system, you can load some existing combination of modules that include Boost libraries:

module load CMake/3.15.3-GCCcore-8.3.0 Boost/1.73.0-GCCcore-8.3.0 Python/3.7.4-GCCcore-8.3.0 # TSD (gcc)
module load Boost/1.71.0-GCC-8.3.0 Python/3.7.4-GCCcore-8.3.0 CMake/3.12.1 # SAGA (gcc)
module load Boost/1.68.0-intel-2018b-Python-3.6.6 Python/3.6.6-intel-2018b CMake/3.12.1 # SAGA (intel)

I don't know how to choose between TSD (gcc), SAGA (gcc) and SAGA (intel), or how to use the Linux terminal to pick the module I need. What do TSD and SAGA mean? Are they processors? And are gcc and intel compilers?

Thank you in advance for your help!

ofrei commented 3 years ago

@showers14 take a look at the examples here: https://github.com/comorment/containers/tree/main/usecases
There you don't need to build MiXeR from source, as it's packaged into Singularity containers.