A beginner's question regarding the tutorial code

RuochengDong commented 3 years ago

Hi,

I am following your tutorial code to use the DBSLMM.

dbslmm=/your/path/DBSLMM/software/dbslmm

I had some trouble using it. When I use this path, I get an error: "/your/path/DBSLMM/software/dbslmm does not exist! Please check!"

I also tried:

dbslmm=/your/path/DBSLMM/software/DBSLMM

but I still got the same error.

Then I tried this one:

dbslmm=/your/path/DBSLMM/software/DBSLMM.R

then the error became:

: No such file or directory
Warning message:
In system(paste0(opt$dbslmm, " -s ", opt$summ, " -r ", opt$ref,  :
  error in running command

I am actually confused about what I should pass to the --dbslmm argument. According to the help, it should be the prefix of the software, but even when I passed the path directly, as in --dbslmm /your/path/DBSLMM/software/DBSLMM, it did not work.

biostat0903 commented 3 years ago

Hi Ruocheng,

Thanks for your interest in DBSLMM. I am sorry that the manual is confusing. The --dbslmm argument is the path to the executable file. You can obtain it in two ways. First, you can compile it on your server using scr/Makefile. Second, you can download the static version from Google Drive (the download link is in the README.md file). When you click the link, I will receive an e-mail. After that, you can download the software. If you have any problems, please feel free to ask me.

Best, Sheng
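Since the errors above come from passing a path that is missing or is not a runnable binary, a quick pre-flight check can narrow things down before calling DBSLMM.R. The following is only a minimal sketch: the function name check_dbslmm is hypothetical and not part of DBSLMM, and the path is the placeholder used in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of DBSLMM): classify a candidate --dbslmm path.
check_dbslmm() {
    path="$1"
    if [ ! -e "$path" ]; then
        echo "missing"            # nothing exists at this path
    elif [ -d "$path" ]; then
        echo "directory"          # a folder, not the binary itself
    elif [ ! -x "$path" ]; then
        echo "not executable"     # file exists but lacks execute permission
    else
        echo "ok"
    fi
}

# Placeholder path from the thread; replace it with the real location.
dbslmm=/your/path/DBSLMM/software/dbslmm
echo "${dbslmm}: $(check_dbslmm "${dbslmm}")"
```

If the result is "not executable", running chmod +x on the file is usually enough; if it is "missing", the binary still has to be compiled or downloaded as described above.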

RuochengDong commented 3 years ago

Hi Sheng,

Thanks for your timely response. I downloaded the executable file, and the code now runs without errors.

But unfortunately, I did not get any output after I ran the code.

Warning: Ignoring phenotypes of missing-sex samples.  If you don't want those
phenotypes to be ignored, use the --allow-no-sex flag.
Warning: 'rs560740897' is missing from the main dataset, and is a top variant.
Warning: 'rs200588723' is missing from the main dataset, and is a top variant.
Warning: 'rs751721382' is missing from the main dataset, and is a top variant.
154 more top variant IDs missing; see log file.
Using 1e-6, 25 SNPs are regarded as fixed effect.
Clumping time:  0.785 s.
400 individuals to be included from reference FAM file.
Calculating MAF of reference panel ...
723 SNPs to be included from reference BIM file.

I checked the out folder, and it is empty. My code is below; I modified it slightly from the manual.

dbslmm="/xxx/xxx/DBSLMM/software/dbslmm"
chmod 777 ${dbslmm}

### Parameters for DBSLMM

DBSLMM="/xxx/xxx/DBSLMM/software/DBSLMM.R"
summf="/xxx/xxx/DBSLMM/test_dat/summary_gemma_chr1.assoc.txt"
outPath="/xxx/xxx/DBSLMM/test_dat/out/"
plink="/xxx/xxx/plink"
ref="/xxx/xxx/DBSLMM/test_dat/ref_chr1"
blockf="/xxx/xxx/DBSLMM/test_dat/chr1.bed"

m=`cat ${summf}| wc -l` 
h2=0.5
nobs=`sed -n "2p" ${summf}| awk '{print $5}'`
nmis=`sed -n "2p" ${summf}| awk '{print $4}'`
n=$(echo "${nobs}+${nmis}" | bc -l)

Rscript ${DBSLMM} --summary ${summf} \
--outPath ${outPath} \
--plink ${plink} \
--dbslmm ${dbslmm} \
--model DBSLMM \
--ref ${ref} \
--n ${n} \
--nsnp ${m} \
--block ${blockf} \
--h2 0.5 \
--thread 1

Do you have any idea? Thank you so much for helping me.

biostat0903 commented 3 years ago

Hi Ruocheng, I think you can revise the assignment of dbslmm: dbslmm=/xxx/xxx/DBSLMM/software/dbslmm.

Best, Sheng

RuochengDong commented 3 years ago

I revised it, but I still got no output.

biostat0903 commented 3 years ago

Hi Ruocheng, I am sorry for the inconvenience. Please check that the block file is tab-separated and has no header line:

chr1    10583   1892607
chr1    1892607 3582736
chr1    3582736 4380811
chr1    4380811 5913893
chr1    5913893 7247335
chr1    7247335 9365199
chr1    9365199 10806984

Best, Sheng
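The format described above (three tab-separated columns, no header) can be checked mechanically. This is a minimal sketch using awk; the function name check_block_file is hypothetical, and the strictness (integer start and end, exactly three fields) is an assumption drawn from the sample lines in this thread rather than from the DBSLMM source.

```shell
#!/bin/sh
# Hypothetical helper: print "ok" if every line is chr<TAB>start<TAB>end with
# integer coordinates; otherwise report the first line that does not match.
check_block_file() {
    awk -F '\t' '
        NF != 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {
            printf "line %d is not chr<TAB>start<TAB>end: %s\n", NR, $0
            bad = 1
            exit
        }
        END { if (!bad) print (NR ? "ok" : "empty file") }
    ' "$1"
}

# Demo on two lines copied from the example in this thread.
demo=$(mktemp)
printf 'chr1\t10583\t1892607\nchr1\t1892607\t3582736\n' > "$demo"
check_block_file "$demo"
rm -f "$demo"
```

A header such as "chr start stop", or columns padded with spaces instead of tabs, is reported with its line number.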

biostat0903 commented 3 years ago

Is everything ok?

RuochengDong commented 3 years ago

Hi Sheng,

I actually downloaded the block bed file directly from here https://bitbucket.org/nygcresearch/ldetect-data/src/master/. Should I reformat it?

biostat0903 commented 3 years ago

Hi Ruocheng, I have uploaded the block files in DBSLMM format to https://github.com/biostat0903/DBSLMM/tree/master/block_data. You can use those files. If you still get errors, please let me know.

Best, Sheng
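Using the uploaded files is the simplest route. For anyone who would rather convert a block file they already have, the sketch below keeps only lines whose second and third fields are integers and rewrites them as tab-separated columns, which drops any header line. It assumes the input columns are chromosome, start, and end separated by spaces or tabs; the function name to_dbslmm_blocks is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write chr<TAB>start<TAB>end for every line whose
# second and third whitespace-separated fields are integers. Header lines
# and blank lines are skipped because they fail the numeric test.
to_dbslmm_blocks() {
    awk '$2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
        printf "%s\t%s\t%s\n", $1, $2, $3
    }' "$1"
}

# Demo: a space-padded file with a header line (made-up layout for illustration).
demo=$(mktemp)
printf 'chr  start  stop\nchr1  10583  1892607\nchr1  1892607  3582736\n' > "$demo"
to_dbslmm_blocks "$demo"
rm -f "$demo"
```

Redirect the output to a new file and pass that file to --block.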

biostat0903 commented 3 years ago

Hi Ruocheng, is everything OK? Best, Sheng

RuochengDong commented 3 years ago

Hi Sheng,

Sorry, I was at a conference last week and did not get a chance to try it.

I used your data to run the test code, and the model seems to run successfully. However, I got two files in the output path: "summary_gemma_chr1.dbslmm.badsnps" and "summary_gemma_chr1.dbslmm", the latter of which is empty.

sh DBSLMM_test.sh
Warning: Ignoring phenotypes of missing-sex samples.  If you don't want those 
phenotypes to be ignored, use the --allow-no-sex flag.
Warning: 'rs560740897' is missing from the main dataset, and is a top variant.
Warning: 'rs200588723' is missing from the main dataset, and is a top variant.
Warning: 'rs751721382' is missing from the main dataset, and is a top variant.
154 more top variant IDs missing; see log file.
Using 1e-6, 25 SNPs are regarded as fixed effect.
Clumping time:  0.242 s.
400 individuals to be included from reference FAM file.
Calculating MAF of reference panel ...
723 SNPs to be included from reference BIM file.
Reading summary data of small effect SNPs from [/project/EngelmanGroup/xxx/xxx/DBSLMM/test_dat/out/s_summary_gemma_chr1.txt]
Number of SNP missing: 274
Number of allele discrepency: 3
Number of maf discrepency:    1
After filtering, 0 small effect SNPs are selected.
Reading summary data of large effect SNPs from [/xxx/xxx/DBSLMM/test_dat/out/l_summary_gemma_chr1.txt]
Number of SNP missing: 0
Number of allele discrepency: 2
Number of maf discrepency:    2
After filtering, no large effect SNP is selected.
Fitting model...
Fitting time: 0.017051 seconds.

Are those the results I am supposed to get?

Thanks, Ruocheng

biostat0903 commented 3 years ago

Hi Ruocheng,

The two messages "After filtering, 0 small effect SNPs are selected." and "After filtering, no large effect SNP is selected." mean that no SNPs were included in the model. You can try your own dataset. Thank you for your interest.

Best, Sheng
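Before trying another dataset, it may help to see how many SNP IDs the summary statistics and the reference panel have in common, since an empty result can follow from little or no overlap. The sketch below is only a rough sanity check and does not reproduce DBSLMM's own filtering (allele and MAF checks are not covered). It assumes the SNP ID is in column 2 of the summary file, which has one header line, and relies on column 2 of a PLINK .bim file being the variant ID. The function name count_shared_snps is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count SNP IDs present in both files.
#   $1: summary statistics, one header line, SNP ID in column 2 (assumed)
#   $2: PLINK .bim file, variant ID in column 2
count_shared_snps() {
    awk 'NR == FNR { ref[$2] = 1; next }   # first file (.bim): remember IDs
         FNR > 1 && ($2 in ref) { n++ }    # second file: skip header, count hits
         END { print n + 0 }' "$2" "$1"
}

# Demo on tiny made-up files; the IDs here are illustrative only.
summ=$(mktemp)
bim=$(mktemp)
printf 'chr\trs\tps\n1\trsA\t100\n1\trsB\t200\n' > "$summ"
printf '1\trsB\t0\t200\tA\tG\n1\trsC\t0\t300\tC\tT\n' > "$bim"
count_shared_snps "$summ" "$bim"
rm -f "$summ" "$bim"
```

With the variables from the script earlier in this thread, the call would be count_shared_snps "${summf}" "${ref}.bim".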

biostat0903 / DBSLMM

A beginner's question regarding the tutorial code #17