AJResearchGroup / nsphs_ml_qt

R package for nsphs_ml_qt
GNU General Public License v3.0

Always do a PLINK analysis on the data #30

Closed: richelbilderbeek closed this 2 years ago

richelbilderbeek commented 2 years ago

In #27 we run a PLINK analysis on the data used for #5, which gives a nice comparison. The same analysis should be added here as well, so that it is always run on the data.

richelbilderbeek commented 2 years ago

[richel@sens2021565-bianca ~]$ tail 26_assoc_qt_issue_2*
==> 26_assoc_qt_issue_28_1.log <==
Do quantitative trait association
Error in plinkr::check_plink_is_installed(plink_options) : 
  PLINK is not installed. 
Executable is not found 
PLINK exe path: ~/.local/share/plinkr/plink_1_9_unix/plink 
Tip: run 'plinkr::install_plinks()'
Calls: <Anonymous> -> <Anonymous> -> <Anonymous> -> <Anonymous>
Execution halted
End time: 2022-05-02T15:23:47+0200
Duration: 4 seconds

==> 26_assoc_qt_issue_29_1000.log <==
Do quantitative trait association
Error in plinkr::check_plink_is_installed(plink_options) : 
  PLINK is not installed. 
Executable is not found 
PLINK exe path: ~/.local/share/plinkr/plink_1_9_unix/plink 
Tip: run 'plinkr::install_plinks()'
Calls: <Anonymous> -> <Anonymous> -> <Anonymous> -> <Anonymous>
Execution halted
End time: 2022-05-02T15:24:01+0200
Duration: 4 seconds

==> 26_assoc_qt_issue_29_1.log <==
Do quantitative trait association
Error in plinkr::check_plink_is_installed(plink_options) : 
  PLINK is not installed. 
Executable is not found 
PLINK exe path: ~/.local/share/plinkr/plink_1_9_unix/plink 
Tip: run 'plinkr::install_plinks()'
Calls: <Anonymous> -> <Anonymous> -> <Anonymous> -> <Anonymous>
Execution halted
End time: 2022-05-02T15:19:51+0200
Duration: 5 seconds
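
All three logs fail on the same check: no executable is found at the reported path. A minimal sketch of that kind of check in shell follows. The function name `plink_is_installed` is made up for illustration and is not part of plinkr; only the path and the `plinkr::install_plinks()` tip come from the logs above.

```shell
#!/bin/sh
# Hypothetical helper (not part of plinkr): succeed only if a regular,
# executable file exists at the given path.
plink_is_installed() {
  [ -f "$1" ] && [ -x "$1" ]
}

# Path taken from the error message above; "$HOME" replaces "~" so that
# the path still expands inside double quotes.
plink_exe="$HOME/.local/share/plinkr/plink_1_9_unix/plink"

if plink_is_installed "$plink_exe"; then
  echo "PLINK found at $plink_exe"
else
  echo "PLINK not found at $plink_exe; in R, run plinkr::install_plinks()"
fi
```

Running this before submitting a job would show the problem without waiting for the R script to halt.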

richelbilderbeek commented 2 years ago

Do quantitative trait association
Error in plinkr::run_plink(args = args, plink_options = plink_options,  : 
  Error: Failed to open /home/richel/data_issue_28_ae//assoc_qt_temp.bed. 
Called PLINK with commands: 
/opt/plinkr/plink_1_9_unix/plink --bed /home/richel/data_issue_28_ae//assoc_qt_temp.bed --bim /home/richel/data_issue_28_ae//assoc_qt_temp.bim --fam /home/richel/data_issue_28_ae//assoc_qt_temp.fam --pheno /home/richel/data_issue_28_ae//assoc_qt_temp.phe --all-pheno --assoc --maf 2.2250738585072e-308 --out /home/richel/data_issue_28_ae//assoc_qt --allow-extra-chr --chr-set 95
Tip: you should be able to copy-paste this
Calls: <Anonymous> -> <Anonymous> -> <Anonymous>
Execution halted
End time: 2022-05-03T10:33:36+0200
Duration: 6 seconds
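
PLINK is found now, but it cannot open the `.bed` file it is given. A rough sketch of a pre-flight check on the four input files passed via `--bed`, `--bim`, `--fam` and `--pheno` follows. The function name `check_plink_inputs` is hypothetical; the prefix is copied from the failing command above.

```shell
#!/bin/sh
# Hypothetical helper: print every PLINK input file that is missing for
# the given prefix and return non-zero if at least one is missing.
check_plink_inputs() {
  prefix="$1"
  status=0
  for ext in bed bim fam phe; do
    if [ ! -f "$prefix.$ext" ]; then
      echo "missing: $prefix.$ext"
      status=1
    fi
  done
  return "$status"
}

# Prefix taken from the failing command above.
if check_plink_inputs "/home/richel/data_issue_28_ae/assoc_qt_temp"; then
  echo "all input files present"
fi
```

This only tells which files are absent; it does not explain why they were not created.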

richelbilderbeek commented 2 years ago

Works!

[richel@sens2021565-bianca data_issue_28_ae]$ head assoc_qt.P1.qassoc 
 CHR           SNP         BP    NMISS       BETA         SE         R2        T            P 
   1             .  153953041      870    0.06833     0.1539  0.0002271   0.4441       0.6571 
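
To look through a `.qassoc` file for associations below some p-value, something like the sketch below could be used. It assumes the column layout of the header shown above (P in column 9) and that missing p-values appear as `NA`; the function name `qassoc_below_p` and the threshold are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the header plus every row of a .qassoc file
# whose P value (column 9) is below the given threshold.
# Rows where P is "NA" are skipped.
qassoc_below_p() {
  awk -v max_p="$2" 'NR == 1 || ($9 != "NA" && $9 + 0 < max_p + 0)' "$1"
}

# Example with the header and the single row shown above.
qassoc_file="$(mktemp)"
cat > "$qassoc_file" <<'EOF'
 CHR           SNP         BP    NMISS       BETA         SE         R2        T            P
   1             .  153953041      870    0.06833     0.1539  0.0002271   0.4441       0.6571
EOF
# The row is printed here because 0.6571 is below the threshold of 0.7.
qassoc_below_p "$qassoc_file" 0.7
rm -f "$qassoc_file"
```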

richelbilderbeek commented 2 years ago

But do it in the _ae folder!

Do quantitative trait association
qassoc_filenames$qassoc_filenames: 
 * /home/richel/data_issue_28_ae/assoc_qt.P1.qassoc
qassoc_filenames$log_filename: /home/richel/data_issue_28_ae//assoc_qt.log
End time: 2022-05-03T12:46:19+0200
Duration: 6 seconds
[richel@sens2021565-bianca ~]$ ls
21_create_issue_28_1000_params.log  26_assoc_qt_issue_28_1.log  data_issue_28_100   README.md
21_create_issue_28_100_params.log   98_clean_bianca.sh          data_issue_28_1000  richel-sens2021565
21_create_issue_28_10_params.log    bin                         data_issue_28_ae    script.R
21_create_issue_28_1_params.log     data_issue_28_1             GenoCAE
22_create_issue_28_1_data.log       data_issue_28_10            nsphs_ml_qt

richelbilderbeek commented 2 years ago

[richel@sens2021565-bianca data_issue_28_1]$ cat experiment_params.csv 
parameter,value
datadir,/home/richel/data_issue_28_1/
data,data_issue_28_1
superpops,
model_id,M1
train_opts_id,ex3
data_opts_id,b_0_4
trainedmodeldir,/home/richel/data_issue_28_ae/

richelbilderbeek commented 2 years ago

Works!

Do quantitative trait association
qassoc_filenames$qassoc_filenames: 
 * /home/richel/data_issue_28_1_ae/assoc_qt.P1.qassoc
qassoc_filenames$log_filename: /home/richel/data_issue_28_1_ae//assoc_qt.log
End time: 2022-05-03T13:11:17+0200
Duration: 6 seconds

richelbilderbeek commented 2 years ago

A window of 1 Mb (1000 kb) contains 4450 SNPs:

[richel@sens2021565-bianca data_issue_28_1000]$ cat data_issue_28_1000.bim | wc
   4450   26700  117988

Running PLINK on those SNPs takes less than 1 second (start and end time in the log are identical):

[richel@sens2021565-bianca ~]$ cat data_issue_28_1000_ae/assoc_qt.log 
PLINK v1.90b6.22 64-bit (16 Apr 2021)
Options in effect:
  --all-pheno
  --allow-extra-chr
  --allow-no-sex
  --assoc
  --bed /home/richel/data_issue_28_1000/data_issue_28_1000.bed
  --bim /home/richel/data_issue_28_1000/data_issue_28_1000.bim
  --chr-set 95
  --fam /home/richel/data_issue_28_1000/data_issue_28_1000.fam
  --maf 2.2250738585072e-308
  --out /home/richel/data_issue_28_1000_ae//assoc_qt
  --pheno /home/richel/data_issue_28_1000/data_issue_28_1000.phe

Hostname: sens2021565-b9.uppmax.uu.se
Working directory: /home/richel
Start time: Tue May  3 13:11:22 2022

Random number seed: 1651576282
117871 MB RAM detected; reserving 58935 MB for main workspace.
4450 variants loaded from .bim file.
870 samples (0 males, 0 females, 870 ambiguous) loaded from .fam.
Ambiguous sex IDs written to /home/richel/data_issue_28_1000_ae//assoc_qt.nosex
.
870 phenotype values present after --pheno.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 870 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.999424.
431 variants removed due to minor allele threshold(s)
(--maf/--max-maf/--mac/--max-mac).
4019 variants and 870 samples pass filters and QC.
Phenotype data is quantitative.
870 phenotype values present after --pheno.
Writing QT --assoc report to
/home/richel/data_issue_28_1000_ae//assoc_qt.P1.qassoc ... done.

End time: Tue May  3 13:11:22 2022
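
The counts in this log are consistent: 4450 variants loaded minus 431 removed by the MAF filter leaves 4019. A small sketch of how that check could be scripted follows. The function name `check_variant_counts` is hypothetical, and it assumes the MAF filter is the only filter removing variants, as in the log above.

```shell
#!/bin/sh
# Hypothetical helper: read a PLINK log and check that
# loaded - removed == passing, using the three log lines shown above.
check_variant_counts() {
  awk '
    /variants loaded from .bim file/           { loaded = $1 }
    /variants removed due to minor allele/     { removed = $1 }
    /variants and [0-9]+ samples pass filters/ { passing = $1 }
    END {
      printf "loaded=%d removed=%d passing=%d\n", loaded, removed, passing
      exit (loaded - removed == passing) ? 0 : 1
    }
  ' "$1"
}

# Example with the three relevant lines copied from the log above.
log_file="$(mktemp)"
cat > "$log_file" <<'EOF'
4450 variants loaded from .bim file.
431 variants removed due to minor allele threshold(s)
4019 variants and 870 samples pass filters and QC.
EOF
check_variant_counts "$log_file"  # prints: loaded=4450 removed=431 passing=4019
rm -f "$log_file"
```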

richelbilderbeek commented 2 years ago

The number of SNPs per window size in kilobases:

[richel@sens2021565-bianca ~]$ cat data_issue_28_1/data_issue_28_1.bim |wc
     10      60     227
[richel@sens2021565-bianca ~]$ cat data_issue_28_10/data_issue_28_10.bim | wc
     48     288    1202
[richel@sens2021565-bianca ~]$ cat data_issue_28_100/data_issue_28_100.bim | wc
    469    2814   12206
[richel@sens2021565-bianca ~]$ cat data_issue_28_1000/data_issue_28_1000.bim | wc
   4450   26700  117988
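
The four counts above can also be collected in one go. The sketch below assumes the folder layout `<base_dir>/data_issue_28_<kb>/data_issue_28_<kb>.bim` seen in the commands above and relies on a `.bim` file having one line per variant; the function name `count_snps_per_window` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print "window_kb n_snps" for each window size,
# skipping window sizes for which no .bim file exists.
count_snps_per_window() {
  base_dir="$1"
  for kb in 1 10 100 1000; do
    bim="$base_dir/data_issue_28_$kb/data_issue_28_$kb.bim"
    if [ -f "$bim" ]; then
      # "wc -l < file" prints only the line count, without the file name.
      echo "$kb $(wc -l < "$bim" | tr -d ' ')"
    fi
  done
}

count_snps_per_window "$HOME"
```

Using `wc -l` with input redirection gives just the number of lines, which is the first of the three numbers that `cat file | wc` prints.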