hm reports 'nan' for all parameters at every iteration

IanMcDowell commented 10 years ago

Hi, Using eqtlbma-1.2.2, I first run eqtlbma with the parameters corresponding to BF_{BMA}, that is, "--bfs all --pbf all" with 4 subgroups and everything seems to run smoothly and all BFs are reported for all snp-gene pairs. However, when I run hm like this:

/launch_hm.bash \ --p2b /path/to/hm \ --inp /path/to/"eqtlbma_out.bfbma.gene100*_l10abfs_raw.txt.gz" \ --nbC 15 \ --nbG 10 \ --skip-ci \ --outp /path/to/eqtlbma_hm_out_test &

The stderr reports 'nan' for all parameters and all iterations and runs, seemingly, ad infinitum. There are indeed 15 active configurations (i.e. 4 subgroups) and 10 grid points (in --gridS) and the temporary input_hm_XXXXX.txt file shows reasonable BFs for all configs by grid points. I feel like I may be missing something simple here, any recommendations?

Thanks so much, Ian

timflutre commented 10 years ago

It's indeed a bit puzzling. In the next release (which is almost ready), I integrated more tightly my code (computing the BFs) and the one from my collaborator (fitting the HM). Therefore, I'd suggest you to install the latest version of the master branch (see the wiki) and re-run the programs. The interface changed a little bit, but not that much. Let me know if you have difficulties installing it and if the problem persists.

IanMcDowell commented 10 years ago

I cannot install eqtlbma-master because there is no "configure" script. The zip eqtlbma-1.2.2 and eqtlbma-master have the same files except that eqtlbma-1.2.2 has "configure" and "Makefile.in" which are absent in eqtlbma-master. Would it be appropriate to take "configure" from eqtlbma-1.2.2 and drop it into eqtlbma-master folder, or do you have a new "configure" script or alternate solution? Thanks for your very timely help, Ian

timflutre commented 10 years ago

Can you try with this command: "autoreconf --install --force --symlink"? If it doesn't work, try with these: "libtoolize; automake --add-missing; autoreconf --force". They will generate a configure file.

IanMcDowell commented 10 years ago

The install has worked, however, I'm having trouble running eqtlbma_bf. I ran eqtlbma genome-wide with 4 subgroups successfully using version 1.2.2., however, the same commands (modified to accommodate the changes to 1.3) do no succeed in the new version.

If I run eqtlbma_bf with gcoord, scoord, and snp file:

eqtlbma_bf --geno /path/to/pointer_to_genotypes_by_tissue.chr9.txt --exp /path/to/pointer_to_phenotypes_by_tissue.txt --anchor TSS+TES --cis 400000 --bfs all --gcoord /path/to/eqtlbma_fcoord.gene9529.bed --scoord /path/to/chr9.info4.bed.txt.gz --snp /path/to/eqtlbma_snps.gene9529.txt --type join --outss --out /path/to/eqtlbma_bf_lincQTL_out/eqtlbma_out.bfbma.join.gene9529 --gridL /path/to/eqtlbma_input/grid_phi2_oma2_general.txt.gz --gridS /path/to/eqtlbma_input/grid_phi2_oma2_with-configs.txt.gz --nperm 10000 --seed 1234 --trick 2 --tricut 10 --pbf all --error hybrid

When the genotype data is being loaded, I get the error msg: ERROR: not enough columns on line 2 of file /path/to/chr9.info4.eqtlbma.txt.gz (560 != 561)

Which is untrue. there are 560 columns in the header and in all subsequent lines (with space as the delimiter). It is chiamo format and it worked just fine with 1.2.2.

If I run eqtlbma_bf with gcoord and scoord (no snp file):

eqtlbma_bf --geno /path/to/pointer_to_genotypes_by_tissue.chr9.txt --exp /path/to/pointer_to_phenotypes_by_tissue.txt --anchor TSS+TES --cis 400000 --bfs all --gcoord /path/to/eqtlbma_fcoord.gene9529.bed --scoord /path/to/chr9.info4.bed.txt.gz --type join --outss --out /path/to/eqtlbma_bf_lincQTL_out/eqtlbma_out.bfbma.join.gene9529 --gridL /path/to/eqtlbma_input/grid_phi2_oma2_general.txt.gz --gridS /path/to/eqtlbma_input/grid_phi2_oma2_with-configs.txt.gz --nperm 10000 --seed 1234 --trick 2 --tricut 10 --pbf all --error hybrid

ERROR: not enough columns on line 2 of file /path/to/chr9.info4.eqtlbma.txt.gz (560 != 561)

However, surprisingly, if I run eqtlbma_bf with gcoord and snp file (no scoord), I get the following STDERR:

load file /path/to/eqtlbma_snps.gene9529.txt ...nb of SNPs to keep: 2659 load file /path/to/pointer_to_phenotypes_by_tissue.txt ... items loaded: 4 load file /path/to/pointer_to_genotypes_by_tissue.chr9.txt ... items loaded: 4 analyze 4 subgroups (identifier): adipose (1) artery (2) lung (3) skin_ll (4) load samples ... nb of samples (explevels): 167 adipose: 103 samples artery: 117 samples lung: 121 samples skin_ll: 105 samples nb of samples (genotypes): 185 adipose: 185 samples artery: 185 samples lung: 185 samples skin_ll: 185 samples total nb of samples: 185 nb of samples (covariates): 0 nb of samples with genotype and exp level (pairs of subgroups): adipose-artery: 76 in both, 27 in adipose, 41 in artery adipose-lung: 73 in both, 30 in adipose, 48 in lung adipose-skin_ll: 72 in both, 31 in adipose, 33 in skin_ll artery-lung: 87 in both, 30 in artery, 34 in lung artery-skin_ll: 73 in both, 44 in artery, 32 in skin_ll lung-skin_ll: 77 in both, 44 in lung, 28 in skin_ll load gene coordinates ... total nb of genes with coordinates: 1 load gene expression levels ... adipose (/path/to/adipose_eqtlbma_phenotype.txt.gz): 5056 genes (to keep: 1) artery (/path/to/artery_eqtlbma_phenotype.txt.gz): 4830 genes (to keep: 1) lung (/path/to/lung_eqtlbma_phenotype.txt.gz): 5479 genes (to keep: 1) skin_ll (/path/to/skin_ll_eqtlbma_phenotype.txt.gz): 5334 genes (to keep: 1) total nb of genes to analyze: 1 load genotypes and SNP coordinates ... adipose (/path/to/chr9.info4.eqtlbma.txt.gz): 766885 SNPs (to keep: 2659, loaded in 120.93 sec) duplicate genotypes for other subgroups (same file) ... artery (/path/to/chr9.info4.eqtlbma.txt.gz): 2659 SNPs duplicated in 0.01 sec lung (/path/to/chr9.info4.eqtlbma.txt.gz): 2659 SNPs duplicated in 0.01 sec skin_ll (/path/to/chr9.info4.eqtlbma.txt.gz): 2659 SNPs duplicated in 0.01 sec nb of SNPs: 2659 load grid in /path/to/eqtlbma_input/grid_phi2_oma2_general.txt.gz ... grid size: 25 load grid in /path/to/eqtlbma_input/grid_phi2_oma2_with-configs.txt.gz ... grid size: 10 look for association between each pair gene-SNP ... likelihood=normal errors=hybrid prop_cov_errors=0.50 anchor=TSS+TES radius=400000 ==================================================100.00% (0.00 sec) nb of analyzed gene-SNP pairs: 0 (0 genes) get gene-level p-values by permuting expression levels ... permutations=10000 seed=1234 trick=2 trick_cutoff=10 perm_bf=all threads=1 join==================================================100.00% (0.00 sec) write results of summary statistics in each subgroup ... file /path/to/eqtlbma_bf_lincQTL_out/eqtlbma_out.bfbma.join.gene9529_sumstats_adipose.txt.gz file /path/to/eqtlbma_bf_lincQTL_out/eqtlbma_out.bfbma.join.gene9529_sumstats_artery.txt.gz file /path/to/eqtlbma_bf_lincQTL_out/eqtlbma_out.bfbma.join.gene9529_sumstats_lung.txt.gz file /path/to/eqtlbma_bf_lincQTL_out/eqtlbma_out.bfbma.join.gene9529_sumstats_skin_ll.txt.gz write results of Bayes Factors (one per grid value) ... file /path/to/eqtlbma_bf_lincQTL_out/eqtlbma_out.bfbma.join.gene9529_l10abfs_raw.txt.gz write results of gene-level P-values, all subgroups jointly ... file /path/to/eqtlbma_bf_lincQTL_out/eqtlbma_out.bfbma.join.gene9529_joinPermPvals.txt.gz END eqtlbma_bf 2014-02-25 23:22:16 elapsed -> 0d 0h 2m 2s max.mem -> 19516 kB

As you can see it "worked" in some sense but no SNP-gene pairs were tested and so no results were send to STDOUT,

gcoord file is: chr9 141090383 141093775 gene9529

I've also tried just the TSS, that is, chr9 141090383 141090383 gene9529

(also, I ran the above commands with --anchor TSS instead of TSS+TES)

scoord is: chr9 90801 90801 rs117629664 chr9 91012 91012 rs141104331 chr9 91046 91046 rs139850491 chr9 91941 91941 rs138824502 chr9 93432 93432 rs145650527 ...

genotype files are as described in manual

I'm at a loss, I've tried all combinations and can find nothing wrong in my bed files, --exp files, or --geno files. Plus, because the same files worked with v. 1.2.2., I have some validation that they should be correct. Any suggestions? Thanks again,

Ian

timflutre commented 10 years ago

1) Did you try the command "autoreconf --install --force --symlink"? Did it work for you?

2) The code tries to guess the format of the genotype files. When the --scoord is given, the code doesn't expect the genotypes to be in the CHIAMO/IMPUTE format (nor the VCF format) because this format already contains the SNP coordinates. As a result, the code expects to find the genotypes in the custom format, which creates the error you saw. The manual is not clear about this and the error message even less. I'm sorry about this and I have changed the code. You can now retrieve the latest version of the master branch and let me know if the problem persists.

3) In the BED format, the start coordinate should be 0-based, and the end coordinate should be 1-based. See the UCSC documentation. This means that you shouldn't have something like "chr9 90801 90801 rs117629664". I also updated the code to detect such a case.

4) Only use --anchor TSS for the moment, not --anchor TSS+TES. I realized that, to do it well, the latter needs strand information and the code should allow different cis lengths for TSS and TES, which I didn't implement yet.

IanMcDowell commented 10 years ago

The cmd "autoreconf --install --force --symlink" did not work for me. Instead, I used:

libtoolize cat /path/to/bin/share/aclocal/libtool.m4 \ /path/to/bin/share/aclocal/ltoptions.m4 \ /path/to/bin/share/aclocal/ltversion.m4 \ /path/to/bin/share/aclocal/ltsugar.m4 \ /path/to/bin/share/aclocal/lt~obsolete.m4 >> aclocal.m4 printf "AC_CONFIG_MACRO_DIR([m4])" >> configure.ac vi Makefile.am [add:] ACLOCAL_AMFLAGS = -I m4 [save] libtoolize aclocal automake --add-missing autoreconf --force

All tests succeed.

As far as the problems described above, I see now that the scoord was formatted incorrectly, but scoord was not necessary anyway given CHIAMO format.

Still, I've tried every combination of properly formatted input and eqtlbma_bf seems to work perfectly but fails predictably here:

likelihood=normal errors=hybrid prop_cov_errors=0.50 anchor=TSS radius=400000 ==================================================100.00% (0.00 sec) nb of analyzed gene-SNP pairs: 0 (0 genes) get gene-level p-values by permuting expression levels ... permutations=10000 seed=1234 trick=2 trick_cutoff=10 perm_bf=all threads=1 join==================================================100.00% (0.00 sec)

The gene(s) load properly, the SNPs load, but together there are no gene-SNP pairs (yet trust me, there are +2000 in the range given). I'm trying to troubleshoot, but if you have any idea what might be happening at this step, that would be a great help.

timflutre commented 10 years ago

Can you give the full command-line?

IanMcDowell commented 10 years ago

/path/to/bin/bin/eqtlbma_bf \ --geno /path/to/eqtlbma_auxiliary/pointer_to_genotypes_by_tissue.chr9.txt \ --exp /path/to/eqtlbma_auxiliary/pointer_to_phenotypes_by_tissue.txt \ --anchor TSS --cis 400000 --bfs all --type join --outss \ --gcoord /path/to/eqtlbma_auxiliary/eqtlbma_gcoord.199.bed.gz \ --out /path/to/eqtlbma_bf_lincQTL_out/eqtlbma_out.bfbma.join.199 \ --gridL /path/to/eqtlbma_input/grid_phi2_oma2_general.txt.gz \ --gridS /path/to/eqtlbma_input/grid_phi2_oma2_with-configs.txt.gz \ --nperm 10000 --seed 1234 --trick 2 --tricut 10 --pbf all --error hybrid

where
pointer_to_genotypes_by_tissue.chr9.txt = adipose /path/to/chr9.info4.eqtlbma.txt.gz artery /path/to/chr9.info4.eqtlbma.txt.gz lung /path/to/chr9.info4.eqtlbma.txt.gz skin_ll /path/to/chr9.info4.eqtlbma.txt.gz

pointer_to_phenotypes_by_tissue.txt = adipose /path/to/adipose_eqtlbma_phenotype.txt.gz artery /path/to/artery_eqtlbma_phenotype.txt.gz lung /path/to/lung_eqtlbma_phenotype.txt.gz skin_ll /path/to/skin_ll_eqtlbma_phenotype.txt.gz

zcat /path/to/eqtlbma_gcoord.199.bed.gz = chr9 98782013 98784042 gene9417 1000 - chr9 98787997 98790247 gene9419 1000 - chr9 98857492 98864194 gene9420 1000 + chr9 99449337 99483403 gene9421 1000 + chr9 100000045 100000960 gene9423 1000 - chr9 100505489 100507217 gene9424 1000 + chr9 102117618 102118546 gene9427 1000 - chr9 102117691 102137509 gene9430 1000 - chr9 102128698 102134794 gene9431 1000 - chr9 105902804 106087316 gene9436 1000 -

timflutre commented 10 years ago

1) Can you relaunch it with -v 2 or -v 3? (a higher verbose level) And what do you get on stdout/stderr?

2) Do you have missing exp levels? If yes, how did you encode the missing values? Here is what is expected: if gene A is present in subgroup 1 but absent in subgroup 2, it should still be present in both files, with "NA" for the file corresponding to subgroup 2. Does it help?

IanMcDowell commented 10 years ago

2) I have no missing exp levels, I am specifically testing genes that are expressed across all subgroups

The problem instead seems to stem from a failure to find cis SNPs (and I am absolutely certain that there are cis SNPs). Also, the tutorial example works, so that routine to find cis SNPs is not completely failing in my environment, but possibly, SNP coords are incorrectly stored or read from CHIAMO format. (I am guessing because the tutorial uses 'custom' format.) 1) -v 3: ... total nb of genes with coordinates: 10 load gene expression levels ... adipose (/path/to/adipose_eqtlbma_phenotype.txt.gz): 5056 genes (to keep: 10) artery (/path/to/artery_eqtlbma_phenotype.txt.gz): 4830 genes (to keep: 10) lung (/path/to/lung_eqtlbma_phenotype.txt.gz): 5479 genes (to keep: 10) skin_ll (/path/to/skin_ll_eqtlbma_phenotype.txt.gz): 5334 genes (to keep: 10) total nb of genes to analyze: 10 load genotypes and SNP coordinates ... adipose (/path/to/chr9.info4.eqtlbma.txt.gz): 766885 SNPs (to keep: 766885, loaded in 77.88 sec) duplicate genotypes for other subgroups (same file) ... artery (/path/to/chr9.info4.eqtlbma.txt.gz): 766885 SNPs duplicated in 1.61 sec lung (/path/to/chr9.info4.eqtlbma.txt.gz): 766885 SNPs duplicated in 1.74 sec skin_ll (/path/to/chr9.info4.eqtlbma.txt.gz): 766885 SNPs duplicated in 1.88 sec nb of SNPs: 766885 load grid in /path/to/eqtlbma_input/grid_phi2_oma2_general.txt.gz ... grid size: 25 load grid in /path/to/eqtlbma_input/grid_phi2_oma2_with-configs.txt.gz ... grid size: 10 look for association between each pair gene-SNP ... likelihood=normal errors=hybrid prop_cov_errors=0.50 anchor=TSS radius=100000 gene gene9417 WARNING: skip gene gene9417 because it has no SNP in cis gene gene9419 WARNING: skip gene gene9419 because it has no SNP in cis gene gene9420 WARNING: skip gene gene9420 because it has no SNP in cis gene gene9421 WARNING: skip gene gene9421 because it has no SNP in cis gene gene9423 WARNING: skip gene gene9423 because it has no SNP in cis gene gene9424 WARNING: skip gene gene9424 because it has no SNP in cis gene gene9427 WARNING: skip gene gene9427 because it has no SNP in cis gene gene9430 WARNING: skip gene gene9430 because it has no SNP in cis gene gene9431 WARNING: skip gene gene9431 because it has no SNP in cis gene gene9436 WARNING: skip gene gene9436 because it has no SNP in cis nb of analyzed gene-SNP pairs: 0 (0 genes) get gene-level p-values by permuting expression levels ... permutations=10000 seed=1234 trick=2 trick_cutoff=10 perm_bf=all threads=1 gene gene9417 gene gene9419 gene gene9420 gene gene9421 gene gene9423 gene gene9424 gene gene9427 gene gene9430 gene gene9431 gene gene9436 write results of summary statistics in each subgroup ... file /path/to/eqtlbma_bf_lincQTL_out/eqtlbma_out.bfbma.join.199_sumstats_adipose.txt.gz file /path/to/eqtlbma_bf_lincQTL_out/eqtlbma_out.bfbma.join.199_sumstats_artery.txt.gz file /path/to/eqtlbma_bf_lincQTL_out/eqtlbma_out.bfbma.join.199_sumstats_lung.txt.gz file /path/to/eqtlbma_bf_lincQTL_out/eqtlbma_out.bfbma.join.199_sumstats_skin_ll.txt.gz write results of Bayes Factors (one per grid value) ... file /path/to/eqtlbma_bf_lincQTL_out/eqtlbma_out.bfbma.join.199_l10abfs_raw.txt.gz write results of gene-level P-values, all subgroups jointly ... file /path/to/eqtlbma_bf_lincQTL_out/eqtlbma_out.bfbma.join.199_joinPermPvals.txt.gz END eqtlbma_bf 2014-02-26 12:07:12 elapsed -> 0d 0h 1m 27s max.mem -> 4868508 kB

timflutre commented 10 years ago

Ok, I'll look into this and come back to you asap!

IanMcDowell commented 10 years ago

I found a workaround to the above issue by converting my genotypes to allele dosage format. So don't stress over solving that issue right away. It looks like allele dosage format + scoord is the way to go until the chiamo format is working better.

Now, all of my eqtlbma_bf associations run to completion but I very often get gzip stream (?) errors, like this: stderr: ... nb of analyzed gene-SNP pairs: 1439 (5 genes) write results of Bayes Factors (one per grid value) ... file /path/to/eqtlbma_out.bfbma.join.126_l10abfs_raw.txt.gz Segmentation fault (core dumped)

The errors are reproducible (and reproducibly absent from some genes). I am splitting up jobs to 1 gene at a time now, perhaps a smaller gzip stream will not cause faults.

IanMcDowell commented 10 years ago

Oh, and even under -v 3 verbosity, what I reproduced above is the most you can find out about that error without going into the code.

IanMcDowell commented 10 years ago

One gene at a time didn't work for all genes either. I still have the segmentation fault (and consequently unexpected end of gzip file) for some genes. So I don't have a workaround for this issue.

IanMcDowell commented 10 years ago

Update, the error occurs when I run some genes in eqtlbma_bf with --error hybrid, but not with --error uvlr, yet other genes run to completion without error, so it's not the output stream that is the issue (that was just a red herring due to the unexpected end of gzip file). Why might this be the case?

timflutre commented 10 years ago

Sorry, I was out of the office all day long. I'll be looking at this "--error hybrid" thing tomorrow (European time). As this never happened so far, I'll need to add verbose/debug messages and see. It'd help me a lot if you could send me, on my gmail address, a small subset of the data (e.g. one gene-snp pair) for which the error occurs. Would it be possible?

timflutre commented 10 years ago

I found the bug about the IMPUTE/CHIAMO format, and fixed it (with a new test). Sorry again for that.

Quick question: are you using --error hybrid because you have some individuals missing in some subgroups? Or because you have some missing exp levels in some subgroups for some genes? To find the cause of the seg fault, it'd really help me to get an minimal example of your data, because right now I can't replicate it.

IanMcDowell commented 10 years ago

Now I have to apologize, sorry for the delay, I just used --error uvlr instead, I'm working under a deadline and I had to get through eqtlbma_bf. I wanted to use hybrid because I have missing exp levels for some individuals in some subgroups. That is, individual_1 may have expression in tissue_type_1 and tissue_type_2, while individual_2 may have expression in tissue_type_2 and tissue_type_3, and so one for ~150 ind. and 4 tissue types.

I sent you files to recreate the errors I described.

After running eqtbma_bf (with error uvlr), I ran into some more problems with eqtlbma_hm. When I run the results of one gene (and all its snp pairs) in eqtlbma_hm I get the error "ERROR: log10(obslik) is NaN".

/data/engelhardtlab/lincrna_eqtl/bin/bin/eqtlbma_hm \ --data hm_test.gz \ --nsubgrp 4 --dim 15 --ngrid 10 \ --out hm_test_out

START eqtlbma_hm 2014-02-28 20:48:54 version 1.3 compiled Feb 27 2014 14:08:50 cmd-line: /path/bin/bin/eqtlbma_hm --data hm_test.gz --nsubgrp 4 --dim 15 --ngrid 10 --out hm_test_out cwd: /path/... load data ... nb of input files: 1 ==================================================100.00% finish loading 1 genes and 1706 gene-snp pairs (0.120000 sec, 17548 kB) run EM algorithm ... ERROR: log10(obslik) is NaN

When I run the results of several gene (and all their snp pairs) in eqtlbma_hm I get the another error, see below:

/path/bin/bin/eqtlbma_hm --data "eqtlbma_out.bfbma.join.029[0-9][0-9].true_l10abfs_raw.txt.gz" --nsubgrp 4 --dim 15 --ngrid 10 --out hm_test_multiple_out START eqtlbma_hm 2014-02-28 20:51:47 version 1.3 compiled Feb 27 2014 14:08:50 cmd-line: ... load data ... nb of input files: 100 ====== 12.00%terminate called after throwing an instance of 'std::logic_error' what(): basic_string::_S_construct NULL not valid Aborted (core dumped)

Again, to remind you all my tests ran successfully following compilation and this includes the eqtlbma_hm test. Some of the eqtlbma_bf results do indeed run to completion in eqtlbma_hm and some do not. I believe I've located the error. It appears that there are values returned by eqtlbma_bf, specifically "0.000000e+00", "-nan", and "nan", which -when located in certain cells - disrupt eqtlbma_hm. Because I used "--bfs all --pbf all", there should be non-zero, non-null values at all in fields "l10abf.grid1" through "l10abf.grid25" for gen, gen-fix, gen-maxh and "l10abf.grid1" through "l10abf.grid10" for all active configurations (e.g. '1-2', '1-2-3-4', etc.)

If I run a simple awk script (shown below, excuse the clunkiness), replacing "0.000000e+00", "-nan", and "nan" where they appear in the above mentioned cells with "1.000000e-99", then eqtlbma_hm can run to completion without error (shown below as well). Please tell me if this operation is not legitimate, it appears reasonable to me. I've sent you the file hm_test.gz so you can repeat these tests yourself.

zcat hm_test.gz hm_test

awk 'BEGIN{OFS=FS="\t"} \ (($4=="0.000000e+00")||($4=="nan")||($4=="-nan")) {$4="1.000000e-99"} \ (($5=="0.000000e+00")||($5=="nan")||($5=="-nan")) {$5="1.000000e-99"} \ (($6=="0.000000e+00")||($6=="nan")||($6=="-nan")) {$6="1.000000e-99"} \ (($7=="0.000000e+00")||($7=="nan")||($7=="-nan")) {$7="1.000000e-99"} \ (($8=="0.000000e+00")||($8=="nan")||($8=="-nan")) {$8="1.000000e-99"} \ (($9=="0.000000e+00")||($9=="nan")||($9=="-nan")) {$9="1.000000e-99"} \ (($10=="0.000000e+00")||($10=="nan")||($10=="-nan")) {$10="1.000000e-99"} \ (($11=="0.000000e+00")||($11=="nan")||($11=="-nan")) {$11="1.000000e-99"} \ (($12=="0.000000e+00")||($12=="nan")||($12=="-nan")) {$12="1.000000e-99"} \ (($13=="0.000000e+00")||($13=="nan")||($13=="-nan")) {$13="1.000000e-99"} \ (($14=="0.000000e+00")||($14=="nan")||($14=="-nan")) && (($3=="gen")||($3=="gen-fix")||($3=="gen-maxh")) {$14="1.000000e-99"} \ (($15=="0.000000e+00")||($15=="nan")||($15=="-nan")) && (($3=="gen")||($3=="gen-fix")||($3=="gen-maxh")) {$15="1.000000e-99"} \ (($16=="0.000000e+00")||($16=="nan")||($16=="-nan")) && (($3=="gen")||($3=="gen-fix")||($3=="gen-maxh")) {$16="1.000000e-99"} \ (($17=="0.000000e+00")||($17=="nan")||($17=="-nan")) && (($3=="gen")||($3=="gen-fix")||($3=="gen-maxh")) {$17="1.000000e-99"} \ (($18=="0.000000e+00")||($18=="nan")||($18=="-nan")) && (($3=="gen")||($3=="gen-fix")||($3=="gen-maxh")) {$18="1.000000e-99"} \ (($19=="0.000000e+00")||($19=="nan")||($19=="-nan")) && (($3=="gen")||($3=="gen-fix")||($3=="gen-maxh")) {$19="1.000000e-99"} \ (($10=="0.000000e+00")||($20=="nan")||($20=="-nan")) && (($3=="gen")||($3=="gen-fix")||($3=="gen-maxh")) {$20="1.000000e-99"} \ (($21=="0.000000e+00")||($21=="nan")||($21=="-nan")) && (($3=="gen")||($3=="gen-fix")||($3=="gen-maxh")) {$21="1.000000e-99"} \ (($22=="0.000000e+00")||($22=="nan")||($22=="-nan")) && (($3=="gen")||($3=="gen-fix")||($3=="gen-maxh")) {$22="1.000000e-99"} \ (($23=="0.000000e+00")||($23=="nan")||($23=="-nan")) && (($3=="gen")||($3=="gen-fix")||($3=="gen-maxh")) {$23="1.000000e-99"} \ (($24=="0.000000e+00")||($24=="nan")||($24=="-nan")) && (($3=="gen")||($3=="gen-fix")||($3=="gen-maxh")) {$24="1.000000e-99"} \ (($25=="0.000000e+00")||($25=="nan")||($25=="-nan")) && (($3=="gen")||($3=="gen-fix")||($3=="gen-maxh")) {$25="1.000000e-99"} \ (($26=="0.000000e+00")||($26=="nan")||($26=="-nan")) && (($3=="gen")||($3=="gen-fix")||($3=="gen-maxh")) {$26="1.000000e-99"} \ (($27=="0.000000e+00")||($27=="nan")||($27=="-nan")) && (($3=="gen")||($3=="gen-fix")||($3=="gen-maxh")) {$27="1.000000e-99"} \ (($28=="0.000000e+00")||($28=="nan")||($28=="-nan")) && (($3=="gen")||($3=="gen-fix")||($3=="gen-maxh")) {$28="1.000000e-99"} \ {print}' hm_test > hm_test_rezeroed

gzip hm_test_rezeroed

/path/bin/bin/eqtlbma_hm --data hm_test_rezeroed.gz --nsubgrp 4 --dim 15 --ngrid 10 --out hm_test_out START eqtlbma_hm 2014-02-28 21:18:15 version 1.3 compiled Feb 27 2014 14:08:50 cmd-line: ... load data ... nb of input files: 1 ==================================================100.00% finish loading 1 genes and 1706 gene-snp pairs (0.280000 sec, 18180 kB) run EM algorithm ... iter 0 loglik -3.6409e-02 pi0 5.0000e-01 configs 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 grid-points 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 iter 1 loglik -1.8507e-02 pi0 5.4373e-01 configs 6.0482e-02 6.7217e-02 7.6221e-02 6.3053e-02 5.5465e-02 6.0690e-02 6.1106e-02 8.3288e-02 5.8577e-02 7.4847e-02 6.3191e-02 5.5801e-02 7.0953e-02 7.6944e-02 7.2164e-02 grid-points 1.2402e-01 1.2499e-01 1.2420e-01 1.2604e-01 1.0578e-01 1.1097e-01 7.6952e-02 8.7782e-02 5.3254e-02 6.6004e-02 iter 2 loglik -6.6745e-03 pi0 5.6740e-01 configs 5.3729e-02 6.5817e-02 8.4476e-02 5.8306e-02 4.5713e-02 5.4854e-02 5.5241e-02 1.0246e-01 5.0881e-02 8.3350e-02 5.9429e-02 4.6173e-02 7.4630e-02 8.8482e-02 7.6463e-02 grid-points 1.4303e-01 1.4543e-01 1.4477e-01 1.4948e-01 1.0522e-01 1.1652e-01 5.5343e-02 7.2787e-02 2.6379e-02 4.1040e-02 EM ran for 0d 0h 0m 0s save the results in hm_test_out ... (0.00 sec) END eqtlbma_hm 2014-02-28 21:18:16 elapsed -> 0d 0h 0m 1s max.mem -> 18180 kB

Again, thanks for the help.

IanMcDowell commented 10 years ago

Update. It appears the error: "terminate called after throwing an instance of 'std::logic_error' what(): basic_string::Sconstruct NULL not valid Aborted (core dumped)" occurs for more reasons than the reasons listed in the last comment in the issue thread (that is, the zero and nan values in certain fields/cells).

As I demonstrate below, I try eqtlbma_hm on 100 files and it fails. I try it on 10 files and it fails. I try it on 3 files and it succeeds, so I expand the pattern to include one more file and try eqtlbma_hm on 4 files and it fails. I proceed to test eqtlbma_hm on each of these 4 files separately and it runs successfully on each. I've repeated this test on a number of eqtlbma_bf output files and I've found that eqtlbma_hm can run to completion when analyzing 4-7 eqtlbma_bf output files, then fails with any more, sometimes with "what(): basic_string::_S_construct NULL not valid", and other times with "log10(obslik) is NaN." And when any given individual file is fed into eqtlbma_hm it runs to completion.

Let's try eqtlbma_hm on 100 files:

/path/eqtlbma_hm --data /path/"eqtlbma_out.bfbma.join.002[0-9][0-9].true_rezeroed_l10abfs_raw.txt.gz" --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test START eqtlbma_hm 2014-03-01 09:29:18 version 1.3 compiled Feb 27 2014 14:08:50 cmd-line: /path/eqtlbma_hm --data /path/eqtlbma_out.bfbma.join.002[0-9][0-9].true_rezeroed_l10abfs_raw.txt.gz --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test cwd: /path/eqtlbma_hm_lincQTL_out load data ... nb of input files: 100 === 7.00%terminate called after throwing an instance of 'std::logic_error' what(): basic_string::_S_construct NULL not valid Aborted (core dumped)

How about 10 files?

/path/eqtlbma_hm --data /path/"eqtlbma_out.bfbma.join.0020[0-9].true_rezeroed_l10abfs_raw.txt.gz" --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test START eqtlbma_hm 2014-03-01 09:29:48 version 1.3 compiled Feb 27 2014 14:08:50 cmd-line: /path/eqtlbma_hm --data /path/eqtlbma_out.bfbma.join.0020[0-9].true_rezeroed_l10abfs_raw.txt.gz --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test cwd: /path/eqtlbma_hm_lincQTL_out load data ... nb of input files: 10 =================================== 70.00%terminate called after throwing an instance of 'std::logic_error' what(): basic_string::_S_construct NULL not valid Aborted (core dumped)

How about 3 files?

/path/eqtlbma_hm --data /path/"eqtlbma_out.bfbma.join.0020[7-9].true_rezeroed_l10abfs_raw.txt.gz" --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test START eqtlbma_hm 2014-03-01 09:30:03 version 1.3 compiled Feb 27 2014 14:08:50 cmd-line: /path/eqtlbma_hm --data /path/eqtlbma_out.bfbma.join.0020[7-9].true_rezeroed_l10abfs_raw.txt.gz --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test cwd: /path/eqtlbma_hm_lincQTL_out load data ... nb of input files: 3 ==================================================100.00% finish loading 3 genes and 2938 gene-snp pairs (0.720000 sec, 20704 kB) run EM algorithm ... iter 0 loglik 3.2746e+00 pi0 5.0000e-01 configs 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 grid-points 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 ... EM ran for 0d 0h 0m 2s save the results in /path/eqtlbma_hm_out.bfbma.join.true.test ... (0.00 sec) END eqtlbma_hm 2014-03-01 09:30:06 elapsed -> 0d 0h 0m 3s max.mem -> 20896 kB

That's the ticket!

Is there a problem file? Let's expand the number of files again to 4:

/path/eqtlbma_hm --data /path/"eqtlbma_out.bfbma.join.0020[6-9].true_rezeroed_l10abfs_raw.txt.gz" --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test START eqtlbma_hm 2014-03-01 09:30:16 version 1.3 compiled Feb 27 2014 14:08:50 cmd-line: /path/eqtlbma_hm --data /path/eqtlbma_out.bfbma.join.0020[6-9].true_rezeroed_l10abfs_raw.txt.gz --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test cwd: /path/eqtlbma_hm_lincQTL_out load data ... nb of input files: 4 ==================================================100.00% finish loading 5 genes and 4000 gene-snp pairs (0.940000 sec, 22840 kB) run EM algorithm ... ERROR: log10(obslik) is NaN

Ok, if that doesn't work, perhaps we can run each file [6-9] separately and figure out which one is the problem:

/path/eqtlbma_hm --data /path/"eqtlbma_out.bfbma.join.00206.true_rezeroed_l10abfs_raw.txt.gz" --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test START eqtlbma_hm 2014-03-01 09:30:38 version 1.3 compiled Feb 27 2014 14:08:50 cmd-line: /path/eqtlbma_hm --data /path/eqtlbma_out.bfbma.join.00206.true_rezeroed_l10abfs_raw.txt.gz --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test cwd: /path/eqtlbma_hm_lincQTL_out load data ... nb of input files: 1 ==================================================100.00% finish loading 1 genes and 1061 gene-snp pairs (0.180000 sec, 11996 kB) run EM algorithm ... iter 0 loglik -3.7330e-02 pi0 5.0000e-01 configs 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 grid-points 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 ... EM ran for 0d 0h 0m 0s save the results in /path/eqtlbma_hm_out.bfbma.join.true.test ... (0.00 sec) END eqtlbma_hm 2014-03-01 09:30:38 elapsed -> 0d 0h 0m 0s max.mem -> 11996 kB /path/eqtlbma_hm --data /path/"eqtlbma_out.bfbma.join.00207.true_rezeroed_l10abfs_raw.txt.gz" --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test START eqtlbma_hm 2014-03-01 09:30:49 version 1.3 compiled Feb 27 2014 14:08:50 cmd-line: /path/eqtlbma_hm --data /path/eqtlbma_out.bfbma.join.00207.true_rezeroed_l10abfs_raw.txt.gz --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test cwd: /path/eqtlbma_hm_lincQTL_out load data ... nb of input files: 1 ==================================================100.00% finish loading 1 genes and 1061 gene-snp pairs (0.180000 sec, 12008 kB) run EM algorithm ... iter 0 loglik -3.7293e-02 pi0 5.0000e-01 configs 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 grid-points 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 ... EM ran for 0d 0h 0m 1s save the results in /path/eqtlbma_hm_out.bfbma.join.true.test ... (0.00 sec) END eqtlbma_hm 2014-03-01 09:30:50 elapsed -> 0d 0h 0m 1s max.mem -> 12008 kB /path/eqtlbma_hm --data /path/"eqtlbma_out.bfbma.join.00208.true_rezeroed_l10abfs_raw.txt.gz" --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test START eqtlbma_hm 2014-03-01 09:30:58 version 1.3 compiled Feb 27 2014 14:08:50 cmd-line: /path/eqtlbma_hm --data /path/eqtlbma_out.bfbma.join.00208.true_rezeroed_l10abfs_raw.txt.gz --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test cwd: /path/eqtlbma_hm_lincQTL_out load data ... nb of input files: 1 ==================================================100.00% finish loading 1 genes and 859 gene-snp pairs (0.140000 sec, 10088 kB) run EM algorithm ... iter 0 loglik 1.3337e+00 pi0 5.0000e-01 configs 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 grid-points 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 ... EM ran for 0d 0h 0m 0s save the results in /path/eqtlbma_hm_out.bfbma.join.true.test ... (0.01 sec) END eqtlbma_hm 2014-03-01 09:30:58 elapsed -> 0d 0h 0m 0s max.mem -> 10088 kB /path/eqtlbma_hm --data /path/"eqtlbma_out.bfbma.join.00209.true_rezeroed_l10abfs_raw.txt.gz" --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test START eqtlbma_hm 2014-03-01 09:31:06 version 1.3 compiled Feb 27 2014 14:08:50 cmd-line: /path/eqtlbma_hm --data /path/eqtlbma_out.bfbma.join.00209.true_rezeroed_l10abfs_raw.txt.gz --nsubgrp 4 --dim 15 --ngrid 10 --out /path/eqtlbma_hm_out.bfbma.join.true.test cwd: /path/eqtlbma_hm_lincQTL_out load data ... nb of input files: 1 ==================================================100.00% finish loading 1 genes and 1018 gene-snp pairs (0.170000 sec, 11532 kB) run EM algorithm ... iter 0 loglik 1.9782e+00 pi0 5.0000e-01 configs 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 6.6667e-02 grid-points 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 1.0000e-01 ... EM ran for 0d 0h 0m 0s save the results in /path/eqtlbma_hm_out.bfbma.join.true.test ... (0.00 sec) END eqtlbma_hm 2014-03-01 09:31:07 elapsed -> 0d 0h 0m 1s max.mem -> 11532 kB

IanMcDowell commented 10 years ago

I have the feeling that you will be able to reproduce some of my errors yourself if you simulate genotypes with very low minor allele frequency. It may be that eqtlbma cannot handle genotypes wherein only a couple individuals have a copy of the minor allele. I am using imputed genotype data and I did not filter for MAF.

IanMcDowell commented 10 years ago

Update again. There are at least two problems with eqtlbma_hm, one is the inability to deal with very low minor allele frequency, which returns this error: ERROR: log10(obslik) is NaN

and the inability to properly merge a number of eqtlbma_bf output files, which returns this error: terminate called after throwing an instance of 'std::logic_error' what(): basic_string::Sconstruct NULL not valid Aborted (core dumped)

To solve the first error I removed all gene-snp pairs that were uninformative (BFs that are very nearly 0 or nan in the cells indicated above, that is, "l10abf.grid1" through "l10abf.grid25" for gen, gen-fix, gen-maxh records and "l10abf.grid1" through "l10abf.grid10" for all active configuration records).

To solve the second error, I manually concatenated all eqtlbma_bf results (with a singular header).

Now it works.

IanMcDowell commented 10 years ago

I have reason to believe now that some of the "nan" values that I disposed of were actually masking what should be informative bayes factor values. When I remove all gene-snp pairs that are uninformative (BFs that are very nearly 0 or nan in the cells indicated above, that is, "l10abf.grid1" through "l10abf.grid25" for gen, gen-fix, gen-maxh records and "l10abf.grid1" through "l10abf.grid10" for all active configuration records), 2924493 gene-snp pairs remain for my true (un-permuted) data whereas 3662514.3 gene-snp pairs (on average, with very little variance) remain if I do the same eqtlbma workflow on permuted data (my own permutation procedure, simple random label reshuffle). I hypothesize (fear) that the +700 K gene-snp pairs filtered in the true (un-permuted) data may then be informative SNPs. I'll try to get you some example data, but I would, again, recommend simply generating some data that is a little more messy than the tutorial simulation data (low MAF for one thing).

timflutre commented 10 years ago

Sorry for the delay. I have issues with my laptop right now, so I'll let you know asap. For the moment, you can filter SNPs with low MAF (say < 5%) yourself or using option --maf 0.05.

IanMcDowell commented 10 years ago

That's okay. Yes, I know that option is available, but not all researchers will want to filter SNPs with low MAF (myself included). In future releases, perhaps you should make the default minimum MAF 0.05 or consider handling lower allele frequencies, because otherwise eqtlbma may yield strange results.

timflutre commented 10 years ago

Hi Ian, I solved my laptop problem and found some time to look into your issues.

1) Your detailed reports (thanks for that!) mention that eqtlbma can't handle SNPs with low minor allele frequencies (MAF). To start with, such SNPs should not be a problem as, in theory, the Bayes factor deals automatically with them (with low MAF, the effects of such SNPs will have very large std errors). To test this, and following your advice, I now added to the tutorial a default of 10% of the SNPs with a MAF below 5% (more specifically, their MAF is centered on 0.02). The R script now also plots the histogram of the MAF per SNP, so you can easily check. Moreover, as expected by the theory, when carrying on with the tutorial, we can see that, in practice too, these SNPs don't cause the observed log-likelihood to be NaN in eqtlbma_hm, and therefore don't need to be filtered out beforehand with option --maf in eqtlbma_bf (even if it's possible). I hence don't really know what is causing what you observe. Maybe there is something specific to your data set?

2) You also mention that eqtlbma_hm is unable to properly merge several output files from eqtlbma_bf. I checked this by splitting the data simulated in the tutorial. Indeed, for some files, the C++ code reads "one line too far" (more details here). I recently changed this part of the code to read files quicker. However, this causes sometimes the kind of problem you encountered. I thus reversed back to the previous version, which is less efficient, but at least doesn't have the bug!

Don't hesitate to follow-up on these if you think I missed something.

timflutre commented 10 years ago

About 2), I got some help on SO and was able to solve the problem. I committed the change on the "dev" branch. Can you quickly install it and let me know if it works for you too? So that I can then merge it with the "master" branch. (Retrieve the "dev" branch via this URL; then "unzip dev", and "cd eqtlbma-dev/; autoreconf ...", etc.)

IanMcDowell commented 10 years ago

I installed it and eqtlbma_hm now successfully can handle the merger of many input files.

On Mon, Mar 10, 2014 at 5:03 PM, Timothée Flutre notifications@github.comwrote:

About 2), I got some help on SOhttp://stackoverflow.com/a/22286767/597069and was able to solve the problem. I committed the change on the "dev" branch. Can you quickly install it and let me know if it works for you too? So that I can then merge it with the "master" branch. (Retrieve the "dev" branch via this URL https://github.com/timflutre/eqtlbma/archive/dev.zip; then "unzip dev", and "cd eqtlbma-dev/; autoreconf ...", etc.)

Reply to this email directly or view it on GitHubhttps://github.com/timflutre/eqtlbma/issues/2#issuecomment-37234008 .

timflutre commented 10 years ago

Thanks for your contribution on this: I just merged the dev branch to the master branch. I'll now close the issue, but re-open it if necessary.

IanMcDowell commented 10 years ago

Hi Tim,

--error hybrid now works okay for me. You may want to change default --maf from 0.0 to something minimal but non-zero, like 0.01, otherwise, 'nan' will be reported in cells that will cause eqtlbma_hm to fault.

Thanks

timflutre / eqtlbma

hm reports 'nan' for all parameters at every iteration #2