Closed iqbal-lab closed 6 years ago
Here's the build report:
{
"start_time": "1516974116",
"end_time": "1517046657",
"total_runtime": 72541,
"prg_build_report": {
"command": "perl /Net/fs1/home/zam/dev/git/gramtools_virtualenv/lib/python3.5/site-packages/gramtools/utils/vcf_to_linear_prg.pl --outfile ./gram-all/prg --vcf /data2/users/zam/analyses/2018/0122_test_gramtools_on_human/human_1000g_ALL.vcf --ref /data2/users/zam/analyses/2018/0122_test_gramtools_on_human/Homo_sapiens.GRCh37.60.dna.WHOLE_GENOME.fa",
"return_value_is_0": true,
"stdout": [
"Finished printing linear PRG. Final number in alphabet is 158548838"
]
},
"gramtools_cpp_build": {
"command": "/Net/fs1/home/zam/dev/git/gramtools_virtualenv/lib/python3.5/site-packages/gramtools/bin/gram build --gram ./gram-all --kmer-size 15 --max-read-size 150 --debug",
"return_value_is_0": false,
"stdout": [
"Executing build command",
"Generating integer encoded PRG",
"Number of charecters in integer encoded linear PRG: 3455590684",
"Generating FM-Index",
"Generating PRG masks",
"Generating kmer index"
]
},
"current_working_directory": "/data2/users/zam/analyses/2018/0122_test_gramtools_on_human",
"paths": {
"encoded_prg": "./gram-all/encoded_prg",
"kmer_index": "./gram-all/kmers/kmer_index_15",
"reference": "/data2/users/zam/analyses/2018/0122_test_gramtools_on_human/Homo_sapiens.GRCh37.60.dna.WHOLE_GENOME.fa",
"perl_generated_fa": "./gram-all/perl_generated_fa",
"project": "./gram-all",
"vcf": "/data2/users/zam/analyses/2018/0122_test_gramtools_on_human/human_1000g_ALL.vcf",
"perl_generated_vcf": "./gram-all/perl_generated_vcf",
"fm_index": "./gram-all/fm_index",
"variant_site_mask": "./gram-all/variant_site_mask",
"allele_mask": "./gram-all/allele_mask",
"build_report": "./gram-all/build_report.json",
"prg": "./gram-all/prg"
},
"path_hashes": {
"prg": "469bdd5ca3a78f30113956045d6fdd081aed09769b69603e5e1e2c55fca7348d",
"fm_index": "ffd2c9c7c4dd5ac8401fc12fefb8addee553e17d22d22712b38bb2e8c6883bde",
"variant_site_mask": "d10899d3f84b2cd6ec9ce65c8ada59fffbb2da320fb96954bf99f26a82706034",
"encoded_prg": "579a3cbd44f5c7bbce8ceeb0806707a7db03962322ccbd94c1d272ed8a090ae4",
"allele_mask": "f9b4482efc31bf9979d835a4ed120fbb62893d4ccf40f9c8b294a396d42e548c",
"reference": "37ec37a464033d63f3332b4f2ecc615055e0cf8b463e99c88ecfbfda3befe521",
"perl_generated_fa": "1c49ba82697767847a3bbb19aec247f4d1ae77ebdeb9676437ab0f7984f54b0e",
"vcf": "311ad432b2da5bdd719b536313fa1fc48833c8adf3efcc6a7ee8ed3fe019973b",
"perl_generated_vcf": "13b4634203628ca979e9f2a4028fefbcfe93961e2a3014c81102a79d3c1d5a82"
},
"version_report": {
"version_number": "0.5.0",
"last_git_commit_hash": "7a53428da84d096805ee51679b60d376cc588cb2",
"current_git_branch": "master",
"truncated_git_commits": [
"7a53428 - Robyn Ffrancon, 4 minutes ago : disable unit tests which depend on unstable ordered data structure",
"31d8c13 - Robyn Ffrancon, 20 minutes ago : remove biopython dependancy",
"be2838b - Robyn Ffrancon, 3 hours ago : updated install instructions given new python3 functionality",
"addd790 - Robyn Ffrancon, 4 hours ago : added instructions for installing without root",
"26094ea - Robyn Ffrancon, 21 hours ago : fix: gram executable called from gramtools python module with correct enviroment variables for finding libraries"
]
}
}
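For anyone reading the report above, here is a minimal Python sketch that pulls out the two facts that matter: how long the build ran and which step returned non-zero. The function name `summarise` and the trimmed `excerpt` dict are my own illustration, not part of gramtools; the only assumption is that the report keeps the key names shown above.

```python
def summarise(report):
    """Return (elapsed_seconds, failed_steps) for a build report dict."""
    # start_time and end_time are stored as strings of Unix seconds.
    elapsed = int(report["end_time"]) - int(report["start_time"])
    # A step counts as failed when it carries return_value_is_0 == false.
    failed = [
        name
        for name, step in report.items()
        if isinstance(step, dict) and step.get("return_value_is_0") is False
    ]
    return elapsed, failed


# Trimmed copy of the report above (values copied verbatim).
excerpt = {
    "start_time": "1516974116",
    "end_time": "1517046657",
    "total_runtime": 72541,
    "prg_build_report": {"return_value_is_0": True},
    "gramtools_cpp_build": {"return_value_is_0": False},
}

elapsed, failed = summarise(excerpt)
hours, remainder = divmod(elapsed, 3600)
print(f"elapsed: {elapsed} s ({hours} h {remainder // 60} min)")
print("failed steps:", failed)
```

On the values above, the elapsed time agrees with `total_runtime` (72541 s, a little over 20 hours), and the only failing step is `gramtools_cpp_build`, whose last stdout line is "Generating kmer index".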
For the record, it worked with the reduced PRG (all variants above 5% frequency).
Just to be clear, I recall you saying that it ran on a machine with 1 TB of RAM. Does the program have access to all of that RAM? Is it competing with other programs?
It was competing, but over 300 GB of RAM was free. Rachel was using 200-400 GB of RAM and Jerome was using 200 GB.
Also, dmesg did not show any messages from the OS about the oom-killer etc.
The way forward with this issue is to monitor and profile memory usage. A std::bad_alloc exception is normally thrown when the program requests more memory than the system can provide.
Yep. Let's try again at EBI, on a non-shared machine as a start.
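As a starting point for the memory monitoring suggested above, here is a rough Python sketch that runs a command and reports the peak resident set size of the child process. It uses only the standard `resource` and `subprocess` modules (Unix only). The helper name `run_and_report_peak_rss` is hypothetical, and the demo command is a stand-in, not the real `gram build` invocation. Note that `ru_maxrss` is reported in kilobytes on Linux but in bytes on macOS.

```python
import resource
import subprocess
import sys


def run_and_report_peak_rss(command):
    """Run command and return (exit_code, peak_rss) for waited-for children.

    peak_rss is ru_maxrss as reported by the OS: kilobytes on Linux.
    """
    completed = subprocess.run(command)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return completed.returncode, usage.ru_maxrss


# Stand-in workload: a child Python process that allocates about 50 MB.
demo_command = [sys.executable, "-c", "x = bytearray(50 * 1024 * 1024)"]
exit_code, peak_rss = run_and_report_peak_rss(demo_command)
print("exit code:", exit_code)
print("peak RSS reported by the OS:", peak_rss)
```

This gives only a single peak figure after the process has exited. To see which build stage drives the growth, sampling the process while it runs (or a heap profiler) would be needed.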
This error occurred because the process could not allocate additional memory. It occurred whilst generating the kmer index with a kmer size of 15. Since this issue was last updated, the default kmer size has been reduced to 5. Other memory optimizations have also been implemented.
I'm closing this issue for now because I don't believe this occurred as a result of a code bug.
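To give a sense of why the kmer size matters here, this small Python sketch computes the upper bound on the number of distinct kmers over a four-letter DNA alphabet. This is only a bound on the key space. How many kmers gramtools actually indexes, and how much memory each entry costs, is not stated in this thread, so no memory figure is implied. The function name `max_distinct_kmers` is my own.

```python
def max_distinct_kmers(kmer_size, alphabet_size=4):
    """Upper bound on the number of distinct kmers: alphabet_size ** kmer_size."""
    return alphabet_size ** kmer_size


for k in (5, 15):
    print(f"k = {k}: at most {max_distinct_kmers(k)} distinct kmers")

# Ratio between the two key spaces is 4 ** (15 - 5).
ratio = max_distinct_kmers(15) // max_distinct_kmers(5)
print("ratio:", ratio)
```

With k = 5 the bound is 1024 kmers; with k = 15 it is 1073741824, a factor of 1048576 larger.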
Building PRG for 1000 Genomes dataset (all variants)