iqbal-lab closed this issue 6 years ago
By the way, I have run other gramtools builds with the same commit, and they worked fine.
I couldn't reproduce this bug on my laptop. I will continue investigating. This is the build report that I get:
{
"start_time": "1528706198",
"end_time": "1528706381",
"total_runtime": 183,
"return_value_is_0": true,
"prg_build_report": {
"command": "perl /home/rffrancon/Documents/gramtools/gramtools/utils/vcf_to_linear_prg.pl --outfile ./gram/prg --vcf /home/rffrancon/data/tmp/mhc.vcf --ref /home/rffrancon/data/tmp/ref.fa",
"return_value_is_0": true,
"stdout": [
"Finished printing linear PRG. Final number in alphabet is 23690"
]
},
"gramtools_cpp_build": {
"command": "/home/rffrancon/Documents/gramtools/gramtools/bin/gram build --gram ./gram --kmer-size 10 --max-read-size 150 --max-threads 1",
"return_value_is_0": true,
"stdout": [
"maximum thread count: 1",
"Executing build command",
"Generating integer encoded PRG",
"Number of charecters in integer encoded linear PRG: 27269526",
"Maximum alphabet character: 23690",
"Generating FM-Index",
"Generating PRG masks",
"Building kmer index (kmer size: 10)",
"Getting all kmers",
"Getting kmer prefix diffs",
"Indexing kmers",
"Total number of unique kmers: 1048576",
"",
"Progress: 50000 of 1048576",
"Progress: 100000 of 1048576",
"Progress: 150000 of 1048576",
"Progress: 200000 of 1048576",
"Progress: 250000 of 1048576",
"Progress: 300000 of 1048576",
"Progress: 350000 of 1048576",
"Progress: 400000 of 1048576",
"Progress: 450000 of 1048576",
"Progress: 500000 of 1048576",
"Progress: 550000 of 1048576",
"Progress: 600000 of 1048576",
"Progress: 650000 of 1048576",
"Progress: 700000 of 1048576",
"Progress: 750000 of 1048576",
"Progress: 800000 of 1048576",
"Progress: 850000 of 1048576",
"Progress: 900000 of 1048576",
"Progress: 950000 of 1048576",
"Progress: 1000000 of 1048576",
"",
"Timer report:",
" seconds",
" Encoded PRG 1.13",
" Generate FM-Index 111.67",
"Generating PRG masks 56.71",
" Building kmer index 6.82",
"",
"Total elapsed time: 176.33"
]
},
"kmer_size": 10,
"max_read_length": 150,
"current_working_directory": "/home/rffrancon/data/tmp",
"paths": {
"project": "./gram",
"vcf": "/home/rffrancon/data/tmp/mhc.vcf",
"reference": "/home/rffrancon/data/tmp/ref.fa",
"prg": "./gram/prg",
"encoded_prg": "./gram/encoded_prg",
"variant_site_mask": "./gram/variant_site_mask",
"allele_mask": "./gram/allele_mask",
"fm_index": "./gram/fm_index",
"perl_generated_vcf": "./gram/perl_generated_vcf",
"perl_generated_fa": "./gram/perl_generated_fa",
"build_report": "./gram/build_report.json"
},
"path_hashes": {
"vcf": "bffc7acae3e03e7e8a374ca5e924ff7a46e6b4c80d1326f47b60e5f9b09987bb",
"reference": "e2f96fa99bad5bcb6fbd9be34622997b953fc41f330757b20e2dd80826eb2cc6",
"prg": "a23ef2b8a530c5ae7c762b9118b7be8ae7c60c576a8fd2d65faf228949565421",
"encoded_prg": "c0e139694c6cbeaa3d53a86383df0c97894c28a557316b4de8f99eea6cf2d2b0",
"variant_site_mask": "8907a743d01d39c62067659d85293451c976dcace410492a37458c38cf61a88b",
"allele_mask": "9183cce0c73eafcf811e9a41be2409ac00ab52db812cc847e96fbd8a8e012681",
"fm_index": "aa9eb2fd5d7abe8bc8f7e39b977dc97916742321a5c6c45d6f5751dea4965ae0",
"perl_generated_vcf": "bffc7acae3e03e7e8a374ca5e924ff7a46e6b4c80d1326f47b60e5f9b09987bb",
"perl_generated_fa": "c98b69fc51de4f1e12fec9f9ed78f6d4db0da57f1f4b2c7d6dbcf5abc7c76208",
"build_report": "2171501a329b4ba321374aace0a34b808995df1f1f32be7e2d47b544f46139d8"
},
"version_report": {
"version_number": "0.5.0",
"last_git_commit_hash": "12ab776c2452aacd6c8a2248c834907b62b93f66",
"truncated_git_commits": [
"12ab776 - Robyn Ffrancon, 1528306963 : enhancement: added kmer size to stdout status message",
"42018b4 - Robyn Ffrancon, 1528305111 : fix: when generating all kmers, correct ordering assured; increased kmer size threshold: <= 10",
"72786ae - Robyn Ffrancon, 1528213271 : fix: travis build log within memory limit",
"e440cc1 - Robyn Ffrancon, 1528202882 : enhancement: note added to stdout to explain that read counts include reverse complement",
"58608bf - Robyn Ffrancon, 1528201130 : enhancement: infer command can produce vcf output if build used vcf + reference"
]
}
}
Similar output when running on ebi-cli. Can't reproduce this bug. @iqbal-lab can you reproduce this?
Build report from ebi-cli:
(gram_venv) [rff@ebi-login-002 tmp]$ cat gram_k10/build_report.json
{
"start_time": "1528712877",
"end_time": "1528713070",
"total_runtime": 193,
"return_value_is_0": true,
"prg_build_report": {
"command": "perl /homes/rff/gram_venv/lib/python3.6/site-packages/gramtools/utils/vcf_to_linear_prg.pl --outfile ./gram_k10/prg --vcf /homes/rff/tmp/mhc.vcf --ref /homes/rff/tmp/ref.fa",
"return_value_is_0": true,
"stdout": [
"Finished printing linear PRG. Final number in alphabet is 23690"
]
},
"gramtools_cpp_build": {
"command": "/homes/rff/gram_venv/lib/python3.6/site-packages/gramtools/bin/gram build --gram ./gram_k10 --kmer-size 10 --max-read-size 150 --max-threads 1 --debug",
"return_value_is_0": true,
"stdout": [
"maximum thread count: 1",
"Executing build command",
"Generating integer encoded PRG",
"Number of charecters in integer encoded linear PRG: 27269526",
"Maximum alphabet character: 23690",
"Generating FM-Index",
"Generating PRG masks",
"Building kmer index (kmer size: 10)",
"Getting all kmers",
"Getting kmer prefix diffs",
"Indexing kmers",
"Total number of unique kmers: 1048576",
"",
"Progress: 50000 of 1048576",
"Progress: 100000 of 1048576",
"Progress: 150000 of 1048576",
"Progress: 200000 of 1048576",
"Progress: 250000 of 1048576",
"Progress: 300000 of 1048576",
"Progress: 350000 of 1048576",
"Progress: 400000 of 1048576",
"Progress: 450000 of 1048576",
"Progress: 500000 of 1048576",
"Progress: 550000 of 1048576",
"Progress: 600000 of 1048576",
"Progress: 650000 of 1048576",
"Progress: 700000 of 1048576",
"Progress: 750000 of 1048576",
"Progress: 800000 of 1048576",
"Progress: 850000 of 1048576",
"Progress: 900000 of 1048576",
"Progress: 950000 of 1048576",
"Progress: 1000000 of 1048576",
"",
"Timer report:",
" seconds",
" Encoded PRG 1.25",
" Generate FM-Index 103.2",
"Generating PRG masks 66.52",
" Building kmer index 7.31",
"",
"Total elapsed time: 178.28"
]
},
"kmer_size": 10,
"max_read_length": 150,
"current_working_directory": "/homes/rff/tmp",
"paths": {
"project": "./gram_k10",
"vcf": "/homes/rff/tmp/mhc.vcf",
"reference": "/homes/rff/tmp/ref.fa",
"prg": "./gram_k10/prg",
"encoded_prg": "./gram_k10/encoded_prg",
"variant_site_mask": "./gram_k10/variant_site_mask",
"allele_mask": "./gram_k10/allele_mask",
"fm_index": "./gram_k10/fm_index",
"perl_generated_vcf": "./gram_k10/perl_generated_vcf",
"perl_generated_fa": "./gram_k10/perl_generated_fa",
"build_report": "./gram_k10/build_report.json"
},
"path_hashes": {
"vcf": "bffc7acae3e03e7e8a374ca5e924ff7a46e6b4c80d1326f47b60e5f9b09987bb",
"reference": "e2f96fa99bad5bcb6fbd9be34622997b953fc41f330757b20e2dd80826eb2cc6",
"prg": "a23ef2b8a530c5ae7c762b9118b7be8ae7c60c576a8fd2d65faf228949565421",
"encoded_prg": "c0e139694c6cbeaa3d53a86383df0c97894c28a557316b4de8f99eea6cf2d2b0",
"variant_site_mask": "8907a743d01d39c62067659d85293451c976dcace410492a37458c38cf61a88b",
"allele_mask": "9183cce0c73eafcf811e9a41be2409ac00ab52db812cc847e96fbd8a8e012681",
"fm_index": "aa9eb2fd5d7abe8bc8f7e39b977dc97916742321a5c6c45d6f5751dea4965ae0",
"perl_generated_vcf": "bffc7acae3e03e7e8a374ca5e924ff7a46e6b4c80d1326f47b60e5f9b09987bb",
"perl_generated_fa": "c98b69fc51de4f1e12fec9f9ed78f6d4db0da57f1f4b2c7d6dbcf5abc7c76208"
},
"version_report": {
"version_number": "0.5.0",
"last_git_commit_hash": "12ab776c2452aacd6c8a2248c834907b62b93f66",
"truncated_git_commits": [
"12ab776 - Robyn Ffrancon, 1528306963 : enhancement: added kmer size to stdout status message",
"42018b4 - Robyn Ffrancon, 1528305111 : fix: when generating all kmers, correct ordering assured; increased kmer size threshold: <= 10",
"72786ae - Robyn Ffrancon, 1528213271 : fix: travis build log within memory limit",
"e440cc1 - Robyn Ffrancon, 1528202882 : enhancement: note added to stdout to explain that read counts include reverse complement",
"58608bf - Robyn Ffrancon, 1528201130 : enhancement: infer command can produce vcf output if build used vcf + reference"
]
}
}
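Comparing the two reports by eye is error-prone, so here is a minimal sketch of how the `path_hashes` sections could be diffed. The function names `load_report` and `diff_path_hashes` are made up for illustration and are not part of gramtools; the only assumption is that each report is a JSON file with a `path_hashes` object like the ones above.

```python
import json


def load_report(path: str) -> dict:
    """Load a build_report.json file into a dict."""
    with open(path) as handle:
        return json.load(handle)


def diff_path_hashes(report_a: dict, report_b: dict) -> dict:
    """Return {name: (hash_a, hash_b)} for entries that differ.

    An entry that is missing from one report shows up with None
    on that side.
    """
    hashes_a = report_a.get("path_hashes", {})
    hashes_b = report_b.get("path_hashes", {})
    differences = {}
    for name in sorted(set(hashes_a) | set(hashes_b)):
        if hashes_a.get(name) != hashes_b.get(name):
            differences[name] = (hashes_a.get(name), hashes_b.get(name))
    return differences


# Example usage (paths taken from the two reports above):
# laptop = load_report("gram/build_report.json")
# cluster = load_report("gram_k10/build_report.json")
# print(diff_path_hashes(laptop, cluster))
```

For the two reports pasted here, this should list only `build_report`, which is present in the laptop report and absent from the ebi-cli one; every other hash matches.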
I can't reproduce it either.
Command
It takes 11 seconds: the Perl script runs to build the linear PRG string, and then gramtools fails with no helpful error. I'm not sure what happened.
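When a run fails silently like this, one generic way to get more information is to wrap the command and inspect its exit status and stderr. The sketch below is not part of gramtools; `run_and_describe` is a hypothetical helper, and the command passed to it is whatever invocation failed. On POSIX systems a negative return code from `subprocess` means the process was terminated by a signal (for example `SIGKILL` or `SIGSEGV`), which would explain the lack of an error message.

```python
import signal
import subprocess
import sys


def run_and_describe(command: list) -> str:
    """Run a command and summarise how it ended, plus the tail of stderr."""
    result = subprocess.run(command, capture_output=True, text=True)
    code = result.returncode
    if code == 0:
        status = "exited normally"
    elif code < 0:
        # On POSIX, -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-code).name
        except ValueError:
            name = str(-code)
        status = f"killed by signal {name}"
    else:
        status = f"exited with code {code}"
    stderr_tail = "\n".join(result.stderr.splitlines()[-20:])
    return f"{status}\n--- last stderr lines ---\n{stderr_tail}"


# Example usage with a stand-in command that fails on purpose:
# print(run_and_describe([sys.executable, "-c", "import sys; sys.exit(3)"]))
```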