iqbal-lab-org / gramtools

Genome inference from a population reference graph
MIT License

exit from gramtools build with non-zero exit code #133

Closed iqbal-lab closed 6 years ago

iqbal-lab commented 6 years ago
$ pwd
/nfs/research1/zi/projects/gramtools/nonstandard_datasets/mhc/dilthey1

Command

bsub.py 20 mhc10 gramtools build --gram-directory ./gram_k10 --vcf mhc.vcf --reference ref.fa  --max-read-length 150 --kmer-size 10 --debug
gramtools --version
{
    "version_number": "0.5.0",
    "last_git_commit_hash": "12ab776c2452aacd6c8a2248c834907b62b93f66",
    "truncated_git_commits": [
        "12ab776 - Robyn Ffrancon, 1528306963 : enhancement: added kmer size to stdout status message",
        "42018b4 - Robyn Ffrancon, 1528305111 : fix: when generating all kmers, correct ordering assured; increased kmer size threshold: <= 10",
        "72786ae - Robyn Ffrancon, 1528213271 : fix: travis build log within memory limit",
        "e440cc1 - Robyn Ffrancon, 1528202882 : enhancement: note added to stdout to explain that read counts include reverse complement",
        "58608bf - Robyn Ffrancon, 1528201130 : enhancement: infer command can produce vcf output if build used vcf + reference"
    ]
}

The run takes 11 seconds: the Perl script runs and builds the linear PRG string, and then gramtools fails without a helpful error message. I'm not sure what happened.

cat gram_k10/build_report.json 
{
    "start_time": "1528454130",
    "end_time": "1528454142",
    "total_runtime": 12,
    "return_value_is_0": false,
    "prg_build_report": {
        "command": "perl /nfs/research1/zi/software/gramtools_tip_20180606/lib/python3.6/site-packages/gramtools/utils/vcf_to_linear_prg.pl --outfile ./gram_k10/prg --vcf /nfs/research1/zi/projects/gramtools/nonstandard_datasets/mhc/dilthey1/mhc.vcf --ref /nfs/research1/zi/projects/gramtools/nonstandard_datasets/mhc/dilthey1/ref.fa",
        "return_value_is_0": true,
        "stdout": [
            "Finished printing linear PRG. Final number in alphabet is  23690"
        ]
    },
    "gramtools_cpp_build": {
        "command": "/nfs/research1/zi/software/gramtools_tip_20180606/lib/python3.6/site-packages/gramtools/bin/gram build --gram ./gram_k10 --kmer-size 10 --max-read-size 150 --max-threads 1 --debug",
        "return_value_is_0": false,
        "stdout": [
            "maximum thread count: 1",
            "Executing build command",
            "Generating integer encoded PRG"
        ]
    },
    "kmer_size": 10,
    "max_read_length": 150,
    "current_working_directory": "/nfs/research1/zi/projects/gramtools/nonstandard_datasets/mhc/dilthey1",
    "paths": {
        "project": "./gram_k10",
        "vcf": "/nfs/research1/zi/projects/gramtools/nonstandard_datasets/mhc/dilthey1/mhc.vcf",
        "reference": "/nfs/research1/zi/projects/gramtools/nonstandard_datasets/mhc/dilthey1/ref.fa",
        "prg": "./gram_k10/prg",
        "encoded_prg": "./gram_k10/encoded_prg",
        "variant_site_mask": "./gram_k10/variant_site_mask",
        "allele_mask": "./gram_k10/allele_mask",
        "fm_index": "./gram_k10/fm_index",
        "perl_generated_vcf": "./gram_k10/perl_generated_vcf",
        "perl_generated_fa": "./gram_k10/perl_generated_fa",
        "build_report": "./gram_k10/build_report.json"
    },
    "path_hashes": {
        "vcf": "bffc7acae3e03e7e8a374ca5e924ff7a46e6b4c80d1326f47b60e5f9b09987bb",
        "reference": "e2f96fa99bad5bcb6fbd9be34622997b953fc41f330757b20e2dd80826eb2cc6",
        "prg": "a23ef2b8a530c5ae7c762b9118b7be8ae7c60c576a8fd2d65faf228949565421",
        "perl_generated_vcf": "bffc7acae3e03e7e8a374ca5e924ff7a46e6b4c80d1326f47b60e5f9b09987bb",
        "perl_generated_fa": "c98b69fc51de4f1e12fec9f9ed78f6d4db0da57f1f4b2c7d6dbcf5abc7c76208"
    },
    "version_report": {
        "version_number": "0.5.0",
        "last_git_commit_hash": "12ab776c2452aacd6c8a2248c834907b62b93f66",
        "truncated_git_commits": [
            "12ab776 - Robyn Ffrancon, 1528306963 : enhancement: added kmer size to stdout status message",
            "42018b4 - Robyn Ffrancon, 1528305111 : fix: when generating all kmers, correct ordering assured; increased kmer size threshold: <= 10",
            "72786ae - Robyn Ffrancon, 1528213271 : fix: travis build log within memory limit",
            "e440cc1 - Robyn Ffrancon, 1528202882 : enhancement: note added to stdout to explain that read counts include reverse complement",
            "58608bf - Robyn Ffrancon, 1528201130 : enhancement: infer command can produce vcf output if build used vcf + reference"
        ]
    }
}
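
For triage, the failing stage can be read out of a report like the one above programmatically. The following is a minimal sketch, assuming the layout shown above, where each stage is a nested object with its own `return_value_is_0` flag and `stdout` list. The function name `failed_stages` is hypothetical and not part of gramtools.

```python
import json


def failed_stages(report):
    """Return (stage_name, last_stdout_line) for every stage that reported failure.

    `report` is the parsed build_report.json. Only nested objects whose
    `return_value_is_0` is explicitly False are treated as failed stages.
    """
    failures = []
    for name, value in report.items():
        # Skip scalars such as "kmer_size" and objects without a return flag
        # (for example "paths" or "path_hashes").
        if isinstance(value, dict) and value.get("return_value_is_0") is False:
            stdout = value.get("stdout") or []
            failures.append((name, stdout[-1] if stdout else None))
    return failures


def load_report(path):
    """Parse a build report from disk."""
    with open(path) as handle:
        return json.load(handle)
```

Calling `failed_stages(load_report("gram_k10/build_report.json"))` on the report above should point at `gramtools_cpp_build`, whose last message is "Generating integer encoded PRG", which suggests the C++ step stopped during or right after integer encoding.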
iqbal-lab commented 6 years ago

By the way, I have run other gramtools builds with the same commit, and they worked fine.

ffranr commented 6 years ago

I couldn't reproduce this bug on my laptop. I will continue investigating. This is the build report that I get:

{
    "start_time": "1528706198",
    "end_time": "1528706381",
    "total_runtime": 183,
    "return_value_is_0": true,
    "prg_build_report": {
        "command": "perl /home/rffrancon/Documents/gramtools/gramtools/utils/vcf_to_linear_prg.pl --outfile ./gram/prg --vcf /home/rffrancon/data/tmp/mhc.vcf --ref /home/rffrancon/data/tmp/ref.fa",
        "return_value_is_0": true,
        "stdout": [
            "Finished printing linear PRG. Final number in alphabet is  23690"
        ]
    },
    "gramtools_cpp_build": {
        "command": "/home/rffrancon/Documents/gramtools/gramtools/bin/gram build --gram ./gram --kmer-size 10 --max-read-size 150 --max-threads 1",
        "return_value_is_0": true,
        "stdout": [
            "maximum thread count: 1",
            "Executing build command",
            "Generating integer encoded PRG",
            "Number of charecters in integer encoded linear PRG: 27269526",
            "Maximum alphabet character: 23690",
            "Generating FM-Index",
            "Generating PRG masks",
            "Building kmer index (kmer size: 10)",
            "Getting all kmers",
            "Getting kmer prefix diffs",
            "Indexing kmers",
            "Total number of unique kmers: 1048576",
            "",
            "Progress: 50000 of 1048576",
            "Progress: 100000 of 1048576",
            "Progress: 150000 of 1048576",
            "Progress: 200000 of 1048576",
            "Progress: 250000 of 1048576",
            "Progress: 300000 of 1048576",
            "Progress: 350000 of 1048576",
            "Progress: 400000 of 1048576",
            "Progress: 450000 of 1048576",
            "Progress: 500000 of 1048576",
            "Progress: 550000 of 1048576",
            "Progress: 600000 of 1048576",
            "Progress: 650000 of 1048576",
            "Progress: 700000 of 1048576",
            "Progress: 750000 of 1048576",
            "Progress: 800000 of 1048576",
            "Progress: 850000 of 1048576",
            "Progress: 900000 of 1048576",
            "Progress: 950000 of 1048576",
            "Progress: 1000000 of 1048576",
            "",
            "Timer report:",
            "                       seconds",
            "         Encoded PRG      1.13",
            "   Generate FM-Index    111.67",
            "Generating PRG masks     56.71",
            " Building kmer index      6.82",
            "",
            "Total elapsed time: 176.33"
        ]
    },
    "kmer_size": 10,
    "max_read_length": 150,
    "current_working_directory": "/home/rffrancon/data/tmp",
    "paths": {
        "project": "./gram",
        "vcf": "/home/rffrancon/data/tmp/mhc.vcf",
        "reference": "/home/rffrancon/data/tmp/ref.fa",
        "prg": "./gram/prg",
        "encoded_prg": "./gram/encoded_prg",
        "variant_site_mask": "./gram/variant_site_mask",
        "allele_mask": "./gram/allele_mask",
        "fm_index": "./gram/fm_index",
        "perl_generated_vcf": "./gram/perl_generated_vcf",
        "perl_generated_fa": "./gram/perl_generated_fa",
        "build_report": "./gram/build_report.json"
    },
    "path_hashes": {
        "vcf": "bffc7acae3e03e7e8a374ca5e924ff7a46e6b4c80d1326f47b60e5f9b09987bb",
        "reference": "e2f96fa99bad5bcb6fbd9be34622997b953fc41f330757b20e2dd80826eb2cc6",
        "prg": "a23ef2b8a530c5ae7c762b9118b7be8ae7c60c576a8fd2d65faf228949565421",
        "encoded_prg": "c0e139694c6cbeaa3d53a86383df0c97894c28a557316b4de8f99eea6cf2d2b0",
        "variant_site_mask": "8907a743d01d39c62067659d85293451c976dcace410492a37458c38cf61a88b",
        "allele_mask": "9183cce0c73eafcf811e9a41be2409ac00ab52db812cc847e96fbd8a8e012681",
        "fm_index": "aa9eb2fd5d7abe8bc8f7e39b977dc97916742321a5c6c45d6f5751dea4965ae0",
        "perl_generated_vcf": "bffc7acae3e03e7e8a374ca5e924ff7a46e6b4c80d1326f47b60e5f9b09987bb",
        "perl_generated_fa": "c98b69fc51de4f1e12fec9f9ed78f6d4db0da57f1f4b2c7d6dbcf5abc7c76208",
        "build_report": "2171501a329b4ba321374aace0a34b808995df1f1f32be7e2d47b544f46139d8"
    },
    "version_report": {
        "version_number": "0.5.0",
        "last_git_commit_hash": "12ab776c2452aacd6c8a2248c834907b62b93f66",
        "truncated_git_commits": [
            "12ab776 - Robyn Ffrancon, 1528306963 : enhancement: added kmer size to stdout status message",
            "42018b4 - Robyn Ffrancon, 1528305111 : fix: when generating all kmers, correct ordering assured; increased kmer size threshold: <= 10",
            "72786ae - Robyn Ffrancon, 1528213271 : fix: travis build log within memory limit",
            "e440cc1 - Robyn Ffrancon, 1528202882 : enhancement: note added to stdout to explain that read counts include reverse complement",
            "58608bf - Robyn Ffrancon, 1528201130 : enhancement: infer command can produce vcf output if build used vcf + reference"
        ]
    }
}
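
A note on the "Total number of unique kmers: 1048576" line in the report above: this equals 4^10, which is what one would expect if every possible 10-mer over A, C, G, T is enumerated. That reading is an assumption based on the "generating all kmers" commit message; the issue does not state it outright. A small sketch to check the arithmetic (the function name `count_all_kmers` is hypothetical):

```python
from itertools import product


def count_all_kmers(k, alphabet="ACGT"):
    """Count every possible k-mer over `alphabet` by brute-force enumeration."""
    return sum(1 for _ in product(alphabet, repeat=k))


# For k = 10 the enumeration agrees with the closed form 4**k.
assert count_all_kmers(10) == 4 ** 10 == 1048576
```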
ffranr commented 6 years ago

I get similar output when running on ebi-cli, so I can't reproduce this bug there either. @iqbal-lab, can you reproduce it?

Build report from ebi-cli:

(gram_venv) [rff@ebi-login-002 tmp]$ cat gram_k10/build_report.json 
{
    "start_time": "1528712877",
    "end_time": "1528713070",
    "total_runtime": 193,
    "return_value_is_0": true,
    "prg_build_report": {
        "command": "perl /homes/rff/gram_venv/lib/python3.6/site-packages/gramtools/utils/vcf_to_linear_prg.pl --outfile ./gram_k10/prg --vcf /homes/rff/tmp/mhc.vcf --ref /homes/rff/tmp/ref.fa",
        "return_value_is_0": true,
        "stdout": [
            "Finished printing linear PRG. Final number in alphabet is  23690"
        ]
    },
    "gramtools_cpp_build": {
        "command": "/homes/rff/gram_venv/lib/python3.6/site-packages/gramtools/bin/gram build --gram ./gram_k10 --kmer-size 10 --max-read-size 150 --max-threads 1 --debug",
        "return_value_is_0": true,
        "stdout": [
            "maximum thread count: 1",
            "Executing build command",
            "Generating integer encoded PRG",
            "Number of charecters in integer encoded linear PRG: 27269526",
            "Maximum alphabet character: 23690",
            "Generating FM-Index",
            "Generating PRG masks",
            "Building kmer index (kmer size: 10)",
            "Getting all kmers",
            "Getting kmer prefix diffs",
            "Indexing kmers",
            "Total number of unique kmers: 1048576",
            "",
            "Progress: 50000 of 1048576",
            "Progress: 100000 of 1048576",
            "Progress: 150000 of 1048576",
            "Progress: 200000 of 1048576",
            "Progress: 250000 of 1048576",
            "Progress: 300000 of 1048576",
            "Progress: 350000 of 1048576",
            "Progress: 400000 of 1048576",
            "Progress: 450000 of 1048576",
            "Progress: 500000 of 1048576",
            "Progress: 550000 of 1048576",
            "Progress: 600000 of 1048576",
            "Progress: 650000 of 1048576",
            "Progress: 700000 of 1048576",
            "Progress: 750000 of 1048576",
            "Progress: 800000 of 1048576",
            "Progress: 850000 of 1048576",
            "Progress: 900000 of 1048576",
            "Progress: 950000 of 1048576",
            "Progress: 1000000 of 1048576",
            "",
            "Timer report:",
            "                       seconds",
            "         Encoded PRG      1.25",
            "   Generate FM-Index     103.2",
            "Generating PRG masks     66.52",
            " Building kmer index      7.31",
            "",
            "Total elapsed time: 178.28"
        ]
    },
    "kmer_size": 10,
    "max_read_length": 150,
    "current_working_directory": "/homes/rff/tmp",
    "paths": {
        "project": "./gram_k10",
        "vcf": "/homes/rff/tmp/mhc.vcf",
        "reference": "/homes/rff/tmp/ref.fa",
        "prg": "./gram_k10/prg",
        "encoded_prg": "./gram_k10/encoded_prg",
        "variant_site_mask": "./gram_k10/variant_site_mask",
        "allele_mask": "./gram_k10/allele_mask",
        "fm_index": "./gram_k10/fm_index",
        "perl_generated_vcf": "./gram_k10/perl_generated_vcf",
        "perl_generated_fa": "./gram_k10/perl_generated_fa",
        "build_report": "./gram_k10/build_report.json"
    },
    "path_hashes": {
        "vcf": "bffc7acae3e03e7e8a374ca5e924ff7a46e6b4c80d1326f47b60e5f9b09987bb",
        "reference": "e2f96fa99bad5bcb6fbd9be34622997b953fc41f330757b20e2dd80826eb2cc6",
        "prg": "a23ef2b8a530c5ae7c762b9118b7be8ae7c60c576a8fd2d65faf228949565421",
        "encoded_prg": "c0e139694c6cbeaa3d53a86383df0c97894c28a557316b4de8f99eea6cf2d2b0",
        "variant_site_mask": "8907a743d01d39c62067659d85293451c976dcace410492a37458c38cf61a88b",
        "allele_mask": "9183cce0c73eafcf811e9a41be2409ac00ab52db812cc847e96fbd8a8e012681",
        "fm_index": "aa9eb2fd5d7abe8bc8f7e39b977dc97916742321a5c6c45d6f5751dea4965ae0",
        "perl_generated_vcf": "bffc7acae3e03e7e8a374ca5e924ff7a46e6b4c80d1326f47b60e5f9b09987bb",
        "perl_generated_fa": "c98b69fc51de4f1e12fec9f9ed78f6d4db0da57f1f4b2c7d6dbcf5abc7c76208"
    },
    "version_report": {
        "version_number": "0.5.0",
        "last_git_commit_hash": "12ab776c2452aacd6c8a2248c834907b62b93f66",
        "truncated_git_commits": [
            "12ab776 - Robyn Ffrancon, 1528306963 : enhancement: added kmer size to stdout status message",
            "42018b4 - Robyn Ffrancon, 1528305111 : fix: when generating all kmers, correct ordering assured; increased kmer size threshold: <= 10",
            "72786ae - Robyn Ffrancon, 1528213271 : fix: travis build log within memory limit",
            "e440cc1 - Robyn Ffrancon, 1528202882 : enhancement: note added to stdout to explain that read counts include reverse complement",
            "58608bf - Robyn Ffrancon, 1528201130 : enhancement: infer command can produce vcf output if build used vcf + reference"
        ]
    }
}
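
One way to check that the failing and successful runs started from the same inputs is to compare the `path_hashes` sections of two reports. The following is a minimal sketch, assuming both reports have been parsed into dictionaries with the layout shown above. The function name `compare_path_hashes` is hypothetical and not part of gramtools.

```python
def compare_path_hashes(report_a, report_b):
    """Classify the path_hashes keys of two parsed build reports.

    Returns a dict of sorted key lists: keys whose hashes match, keys whose
    hashes differ, and keys present in only one of the two reports.
    """
    a = report_a.get("path_hashes", {})
    b = report_b.get("path_hashes", {})
    shared = a.keys() & b.keys()
    return {
        "matching": sorted(k for k in shared if a[k] == b[k]),
        "differing": sorted(k for k in shared if a[k] != b[k]),
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
    }
```

Applied to the failing report (as `report_a`) and either successful report in this thread, the shared keys (`vcf`, `reference`, `prg`, `perl_generated_vcf`, `perl_generated_fa`) all carry identical hashes, while `encoded_prg`, `variant_site_mask`, `allele_mask` and `fm_index` appear only in the successful runs. That is consistent with identical inputs and a failure after the Perl step.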
iqbal-lab commented 6 years ago

I can't reproduce it either.