Closed adamnovak closed 7 years ago
vg chunk got fiddled with between Sep 14-19th. i'd expect to see errors like this when mixing a toil-vg that's more recent than this date range with a vg that is not (or vice versa). this is most likely the problem here.
On Mon, Oct 30, 2017 at 2:08 PM, Adam Novak notifications@github.com wrote:
Apparently we're getting non-GAM data into the GAM input in vg index when indexing aligned reads to split into chromosome chunks.
@glennhickey @adamnovak
For one of my runs, I'm getting this error as a result of using an older version of vg and toil-vg. I believe the version of toil-vg dates to around August 26th and the version of vg is v1.5.0-824-gcc531354-t73-run, which I believe dates to around August 29th.
It makes no sense because I ran this version before and those runs ran to completion. They also were recently tested on my tiny test and that appears to have passed. And if it does have something to do with the version of toil-vg, which is possible since I may have updated toil-vg in that virtualenv, why wouldn't the unit tests catch that error?
I'm going to double check this on my tiny tests. Since the only difference between this particular run and my previous runs is the flat graph index that I constructed, I may look to that as the potential source of the error. It's possible I'll need to reconstruct that xg index.
It's possible that the best practices on constructing graphs, as posted on the vg wiki (https://github.com/vgteam/vg/wiki/working-with-a-whole-genome-variation-graph#indexing-with-xg-and-gcsa2) are out of date since I followed those pruning steps verbatim.
exact dates can be found via quay and github.
anyway, it sounds like you ran an experiment with vg version X and toil-vg version Y. you've since upgraded to Y' and have no idea what Y is (or, even Y') and are frustrated that you can't reproduce.
there's no easy way to know Y for sure. toil-vg can be improved to include better traceability information in the logs (#380) but ultimately you're going to have to change your scripts/habits to keep track of the tool versions (at a minimum, vg docker image url, toil version, and toil-vg commit id) and command lines you used for your work to be reproducible.
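A minimal sketch of the record-keeping suggested above, assuming Python 3.8+ and that toil and toil-vg were installed with pip under the distribution names "toil" and "toil-vg". The function names and the manifest layout are hypothetical, and the pip version is only a stand-in for the toil-vg commit id.

```python
import json
from datetime import datetime, timezone
from importlib import metadata


def installed_version(package):
    """Return the pip-installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def build_manifest(vg_docker_image, argv):
    """Collect the minimum provenance listed above into one dictionary."""
    return {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "vg_docker_image": vg_docker_image,
        # Assumption: these are the pip distribution names in the virtualenv.
        "toil_version": installed_version("toil"),
        "toil_vg_version": installed_version("toil-vg"),
        "command_line": list(argv),
    }


def write_manifest(path, manifest):
    """Write the manifest next to the run outputs as readable JSON."""
    with open(path, "w") as handle:
        json.dump(manifest, handle, indent=2, sort_keys=True)
```

Calling build_manifest with the image URL from the log (for example quay.io/vgteam/vg:v1.5.0-1557-gc2ecf7a0-t99-run) and sys.argv at the start of each run, then saving the result with write_manifest, would leave one small file per run that answers "which X and Y was this?" later.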
It looked like Charlie was logging the Docker image name/version, I think? I think we already concluded this file was (supposed to have been) generated by the same version of vg that is now failing to parse it.
Charlie, did you ever get that interactive node provisioned to see if you could parse the GAM with a manual run of vg? If you can't, I think we have to work out a way to dig into the protobuf at a low level to see exactly what is wrong with the bytes it is trying to parse (or to try and guess what kind of protobuf objects we are actually looking at). The only real change to the Alignment record made around that time was https://github.com/vgteam/vg/commit/9a572c7a26b8ac7e5f997b39949ab635306a30eb#diff-93956404fc08bc326719b7a5300499ad but we should be fine reading the old records with the new format, and we should get an unknown field type record reading the new records with the old format. The error that does happen (about nonsense characters in strings) really does make this sound like we're reading records of the wrong protobuf type entirely.
Charlie, you didn't accidentally do multipath alignments somehow, did you?
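One way to start digging at the byte level is to walk the message framing without decoding any protobuf. The sketch below assumes the GAM framing of that period is a gzip-compressed stream of groups, each consisting of a varint message count followed by that many varint-length-prefixed messages. That layout is an assumption to check against the vg source, and scan_groups and read_varint are hypothetical helper names.

```python
def read_varint(stream):
    """Read one base-128 varint; return None on a clean end of stream."""
    shift = 0
    value = 0
    while True:
        byte = stream.read(1)
        if not byte:
            if shift == 0:
                return None
            raise ValueError("stream ended in the middle of a varint")
        value |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            return value
        shift += 7
        if shift > 63:
            raise ValueError("varint longer than 64 bits")


def scan_groups(stream, max_message_bytes=1 << 26):
    """Walk the assumed framing and return (groups, messages) seen.

    Raises ValueError at the first place the framing stops making sense,
    which is where a truncated or corrupt file would show up.
    """
    groups = 0
    messages = 0
    while True:
        count = read_varint(stream)
        if count is None:
            return groups, messages
        for _ in range(count):
            size = read_varint(stream)
            if size is None:
                raise ValueError("group %d ended early" % groups)
            if size > max_message_bytes:
                raise ValueError("implausible message size %d" % size)
            payload = stream.read(size)
            if len(payload) != size:
                raise ValueError("message %d is truncated" % messages)
            messages += 1
        groups += 1
```

With a real file, something like `with gzip.open("UDP10618_0.gam", "rb") as stream: scan_groups(stream)` would report how far the framing stays consistent. A clean pass would point away from truncation and toward the records themselves being of an unexpected type.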
@adamnovak, the issue isn't with the vg version, it's with the toil-vg version. From the docker name, Charlie's running this vg from Aug 3rd.
https://github.com/vgteam/vg/commit/cc531354c2f5727155351e0304601b2759b5c803
This vg version will not work on recent versions of toil-vg because of changes to the vg chunk interface for gam splitting (which is where the crash is). This would be easy to check, but we don't know which toil-vg version Charlie's running. We can estimate the date from the default vg docker image in vg_config.py, which may be enough. Or Charlie could try running a toil-vg from August 3rd.
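The image tags quoted in this thread look like `git describe` output with a suffix, so the commit can be read straight out of the tag. A small sketch follows. The meaning of the first three parts follows the `git describe` format, while treating `-tNN` as a CI build number and the final word as an image flavor is an assumption. parse_vg_tag is a hypothetical helper.

```python
import re

# release tag, commits since that tag, "g" + abbreviated commit hash,
# then (assumed) a CI build number and an image flavor such as "run".
TAG_PATTERN = re.compile(
    r"^(?P<release>v\d+\.\d+\.\d+)"
    r"-(?P<commits_since_release>\d+)"
    r"-g(?P<commit>[0-9a-f]+)"
    r"-t(?P<build>\d+)"
    r"-(?P<flavor>\w+)$"
)


def parse_vg_tag(tag):
    """Split a vg docker image tag into its components."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError("unrecognized vg image tag: %r" % tag)
    fields = match.groupdict()
    fields["commits_since_release"] = int(fields["commits_since_release"])
    fields["build"] = int(fields["build"])
    return fields
```

For v1.5.0-824-gcc531354-t73-run this yields the abbreviated hash cc531354, which matches the start of the commit linked above, so the commit date can then be looked up on GitHub.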
@glennhickey
The vg version in the vg_config.py file in the virtualenv that I used which produced this error is v1.5.0-804-g77cb624a-t69-run. Would that version of toil-vg be incompatible?
@glennhickey @adamnovak I think I may have found the problem. I noticed a crucial difference between the 1st log file of my previously working run and the log file of my flat graph run.
The previous run has a log file that states that an alignment Process took 560732.747311 seconds. My latest run using the flat-graph states that an alignment Process took 401590.563382 seconds with single-end vg-map.
I noticed that for my previous run, I used a virtualenv that was created on August 4th to do the alignment and .gam indexing, and I used the August 26th virtualenv to do the variant calling. I thought I used the same version of toil-vg (the August 26th version) to do the mapping, but that's not the case.
What I used in my flat-graph run was the August 26th version of toil-vg. Since I used a consistent version of toil-vg throughout, there shouldn't be a problem. However, I noticed that the alignment was done in single-end vg-map mode even though I supplied the toil-vg command with two fastq files.
Was there a change in the mapping interface where I need to explicitly state that I want it to do paired-end mapping?
Great, that's enough to put you in the Aug. 1 - Sep. 6 range for toil-vg. That's about a week shy of the compatibility break, so this toil-vg should work with your Aug. 3 image after all. I stand corrected: carry on with the gdb!
It's moot now, but pip does seem to be giving unique versions so you should be able to get from the pypi version to the commit exactly (pip list -> then look up build number of jenkins page and map to pr)
It was probably doing single ended both times, as that wasn't fixed until Sep. 18th
@glennhickey But then how was the gam file from the previous run indexable, but not the flat-graph one?
you can verify for yourself they were both single ended by looking at the vg map commands.
So when I see the command Singularity Run: vg map -f reads_chunk_0_0.fq.gz -t 56 -x graph.vg.xg -g graph.vg.gcsa
Does reads_chunk_0_0.fq.gz correspond to the 1st read file ${READS_1} passed in my toil-vg command
toil-vg run --logDebug --retryCount 3 --cleanWorkDir never --realTimeLogging --logInfo ${TEST_JOBSTORE} ${SAMPLE} ${OUT_STORE} --config ${CONFIG_FILE} --workDir ${WORK_DIR} --container Singularity --single_reads_chunk --gcsa_index ${GCSA_INDEX} --xg_index ${XG_INDEX} --id_ranges ${ID_RANGES} --fastq ${READS_1} ${READS_2} 2> /home/markellocj/mapcall.allreads.${SAMPLE}.wgs.cpu56.${ATTEMPT_NUM}.log
or is reads_chunk_0_0.fq.gz some chunked interleaved file of ${READS_1} ${READS_2}?
@glennhickey Also, I get the same problem when I run the Eric mapper using a version of toil-vg that has vg container version v1.5.0-1666-g0ee5778e-t107-run in the vg_config.py file. The mapping completes with Process took 258247.503241 seconds with single-end vg-map and the same .gam indexing error of what(): obsolete, invalid, or corrupt protobuf input. I used the v1.5.0_1557 version of vg to create the .xg index for that run.
There was a bug from May 19 to Sep. 18th where the --single_reads_chunk option caused multiple --fastq inputs to be aligned independently single-ended. Your commands confirm this (toil-vg is saying paired but vg map is single ended).
There is some irony in the fact that the paired-end mapper in vg map was so broken for much of this period that the single-ended results were often better.
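The check described above, reading the logged vg map command to see which mode was actually used, can be sketched as follows. It assumes paired-end input to vg map shows up either as two -f/--fastq arguments or as -i/--interleaved. That reading of the vg map flags is an assumption to confirm against `vg map` help for the version in use, and mapping_mode is a hypothetical helper.

```python
import shlex


def mapping_mode(command):
    """Classify a logged vg map command line as paired-end or single-end."""
    args = shlex.split(command)
    if "vg" not in args or "map" not in args:
        raise ValueError("not a vg map command: %r" % command)
    # Assumption: one -f per mate file, or -i for an interleaved file.
    fastq_inputs = sum(1 for arg in args if arg in ("-f", "--fastq"))
    interleaved = any(arg in ("-i", "--interleaved") for arg in args)
    if interleaved or fastq_inputs >= 2:
        return "paired-end"
    return "single-end"
```

Applied to the command from the log, `vg map -f reads_chunk_0_0.fq.gz -t 56 -x graph.vg.xg -g graph.vg.gcsa`, this reports single-end, which agrees with the "single-end vg-map" timing line.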
@glennhickey But that still doesn't explain why the toil-vg version that has vg container version v1.5.0-1666-g0ee5778e-t107-run fails for the same apparent reason. I'm pretty sure this version of toil-vg is much later than Sep. 18th. I think that version of toil-vg dates at least to October 23rd.
That toil-vg version is recent. It is guaranteed not to work with the vg docker image from august.
If you're reproducing this error end-to-end with up to date vg and toil-vg, please just post a more detailed issue here and i'll reproduce. If I understand, we're talking about the primary graph. So you'd need to give me the
I don't believe you can reproduce this error since I was able to run to completion a tiny test which uses the exact same parameters and program/environment versions. Also this data is housed on the NIH servers, so you can't access it.
So yes, that run that uses the October 23rd version of toil-vg uses vg version v1.5.0-1557-gc2ecf7a0-t99-run.
Would the version of .vg graphs make a difference (I thought it was the .xg formats that were changing)?
I used the same vg version v1.5.0-1557-gc2ecf7a0-t99-run to do the gcsa and xg indexing for that run.
The command that was used to index the .vg graphs was
vg index -t 64 -x /home/markellocj/graph_flat_ref_index_vg_${VG_VERSION}/flat_ref_hg19.noMask.new.xg $(for i in $(seq 22; echo X; echo Y); do echo /home/markellocj/graph_ref_sandbox/vg_flat_ref_graph/${i}.flat_ref_hg19.noMask.vg; done)
And the commands used to generate the gcsa index were:
for chr in $(seq 1 22; echo X; echo Y);
do
vg mod -t 32 -pl 16 -S -t 16 -e 4 /home/markellocj/graph_ref_sandbox/vg/$chr.flat_ref_hg19.noMask.vg > /home/markellocj/graph_ref_sandbox/vg/$chr.flat_ref_hg19.noMask.prune.vg
vg mod -t 32 -N /home/markellocj/graph_ref_sandbox/vg/$chr.flat_ref_hg19.noMask.vg > /home/markellocj/graph_ref_sandbox/vg/$chr.flat_ref_hg19.noMask.ref.vg
cat /home/markellocj/graph_ref_sandbox/vg/$chr.flat_ref_hg19.noMask.ref.vg /home/markellocj/graph_ref_sandbox/vg/$chr.flat_ref_hg19.noMask.prune.vg | vg view -v - 2>/home/markellocj/graph_ref_sandbox/vg/$chr.flat_ref_hg19.noMask.merge.err > /home/markellocj/graph_ref_sandbox/vg/$chr.flat_ref_hg19.noMask.smooth.vg
vg kmers -gBk 16 -H 1000000000 -T 1000000001 /home/markellocj/graph_ref_sandbox/vg/$chr.flat_ref_hg19.noMask.smooth.vg >/home/markellocj/graph_ref_sandbox/vg/$chr.flat_ref_hg19.noMask.graph
done
vg index -t 56 -g /home/markellocj/graph_ref_sandbox/flat_ref_hg19.noMask.new.gcsa -Z 3000 -X 3 -k 11 $(for i in $(seq 22; echo X; echo Y); do echo -n " -i /home/markellocj/graph_ref_sandbox/vg_flat_ref_graph/$i".flat_ref_hg19.noMask.graph; done)
The mapping command is:
vg map -f reads_chunk_0_0.fq.gz -t 56 -x graph.vg.xg -g graph.vg.gcsa
And the command that's producing the error is:
vg index -a UDP10618_0.gam -d UDP10618_0.gam.index -t 6
The main error message is:
cn1809 2017-10-27 18:09:54,509 MainThread WARNING toil.leader: I/i/jobyd6AmH INFO:toil_vg.singularity:Calling singularity with ['singularity', '-q', 'exec', '-H', '/data/Udpwork/usr/markellocj/UDP10618_run/workdir_UDP10618_8_flatgraph_v1.5.0_1557_ericMap/toil-b67a0c74-366e-4ba6-8143-ee0525fad4a3-42886d05e215c3e7687fd72000000220/tmpbdfimf/d9583ba4-f8d2-4aef-ab75-80e0eb7c87e0/tl8NB9f:/home/markellocj', '--pwd', '/home/markellocj', 'docker://quay.io/vgteam/vg:v1.5.0-1557-gc2ecf7a0-t99-run', 'vg', 'index', '-a', 'UDP10618_0.gam', '-d', 'UDP10618_0.gam.index', '-t', '6']
cn1809 2017-10-27 18:09:54,509 MainThread WARNING toil.leader: I/i/jobyd6AmH ^[[33mWARNING: Bind file source does not exist on host: /etc/resolv.conf
cn1809 2017-10-27 18:09:54,509 MainThread WARNING toil.leader: I/i/jobyd6AmH ^[[0mterminate called after throwing an instance of 'std::runtime_error'
cn1809 2017-10-27 18:09:54,509 MainThread WARNING toil.leader: I/i/jobyd6AmH what(): obsolete, invalid, or corrupt protobuf input
cn1809 2017-10-27 18:09:54,509 MainThread WARNING toil.leader: I/i/jobyd6AmH Traceback (most recent call last):
cn1809 2017-10-27 18:09:54,509 MainThread WARNING toil.leader: I/i/jobyd6AmH File "/spin1/home/linux/markellocj/toil_venv_10_16_2017_fixed_toil_vg/lib/python2.7/site-packages/toil/worker.py", line 308, in main
cn1809 2017-10-27 18:09:54,509 MainThread WARNING toil.leader: I/i/jobyd6AmH job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
cn1809 2017-10-27 18:09:54,510 MainThread WARNING toil.leader: I/i/jobyd6AmH File "/spin1/home/linux/markellocj/toil_venv_10_16_2017_fixed_toil_vg/lib/python2.7/site-packages/toil/job.py", line 1289, in _runner
cn1809 2017-10-27 18:09:54,510 MainThread WARNING toil.leader: I/i/jobyd6AmH returnValues = self._run(jobGraph, fileStore)
cn1809 2017-10-27 18:09:54,510 MainThread WARNING toil.leader: I/i/jobyd6AmH File "/spin1/home/linux/markellocj/toil_venv_10_16_2017_fixed_toil_vg/lib/python2.7/site-packages/toil/job.py", line 1234, in _run
cn1809 2017-10-27 18:09:54,510 MainThread WARNING toil.leader: I/i/jobyd6AmH return self.run(fileStore)
cn1809 2017-10-27 18:09:54,510 MainThread WARNING toil.leader: I/i/jobyd6AmH File "/spin1/home/linux/markellocj/toil_venv_10_16_2017_fixed_toil_vg/lib/python2.7/site-packages/toil/job.py", line 1415, in run
cn1809 2017-10-27 18:09:54,510 MainThread WARNING toil.leader: I/i/jobyd6AmH rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
cn1809 2017-10-27 18:09:54,510 MainThread WARNING toil.leader: I/i/jobyd6AmH File "/spin1/home/linux/markellocj/toil_venv_10_16_2017_fixed_toil_vg/lib/python2.7/site-packages/toil_vg/vg_map.py", line 348, in run_chunk_alignment
cn1809 2017-10-27 18:09:54,510 MainThread WARNING toil.leader: I/i/jobyd6AmH gam_chunks = split_gam_into_chroms(job, work_dir, context, xg_file, id_ranges_file, output_file)
cn1809 2017-10-27 18:09:54,510 MainThread WARNING toil.leader: I/i/jobyd6AmH File "/spin1/home/linux/markellocj/toil_venv_10_16_2017_fixed_toil_vg/lib/python2.7/site-packages/toil_vg/vg_map.py", line 381, in split_gam_into_chroms
cn1809 2017-10-27 18:09:54,510 MainThread WARNING toil.leader: I/i/jobyd6AmH context.runner.call(job, index_cmd, work_dir = work_dir)
cn1809 2017-10-27 18:09:54,511 MainThread WARNING toil.leader: I/i/jobyd6AmH File "/spin1/home/linux/markellocj/toil_venv_10_16_2017_fixed_toil_vg/lib/python2.7/site-packages/toil_vg/vg_common.py", line 139, in call
cn1809 2017-10-27 18:09:54,511 MainThread WARNING toil.leader: I/i/jobyd6AmH return self.call_with_singularity(job, args, work_dir, outfile, errfile, check_output, tool_name)
cn1809 2017-10-27 18:09:54,511 MainThread WARNING toil.leader: I/i/jobyd6AmH File "/spin1/home/linux/markellocj/toil_venv_10_16_2017_fixed_toil_vg/lib/python2.7/site-packages/toil_vg/vg_common.py", line 228, in call_with_singularity
cn1809 2017-10-27 18:09:54,511 MainThread WARNING toil.leader: I/i/jobyd6AmH ret = singularityCall(job, tool, parameters=parameters, workDir=work_dir, outfile = outfile)
cn1809 2017-10-27 18:09:54,511 MainThread WARNING toil.leader: I/i/jobyd6AmH File "/spin1/home/linux/markellocj/toil_venv_10_16_2017_fixed_toil_vg/lib/python2.7/site-packages/toil_vg/singularity.py", line 43, in singularityCall
cn1809 2017-10-27 18:09:54,511 MainThread WARNING toil.leader: I/i/jobyd6AmH outfile=outfile, checkOutput=False)
cn1809 2017-10-27 18:09:54,511 MainThread WARNING toil.leader: I/i/jobyd6AmH File "/spin1/home/linux/markellocj/toil_venv_10_16_2017_fixed_toil_vg/lib/python2.7/site-packages/toil_vg/singularity.py", line 123, in _singularity
cn1809 2017-10-27 18:09:54,511 MainThread WARNING toil.leader: I/i/jobyd6AmH out = callMethod(call, **params)
cn1809 2017-10-27 18:09:54,511 MainThread WARNING toil.leader: I/i/jobyd6AmH File "/usr/local/Anaconda/envs/py2.7/lib/python2.7/subprocess.py", line 186, in check_call
cn1809 2017-10-27 18:09:54,511 MainThread WARNING toil.leader: I/i/jobyd6AmH raise CalledProcessError(retcode, cmd)
cn1809 2017-10-27 18:09:54,511 MainThread WARNING toil.leader: I/i/jobyd6AmH CalledProcessError: Command '['singularity', '-q', 'exec', '-H', '/data/Udpwork/usr/markellocj/UDP10618_run/workdir_UDP10618_8_flatgraph_v1.5.0_1557_ericMap/toil-b67a0c74-366e-4ba6-8143-ee0525fad4a3-42886d05e215c3e7687fd72000000220/tmpbdfimf/d9583ba4-f8d2-4aef-ab75-80e0eb7c87e0/tl8NB9f:/home/markellocj', '--pwd', '/home/markellocj', 'docker://quay.io/vgteam/vg:v1.5.0-1557-gc2ecf7a0-t99-run', 'vg', 'index', '-a', 'UDP10618_0.gam', '-d', 'UDP10618_0.gam.index', '-t', '6']' returned non-zero exit status -6
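A note on reading the last line of that traceback: Python's subprocess module reports a negative exit status -N when the child was terminated by signal N, so -6 here means vg was killed by a signal rather than returning an error code. That fits the "terminate called after throwing an instance of 'std::runtime_error'" line, since an uncaught C++ exception ends in abort(). A small sketch for decoding such statuses on a POSIX system (describe_exit is a hypothetical helper):

```python
import signal


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable description."""
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return "killed by signal %d (%s)" % (-returncode, name)
    if returncode == 0:
        return "exited successfully"
    return "exited with status %d" % returncode
```

On Linux, describe_exit(-6) names SIGABRT, so the status only says that vg aborted. The actual cause is the what() message printed just before it.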
I just noticed that my tiny tests only use a single read file. @glennhickey do you have a tiny paired-end pair of reads that you test with? Nevermind, I found some. Testing it out now
So you map a smaller fastq to that same gcsa and xg and are able to index the resulting gam?
@glennhickey Yes I can now confirm that when using the same gcsa and xg index but with different fastq I am able to create indexable .gams. It however still does the mapping in single-ended mode.
So I took another look at the log file for the whole genome alignment flat-graph run that uses vg version v1.5.0-1557-gc2ecf7a0-t99-run. It appears that the mapping of the first fastq was rerun, and the .gam from that rerun was apparently indexed successfully. It looks like this may be a sporadic fault in how the .gam file was written. @adamnovak believes this may have something to do with the version of the gpfs filesystem that the NIH biowulf cluster uses, and that this version is possibly what's leading to unprotected file streams that python doesn't handle naturally. It's a problem Adam and Joel have run into before, so it's a real possibility.
I'll put up another issue to run flush and then fsync once a python filestream is done being used. https://docs.python.org/2/library/os.html#os.fsync
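For reference, the flush-then-fsync pattern mentioned above looks roughly like this in Python. It is a generic sketch rather than the change that went into toil-vg, and write_durably is a hypothetical helper name. flush() pushes Python's userspace buffer to the operating system, and os.fsync() then asks the OS to commit the file to storage before the handle is closed.

```python
import os


def write_durably(path, data):
    """Write bytes to path and force them to storage before returning."""
    with open(path, "wb") as handle:
        handle.write(data)
        # Move Python's buffered bytes into the OS.
        handle.flush()
        # Ask the OS to commit the file contents to the storage device.
        os.fsync(handle.fileno())
```

Whether this is enough on a given gpfs setup is a separate question. It only removes the window in which a later job could read the file before buffered data has been written out.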
Another problem that's still lingering is that the test run still apparently runs my mapping job in single-end mode. How can I get toil-vg to do paired-end mapping when I have a pair of fastqs?
Ok it looks like the latest version of toil-vg master (11/1/2017 as of this post) does execute paired-end mapping.
Apparently we're getting non-GAM data into the GAM input in vg index when indexing aligned reads to split into chromosome chunks.