Closed bricoletc closed 5 years ago
Here is heap profile of https://github.com/iqbal-lab-org/gramtools/commit/a7db53abc75b4ceb2d347337b1d3b6ec52ebe2a8 on TB simulated dataset for build
command:
Here is the heap profile on the same dataset after removing the cache:
The red and orange bars indicate SearchState
allocation. There is a decrease in memory usage.
As a sanity check, quasimap
maps the exact same number of reads (585862 out of 1756600 in total) against the indexes built by the respective commits.
Here is the diff
output comparing the allele_sum_coverage
files produced:
104c104
< 0 0 5 0
---
> 0 0 4 0
629c629
< 0 3
---
> 0 2
934c934
< 15 18
---
> 14 20
2436,2437c2436,2437
< 1 3
< 0 4
---
> 1 4
> 0 5
2460c2460
< 21 14
---
> 21 16
3130c3130
< 0 3
---
> 0 1
3796c3796
< 0 3
---
> 0 1
3798c3798
< 1 1 1
---
> 0 0 2
3968c3968
< 4 3
---
> 2 4
4171c4171
< 0 1
---
> 0 2
4211,4212c4211,4212
< 0 0 25
< 21 1
---
> 0 0 26
> 22 0
4219,4220c4219,4220
< 0 0
< 12 9
---
> 0 2
> 6 11
4225d4224
< 0 4
4227,4228c4226,4228
< 0 4
< 0 4
---
> 0 1
> 0 3
> 0 3
4239c4239
< 17 5
---
> 15 11
4243c4243
< 0 1
---
> 0 0
4245c4245
< 0 2
---
> 0 0
4557,4558c4557,4558
< 3 31 4 6 3
< 24 8
---
> 3 32 4 6 3
> 25 7
The files are close. Identity is not expected due to random selection for multi-mapping reads.
looks interesting. i'd like to get my head around whether these diffs really are ok. see you next week!
ah....could be need cache because of this https://github.com/iqbal-lab-org/gramtools/issues/90
For the record, here is what happened:
quasimap
back-end (c++) to ensure reproducibility when mapping reads across different runs.quasimap
.RAM use reduction is ~8% on k=8.
From a7db53abc75b4ceb2d347337b1d3b6ec52ebe2a8 a
SearchState
includes a caching structure consisting of aVariantLocus
(pair of site marker and allele ID) and abool
marking if cache is populated.This does not seem necessary as traversed site/allele combinations can be greedily committed to
variant_site_path
when encountered, rather than cached and committed later (upon base extension)