iqbal-lab-org / gramtools

Genome inference from a population reference graph
MIT License

Steps for generating SA intervals for kmers #57

Closed ffranr closed 7 years ago

ffranr commented 7 years ago

0) Test the latest kmer-generating code:
   i) Does it generate all kmers like the old code does?
   ii) Does it generate the variant-overlapping kmers correctly? (Unit tests pass + small-ish manual test + simulated reads that all map, on WG and toy PRG.)

1) For the WG PRG, generate all kmers with the old variantKmers -n (on the EBI cluster, splitting into many jobs with independent subfiles of kmers) and generate precalc files with the baseline commit. Add a commit on baseline to use 1 thread.

2) Prioritize kmer sizes 15 and 17 when generating precalc files. Run all from 2 to 20.

3) As step 1, but for variant-overlapping kmers only. Use a kmer region size of 150.
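
The job planning in steps 1 and 2 can be sketched roughly as follows. This is an illustration only, not gramtools code: the function names `all_kmers`, `chunked` and `kmer_size_order` are hypothetical, and it assumes "all kmers" means every sequence of length k over ACGT, which the thread does not state. Enumerating every kmer this way is only practical for small k.

```python
from itertools import islice, product


def all_kmers(k, alphabet="ACGT"):
    """Yield every kmer of length k over the alphabet, in lexicographic order.

    Assumption: "all kmers" means the full set of alphabet**k sequences.
    """
    for letters in product(alphabet, repeat=k):
        yield "".join(letters)


def chunked(kmers, chunk_size):
    """Group kmers into lists of at most chunk_size items.

    Each list stands in for one independent subfile handed to one cluster job.
    """
    iterator = iter(kmers)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def kmer_size_order(smallest=2, largest=20, priority=(15, 17)):
    """Return kmer sizes with the priority sizes first, then the rest ascending."""
    rest = [k for k in range(smallest, largest + 1) if k not in priority]
    return list(priority) + rest


if __name__ == "__main__":
    print(kmer_size_order())
    for index, chunk in enumerate(chunked(all_kmers(2), 5)):
        print(index, chunk)
```

Because no chunk shares kmers with another, each subfile can be processed by a separate single-threaded job, and the per-job results do not depend on the order in which the jobs run.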

Parameters for cluster: 1 thread. The cluster should report overall RAM usage automatically.

Baseline commit: https://github.com/iqbal-lab-org/gramtools/tree/a510cbd699f37cb7c4561a1d201d549dfad97b91

iqbal-lab commented 7 years ago

For step 1 (and 3) - Sorina has confirmed that she used gramtools.py --quasimap to generate the precalc file

ffranr commented 7 years ago

I think that this work has been completed.