churchlab / millstone

Genome engineering and analysis software
http://churchlab.github.io/millstone/
MIT License

vcf-sort and vcf-concat (part of vcftools) used in sv pipeline, but vcftools is not explicitly a requirement #398

Closed by glebkuznetsov 9 years ago

glebkuznetsov commented 10 years ago

Either make vcftools a requirement, or replace these calls with something simpler.

dbgoodman commented 10 years ago

Do we need these due to the lumpy vcf sort problem, or were they already used elsewhere?

A poor man's sort would just use the Unix sort command:

http://stackoverflow.com/questions/357560/sorting-multiple-keys-with-unix-sort

but you'd have to pull off the header and then add it back.
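For illustration, the header-then-sort idea above might look roughly like this in Python. This is only a sketch under assumptions: the function name `sort_vcf_lines` is hypothetical, it is not the code in the Millstone pipeline, and it orders chromosomes lexicographically (like `sort -k1,1 -k2,2n`), which may differ from the order `vcf-sort` produces.

```python
def sort_vcf_lines(lines):
    """Return VCF lines with the header first and records sorted by (CHROM, POS).

    Hypothetical helper for illustration only; not part of Millstone or vcftools.
    """
    # Pull off the header: every line starting with '#' (meta lines and column line).
    header = [ln for ln in lines if ln.startswith("#")]
    # Everything else that is non-empty is a variant record.
    records = [ln for ln in lines if ln.strip() and not ln.startswith("#")]

    def sort_key(line):
        # The first two tab-separated VCF columns are CHROM and POS.
        chrom, pos = line.split("\t", 2)[:2]
        # Chromosome compares as a string, position as an integer.
        return (chrom, int(pos))

    # Add the header back in front of the sorted records.
    return header + sorted(records, key=sort_key)


example = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr2\t50\t.\tA\tT\t.\t.\t.\n",
    "chr1\t200\t.\tG\tC\t.\t.\t.\n",
    "chr1\t15\t.\tC\tG\t.\t.\t.\n",
]
sorted_lines = sort_vcf_lines(example)
```

Reading the whole file into memory keeps the sketch short; for very large VCFs, piping the record lines through the system `sort` would probably scale better.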



dbgoodman commented 10 years ago

Ah, never mind, you already did it:

https://github.com/churchlab/millstone/blob/5c7ec8cd92cfbd503880b621f02d18e6535e5287/genome_designer/pipeline/variant_calling/__init__.py#L505

So I must have used it elsewhere then?



glebkuznetsov commented 10 years ago

Yes I did a poor man's sort: https://github.com/churchlab/millstone/commit/5c7ec8cd92cfbd503880b621f02d18e6535e5287

But then I noticed that we are using vcf-sort here: https://github.com/churchlab/millstone/blob/master/genome_designer/pipeline/variant_calling/__init__.py#L350


dbgoodman commented 10 years ago

Ah ok. If it's just delly, then maybe we can ditch it if we go the lumpy-only route.



glebkuznetsov commented 9 years ago

Probably not using delly.