ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

Make input sorting optional in cactus-maf2bigmaf #1030

Closed glennhickey closed 1 year ago

glennhickey commented 1 year ago

cactus-maf2bigmaf follows the BigMaf example, which sorts the BED version of the MAF by (contig, startpos) with Unix sort.

Sorting this way can be fairly expensive for huge inputs and, as far as I can tell, is unnecessary because cactus-hal2maf always outputs MAFs in sorted order.

So this PR just puts the sorting behind an option (--sort), which shouldn't be needed in the typical use case where the input comes from cactus-hal2maf.
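For illustration only, here is a minimal sketch of how one might cheaply verify the (contig, startpos) order in a single streaming pass before deciding whether a full sort is needed. This is not code from cactus: the function name `first_unsorted_line` is hypothetical, and the sketch assumes tab-separated BED lines, byte-wise (C-locale, case-sensitive) comparison of contig names, and numeric comparison of start positions.

```python
from typing import Iterable, Optional


def first_unsorted_line(lines: Iterable[str]) -> Optional[int]:
    """Return the 1-based number of the first line that breaks
    (contig, start) order, or None if the input is already sorted.

    Hypothetical helper: contigs are compared as raw bytes (assumed to
    match a C-locale sort), starts are compared as integers.
    """
    prev_key = None
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.rstrip("\n").split("\t")
        key = (fields[0].encode(), int(fields[1]))
        if prev_key is not None and key < prev_key:
            return lineno
        prev_key = key
    return None


# Small in-memory examples; a real input would be read line by line from a file.
sorted_bed = ["chr1\t5\t10\tx\n", "chr1\t20\t30\tx\n", "chr10\t0\t5\tx\n", "chr2\t0\t5\tx\n"]
unsorted_bed = ["chr1\t100\t200\tx\n", "chr1\t50\t80\tx\n"]
mixed_case_bed = ["chr1\t0\t1\tx\n", "Chr1\t0\t1\tx\n"]

print(first_unsorted_line(sorted_bed))      # None
print(first_unsorted_line(unsorted_bed))    # 2
print(first_unsorted_line(mixed_case_bed))  # 2
```

The check keeps only the previous key in memory, so it stays cheap even on inputs where a full sort would be expensive. The mixed-case example shows why the comparison is case-sensitive: uppercase letters sort before lowercase ones byte-wise.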

glennhickey commented 1 year ago

Oops

RuntimeError: Command /usr/bin/time -v bedToBigBed -type=bed3+1 -as=/nanopore/cgl/glenn-scratch/new-maf/work/001e201147d25aada1152022c2e39d5e/83ca/e31a/tmpfe45yg9u/bigMaf.as -tab /nanopore/cgl/glenn-scratch/new-maf/work/001e201147d25aada1152022c2e39d5e/83ca/e31a/tmpfe45yg9u/bigMaf.txt /nanopore/cgl/glenn-scratch/new-maf/work/001e201147d25aada1152022c2e39d5e/83ca/e31a/tmpfe45yg9u/hg38.chrom_sizes /nanopore/cgl/glenn-scratch/new-maf/work/001e201147d25aada1152022c2e39d5e/83ca/e31a/tmpfe45yg9u/447-mammalian-2022v1.bigmaf.bb exited 255: stdout=None, stderr=/nanopore/cgl/glenn-scratch/new-maf/work/001e201147d25aada1152022c2e39d5e/83ca/e31a/tmpfe45yg9u/bigMaf.txt is not case-sensitive sorted at line 107943748.  Please use "sort -k1,1 -k2,2n" with LC_COLLATE=C,  or bedSort and try again.
        Command exited with non-zero status 255
                Command being timed: "bedToBigBed -type=bed3+1 -as=/nanopore/cgl/glenn-scratch/new-maf/work/001e201147d25aada1152022c2e39d5e/83ca/e31a/tmpfe45yg9u/bigMaf.as -tab /nanopore/cgl/glenn-scratch/new-maf/work/001e201147d25aada1152022c2e39d5e/83ca/e31a/tmpfe45yg9u/bigMaf.txt /nanopore/cgl/glenn-scratch/new-maf/work/001e201147d25aada1152022c2e39d5e/83ca/e31a/tmpfe45yg9u/hg38.chrom_sizes /nanopore/cgl/glenn-scratch/new-maf/work/001e201147d25aada1152022c2e39d5e/83ca/e31a/tmpfe45yg9u/447-mammalian-2022v1.bigmaf.bb"