Background:
I recently ran into an issue when writing BigWig files from Python. bedtools sort assumes lexicographic (chr1, chr11, ...) sort order, which conflicts with the chromosome order in the BigWig header, and BigWig files also require their input to be sorted. This can be worked around, of course, but I thought I'd try to get the pybedtools genome args machinery to work for this case.
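To illustrate the mismatch described above, here is a minimal pure-Python sketch (it does not call bedtools or pybedtools). The chromosome order and intervals are made-up examples, assumed only for illustration:

```python
# Chromosome order as it might appear in a genome file or BigWig header
# (an assumed example, not taken from any real assembly).
chrom_order = ["chr1", "chr2", "chr10", "chr11"]

# A few made-up intervals as (chrom, start, end) tuples.
intervals = [
    ("chr11", 0, 100),
    ("chr2", 50, 150),
    ("chr1", 10, 20),
    ("chr10", 5, 25),
]

# Lexicographic sort: chromosome names are compared as plain strings,
# so "chr10" and "chr11" come before "chr2".
lexicographic = sorted(intervals)

# Genome-order sort: rank each chromosome by its position in chrom_order,
# then sort by start and end within a chromosome.
rank = {chrom: i for i, chrom in enumerate(chrom_order)}
genome_sorted = sorted(intervals, key=lambda iv: (rank[iv[0]], iv[1], iv[2]))

print([iv[0] for iv in lexicographic])
print([iv[0] for iv in genome_sorted])
```

The two orderings disagree as soon as a genome has more than nine numbered chromosomes, which is why a sort that follows the genome file is needed here.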
Changes:
Add a uses_genome argument to the BedTools sort wrapper, with genome_if specified so that it still runs without any arguments (with the same behaviour as before).
Remove sorting from the chromsizes_to_file function.
Notes:
I had to change the chromsizes_to_file helper function to remove sorting, since it imposed lexicographic sort order on any given chromsizes (and also on downloaded genomes). As it was, it was impossible to specify the correct sort order without creating the genome file manually (i.e. without help from pybedtools). I can't think why this would cause problems in general, but it does mean the output of the tool would change in some circumstances.
I've added a quick test (which passes in isolation); however, I couldn't get the automatic tests to run on my machine (ModuleNotFoundError: No module named 'pybedtools.cbedtools').
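The effect of removing the sort from chromsizes_to_file, as described in the notes above, can be sketched roughly as follows. Note that write_chromsizes and read_chrom_order are hypothetical stand-ins written for this example, not the actual pybedtools code; the sketch assumes chromsizes is a dict mapping chromosome name to a (start, stop) tuple and relies on dicts preserving insertion order (Python 3.7+):

```python
import os
import tempfile


def write_chromsizes(chromsizes, path, sort=False):
    """Hypothetical helper: write "chrom<TAB>size" lines to a genome file.

    With sort=True the names are sorted lexicographically (the old
    behaviour described above); otherwise the dict's own order is kept.
    """
    items = chromsizes.items()
    if sort:
        items = sorted(items)
    with open(path, "w") as fout:
        for chrom, (_start, stop) in items:
            fout.write(f"{chrom}\t{stop}\n")
    return path


def read_chrom_order(path):
    """Return the chromosome names in the order they appear in the file."""
    with open(path) as fin:
        return [line.split("\t")[0] for line in fin]


# Made-up chromsizes in the order the caller wants (assumed example values).
chromsizes = {"chr1": (0, 1000), "chr2": (0, 900), "chr10": (0, 500)}

with tempfile.TemporaryDirectory() as tmpdir:
    sorted_path = write_chromsizes(
        chromsizes, os.path.join(tmpdir, "sorted.genome"), sort=True
    )
    kept_path = write_chromsizes(
        chromsizes, os.path.join(tmpdir, "unsorted.genome")
    )
    sorted_order = read_chrom_order(sorted_path)
    kept_order = read_chrom_order(kept_path)

print(sorted_order)
print(kept_order)
```

With the sort in place, the caller's order is lost before the genome file is ever written; without it, the file simply reflects whatever order was passed in.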
Thanks for looking at this. I'm happy to file a related issue or alter this PR as requested.
Sorry it's been so long, but returning to this now I think it's a good change. I'll try merging it into the v0.9.1 branch to see what the other tests think...