tanghaibao / quota-alignment

Guided synteny alignment between duplicated genomes (within specified quota constraint)

add a new option to qa_plot.py and a new script chr_remover.py #10

Closed zengxiaofei closed 7 years ago

zengxiaofei commented 7 years ago

a new option --remove:

In many cases, scaffolds and contigs have not been assembled into pseudomolecules, so their chromosome breaks and labels pile up on top of each other in the plot. I have added a new option, --remove, which removes the chromosome breaks and labels from the query genome (--remove=q), from the subject genome (--remove=s), or from both (--remove=qs).

I think the option is very useful and makes the script more flexible :)
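To make the three accepted values concrete, here is a minimal sketch of how a --remove value could be turned into two flags. It is not the code in qa_plot.py: the function name parse_remove and the returned dictionary are hypothetical, and argparse is used only for illustration.

```python
import argparse

def parse_remove(argv):
    """Hypothetical helper: map --remove=q|s|qs to two booleans."""
    parser = argparse.ArgumentParser(
        description="illustrative stand-in, not the real qa_plot.py parser")
    parser.add_argument(
        "--remove", choices=["q", "s", "qs"], default=None,
        help="drop chromosome breaks and labels for query (q), "
             "subject (s), or both (qs)")
    args = parser.parse_args(argv)
    value = args.remove or ""  # option omitted: remove nothing
    return {"query": "q" in value, "subject": "s" in value}

print(parse_remove(["--remove=qs"]))
```

With the option omitted, both flags stay false and the plot would keep all breaks and labels.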

chr_remover.py:

This script removes the specified chromosome(s) from the bed and pep files. It can remove a single chromosome or a set of chromosomes matched by a regular expression.

For example, the grape genome (the dotplot example in README.rst) has chromosomes named "unknown" and "_random", which are sometimes not needed. This script can remove them from the bed and pep files. To remove all "_random" chromosomes, just do this:

python chr_remover.py Vvi.bed Vvi.pep random

To remove chr1 but keep chr10 - chr19, anchor the pattern at the end of the name:

python chr_remover.py Vvi.bed Vvi.pep chr1$
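The matching behaviour in the two commands above can be sketched as follows. This is only an illustration, not the actual chr_remover.py: it assumes the pattern is applied with re.search to the first BED column, that the fourth BED column holds the gene name, and that pep headers start with that name. The helper names filter_bed and filter_fasta are made up for the example.

```python
import re

def filter_bed(bed_lines, pattern):
    """Drop BED lines whose chromosome (column 1) matches the pattern.

    Returns the kept lines and the set of gene names (column 4, assumed)
    that were dropped.
    """
    regex = re.compile(pattern)
    kept, removed = [], set()
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        if regex.search(fields[0]):
            if len(fields) > 3:
                removed.add(fields[3])
        else:
            kept.append(line)
    return kept, removed

def filter_fasta(fasta_lines, removed):
    """Drop FASTA records whose first header token is in `removed`."""
    kept, skip = [], False
    for line in fasta_lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            skip = bool(tokens) and tokens[0] in removed
        if not skip:
            kept.append(line)
    return kept

# Toy data, invented for the example
bed = [
    "chr1\t100\t200\tgeneA\n",
    "chr10\t100\t200\tgeneB\n",
    "chr1_random\t100\t200\tgeneC\n",
]
pep = [">geneA\n", "MKV\n", ">geneB\n", "MLL\n", ">geneC\n", "MAA\n"]

kept_bed, removed = filter_bed(bed, "chr1$")
kept_pep = filter_fasta(pep, removed)
```

Under these assumptions, "chr1$" drops only the chr1 entry, while "random" would drop only the chr1_random entry.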

This script uses argparse. If you don't like that, I can rewrite it using optparse.

README.rst:

  1. argparse requires Python 2.7 or later
  2. the URL of the test data set is invalid