czbiohub-sf / MIDAS

Metagenomic Intra-Species Diversity Analysis (MIDAS)
MIT License

compare with MIDAS1.0 using midas_db_v1.2 #51

Closed zhaoc1 closed 3 years ago

zhaoc1 commented 3 years ago
  1. For the genes flow, we map reads back to the centroids_99 pan-genome and use all centroids_99 that are single-copy marker genes for normalization, not only the marker genes from the representative genomes. Therefore, change the infrastructure of collate_repgenome_markers.py to compute the centroids_99-to-marker-gene mapping for all species.
  2. midas_run_genes:
    • handle overlapping ranges in gene features before binary search
    • aligned reads should be collected before filtering
    • use the centroids_99-to-marker mapping
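
To illustrate the first midas_run_genes item, here is a minimal sketch of one way to handle overlapping gene ranges before a binary search. It assumes that overlaps are resolved by merging them into disjoint intervals and that coordinates are inclusive; the actual change in MIDAS may resolve overlaps differently. The names `merge_ranges`, `build_index`, and `locate` are hypothetical and are not taken from the MIDAS code base.

```python
from bisect import bisect_right


def merge_ranges(ranges):
    """Merge overlapping inclusive (start, end) ranges into sorted, disjoint ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous range: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def build_index(ranges):
    """Return the merged ranges plus their start coordinates for bisecting."""
    merged = merge_ranges(ranges)
    starts = [start for start, _ in merged]
    return merged, starts


def locate(merged, starts, pos):
    """Return the index of the merged range containing pos, or None if there is none."""
    i = bisect_right(starts, pos) - 1
    if i >= 0 and merged[i][0] <= pos <= merged[i][1]:
        return i
    return None


# Hypothetical gene feature coordinates; the first two overlap.
merged, starts = build_index([(10, 20), (15, 30), (40, 50)])
hit = locate(merged, starts, 25)
miss = locate(merged, starts, 35)
```

Binary search over start coordinates only gives a well-defined answer when the ranges are sorted and disjoint, which is why the overlaps need to be resolved first.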