lutteropp / hakmer-ng-redesign

Sample taxa more evenly #62

Closed lutteropp closed 5 years ago

lutteropp commented 5 years ago

We sometimes get something like this, for example in the m2 dataset:

Hypothetical best-case taxon usages if we wouldn't care about overlaps and stuff like this:
SE001: 65.8872 %
SE002: 74.6037 %
SE003: 580.259 %
SE004: 149.343 %
SE005: 286.655 %
SE006: 534.967 %
SE007: 629.693 %
SE008: 274.705 %
SE009: 861.429 %
SE010: 530.466 %
SE011: 406.939 %
SE012: 817.361 %
SE013: 665.242 %
SE014: 711.325 %
SE015: 735.772 %
SE016: 512.475 %
SE017: 722.159 %
SE018: 853.066 %
SE019: 839.111 %
SE020: 64.1098 %

Percentage of reconstructed sequence data per taxon:
SE001: 8.1907 %
SE002: 11.0295 %
SE003: 76.9159 %
SE004: 27.1511 %
SE005: 51.8601 %
SE006: 74.3381 %
SE007: 81.0818 %
SE008: 50.1152 %
SE009: 91.2277 %
SE010: 74.3992 %
SE011: 63.8208 %
SE012: 87.8972 %
SE013: 81.9521 %
SE014: 84.4996 %
SE015: 84.9001 %
SE016: 74.0937 %
SE017: 85.4795 %
SE018: 90.9021 %
SE019: 90.408 %
SE020: 7.72766 %

lutteropp commented 5 years ago

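This means some taxa appear in far more blocks than others: SE001, SE002 and SE020 reach only about 8 to 11 % reconstructed sequence data, while SE009, SE018 and SE019 exceed 90 %.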
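
To put a number on how uneven the sampling is, here is a minimal Python sketch that parses a per-taxon percentage line like the ones pasted above and reports the spread. It is not part of hakmer-ng; the function names `parse_taxon_percentages` and `imbalance_summary` are hypothetical, and the log format is assumed to be exactly `NAME: VALUE %` as in the output above.

```python
import re
import statistics


def parse_taxon_percentages(text):
    """Parse entries of the form 'SE001: 8.1907 %' into {taxon: percent}.

    Assumes the log format shown in this issue; a label such as
    '... per taxon:' is skipped because no number follows its colon.
    """
    pairs = re.findall(r"(\w+):\s*([0-9]*\.?[0-9]+)\s*%", text)
    return {taxon: float(value) for taxon, value in pairs}


def imbalance_summary(percentages):
    """Summarize how unevenly the taxa are covered."""
    values = list(percentages.values())
    mean = statistics.mean(values)
    return {
        "min_taxon": min(percentages, key=percentages.get),
        "max_taxon": max(percentages, key=percentages.get),
        # Ratio between the best and the worst covered taxon.
        "max_over_min": max(values) / min(values),
        # Population standard deviation relative to the mean.
        "coefficient_of_variation": statistics.pstdev(values) / mean,
    }


if __name__ == "__main__":
    # Three entries copied from the m2 output above, for illustration.
    line = ("Percentage of reconstructed sequence data per taxon: "
            "SE001: 8.1907 % SE009: 91.2277 % SE020: 7.72766 %")
    print(imbalance_summary(parse_taxon_percentages(line)))
```

A large `max_over_min` or coefficient of variation would flag datasets like m2 where a few taxa are covered by far fewer blocks than the rest.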

lutteropp commented 5 years ago

Closed, because we probably can't do much about this.