nextstrain / fauna

RethinkDB database to support real-time virus analysis
GNU Affero General Public License v3.0

Sort out what we want to use for geo regions #64

trvrb closed this issue 6 years ago

trvrb commented 7 years ago

Current geo regions are specified here:

https://github.com/nextstrain/fauna/blob/master/source-data/geo_regions.tsv

There are 14 of them:

  1. North Africa
  2. Subsaharan Africa
  3. Europe
  4. Caribbean
  5. Central America
  6. North America
  7. China
  8. South Asia
  9. Japan / Korea
  10. South Pacific
  11. Oceania
  12. South America
  13. Southeast Asia
  14. West Asia

The current proposal is to collapse to 12 regions:

Any other suggestions?

trvrb commented 7 years ago

I've made this change to vdb/flu. I'm going to leave this issue open until vdb/zika, etc. are migrated.

trvrb commented 7 years ago

I've decided to further collapse regions to:

  1. Africa
  2. Europe
  3. North America
  4. China
  5. South Asia
  6. Japan / Korea
  7. Oceania
  8. South America
  9. Southeast Asia
  10. West Asia

I'll update fauna and vdb/flu now. This should improve geo inferences for flu, dengue, mumps, etc. and untangle the global map. This is the original nextflu region map.
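One way to confirm that a dataset has been fully migrated is to check every record's region label against the ten regions listed above. The following is only a sketch: the `unknown_regions` helper and the `"region"` field name are hypothetical, and the labels are spelled as they appear in this comment, which may differ from the spelling actually stored in fauna.

```python
# The ten collapsed regions, copied from the list in this comment.
# Assumption: the database may store these with different spelling or casing.
REGIONS = (
    "Africa",
    "Europe",
    "North America",
    "China",
    "South Asia",
    "Japan / Korea",
    "Oceania",
    "South America",
    "Southeast Asia",
    "West Asia",
)


def unknown_regions(records, allowed=REGIONS):
    """Return the sorted region labels in `records` that are not in `allowed`.

    `records` is an iterable of dicts with a hypothetical "region" key.
    A missing key is reported as None.
    """
    allowed_set = set(allowed)
    seen = {record.get("region") for record in records}
    # key=str keeps the sort well-defined even if None is present.
    return sorted(seen - allowed_set, key=str)
```

Any label this returns, such as one of the older fourteen regions, points to records that still need to be migrated.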

sidneymbell commented 7 years ago

This has become an issue for auspice as well. While auspice needs a sanity check for it, ideally it shouldn't crop up in the data at all and should be fixed on the backend before it ever reaches the front end. Details are cross-posted from https://github.com/nextstrain/auspice/pull/330 (is there a better way to manage this?).

Re:

The ebola dataset is not animating, with the Bezier error occurring when I press play (stack trace printed in previous comment)

Just confirmed this is a dataset-specific bug. It's the same problem that was previously in the dengue build. I suspect this is due to the recent changes to geo metadata definitions in fauna.

I.e., this happens when two locations have a distance of 0 between them, but it still tries to draw a transmission because they have different geo metadata attached to them. For example, Ebola is failing when it tries to draw the transmission between these two points: [image]

This then causes the value for end passed to Bezier here to be -Infinity because the denominator is 0.

This is the same bug that was previously messing with dengue. A dataset rebuild fixed it there, but that is only a band-aid.
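A backend check for the condition described above, two differently labelled locations sitting at identical coordinates, might look like the sketch below. It assumes a hypothetical mapping from location name to a (latitude, longitude) pair; the `colocated_labels` helper is illustrative and is not part of fauna or auspice.

```python
from collections import defaultdict


def colocated_labels(coords):
    """Group location names that share identical (lat, lon) coordinates.

    `coords` maps a location name to a (latitude, longitude) pair.
    Returns a sorted list of sorted name lists, one for each coordinate
    that is shared by two or more names.
    """
    by_point = defaultdict(list)
    for name, point in coords.items():
        by_point[tuple(point)].append(name)
    # Only coordinates claimed by more than one label are a problem:
    # the distance between such labels is 0.
    return sorted(sorted(names) for names in by_point.values() if len(names) > 1)
```

Running something like this before a build would flag the zero-distance pairs up front, rather than leaving them to surface as a drawing error on the front end.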

trvrb commented 6 years ago

This is resolved now.