NESCent / FossilCalibrations

Fossil calibrations database
http://fossilcalibrations.org
BSD 2-Clause "Simplified" License

Ordering of Results in Browse #39

Closed Ksepka closed 9 years ago

Ksepka commented 9 years ago

When browsing using NCBI taxonomy, the calibrations appear in some order but I can't tell what it is - they are not in phylogenetic order, nor alphabetical, nor by ID number. Looks pretty random, especially when there are lots of results.

Proposed solution: Ideally, these would appear in some type of phylogenetic order (e.g., those assigned by us to Metazoa before those assigned to Vertebrata, before those assigned to Mammalia, before those assigned to Primates). This may be difficult and also require the addition of more clade options on the admin end. For now, maybe we should just go alphabetical? I assume that would be relatively easy.

jimallman commented 9 years ago

It looks like the current order of listed calibrations is

ORDER BY HigherTaxon, NodeName, ShortName

In other words, nodes are sorted by each of these properties in turn: first HigherTaxon, then NodeName, then ShortName.

So in effect, the sorting is as you described above: mainly phylogenetic by "major clades", then alphabetical. So I'm not sure where to go from here...
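To make the effect of that clause concrete, here is a minimal sketch that runs it with Python's built-in sqlite3 module rather than the site's MySQL database. The table name `calibrations` and the sample rows are invented for illustration; only the ORDER BY clause is taken from the comment above.

```python
import sqlite3

# In-memory stand-in for the real database (assumption: the table name
# "calibrations" and these rows are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE calibrations (HigherTaxon TEXT, NodeName TEXT, ShortName TEXT)"
)
conn.executemany(
    "INSERT INTO calibrations VALUES (?, ?, ?)",
    [
        ("Mammalia", "Primates", "Author C"),
        ("Aves", "Neornithes", "Author B"),
        ("Plantae", "Angiospermae", "Author D"),
        ("Amphibia", "Anura", "Author E"),
        ("Aves", "Neornithes", "Author A"),
    ],
)

# The clause quoted above: sort by HigherTaxon, then NodeName, then ShortName.
ordered = conn.execute(
    "SELECT HigherTaxon, NodeName, ShortName FROM calibrations "
    "ORDER BY HigherTaxon, NodeName, ShortName"
).fetchall()

for row in ordered:
    print(row)
```

Because HigherTaxon is compared as text, the groups come out alphabetically (Amphibia, Aves, Mammalia, Plantae), and ShortName only breaks the tie between the two rows that share a node.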

Ksepka commented 9 years ago

In this case, the issue may simply be that the old Higher Taxon list is in alphabetical order, so we would get Amphibia, Aves, Mammalia, Plantae, Reptilia instead of Plantae, Amphibia, Reptilia, Aves, Mammalia. I will try to confirm this but that will have to wait until the site is back up.

jimallman commented 9 years ago

In this case, the issue may simply be that the old Higher Taxon list is in alphabetical order

Yes, there's no inherent order to the rows in a SQL table, so sorting on the HigherTaxon name simply gives alphabetical order. To assert an arbitrary order, we'll need to sort on another field, named 'display_order' or somesuch.

Ksepka commented 9 years ago

OK, I looked closely at a series of searches today. I understand that the Higher Taxon list is the first sorter (Amphibia, Anthozoa, Arthropoda, etc.). This does create awkward ordering at the broadest scales (frogs first, then bugs, then birds, etc.). If it is possible to re-arrange the order for Higher Taxon, that would be ideal. Alphabetical is fine as the second sorter. Short name is fine too as third, as this should only come into play when two authors calibrate the same node.

I feel like this is a very old Higher Taxon list, but it should hold up fine for the launch. Eventually we should consider adding other options for Higher Taxon as the database grows. For example, just Plantae is no problem since we only have about a dozen plant calibrations, but we will want more options (angiosperms, gymnosperms, etc.) if we find ourselves with 100 plants in the long term.

Suggested order below (using existing fields): Life, Plantae, Metazoa, Anthozoa, Brachiopoda, Mollusca, Arthropoda, Echinodermata, Vertebrata, Amphibia, Reptilia, Aves, Mammalia, Rodentia, Primates

jimallman commented 9 years ago

OK, I've added a DisplayOrder column to the L_HigherTaxa table to handle arbitrary ordering. Current values are:

mysql> SELECT * FROM L_HigherTaxa ORDER BY DisplayOrder;
+-------------+---------------+------------+--------------+
| HighTaxonID | HigherTaxon   | Parent     | DisplayOrder |
+-------------+---------------+------------+--------------+
|           1 | Life          | Life       |          100 |
|           5 | Plantae       | Life       |          200 |
|           3 | Metazoa       | Life       |          300 |
|          16 | Anthozoa      | Metazoa    |          400 |
|          14 | Brachiopoda   | Metazoa    |          500 |
|          15 | Mollusca      | Metazoa    |          600 |
|          17 | Arthropoda    | Metazoa    |          700 |
|          13 | Echinodermata | Metazoa    |          800 |
|           6 | Vertebrata    | Metazoa    |          900 |
|           7 | Amphibia      | Vertebrata |         1000 |
|           8 | Reptilia      | Vertebrata |         1100 |
|           9 | Aves          | Reptilia   |         1200 |
|          10 | Mammalia      | Vertebrata |         1300 |
|          12 | Rodentia      | Mammalia   |         1400 |
|          11 | Primates      | Mammalia   |         1500 |
+-------------+---------------+------------+--------------+

Note that I've left lots of space between these values to facilitate inserting new taxa in areas like Plantae. We don't currently have a friendly web tool for managing this list, but the gaps should make it a trivial operation to add and reorder higher taxa using lower-level database management tools.
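As a rough sketch of how the gaps could be used, the snippet below inserts one new taxon between Plantae and Metazoa without touching any existing row. It uses Python's sqlite3 module as a stand-in for MySQL and loads only the first three rows of the table above. The new record (HighTaxonID 18, "Angiospermae") and the helper `display_order_of` are hypothetical, chosen only to illustrate the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE L_HigherTaxa ("
    "HighTaxonID INTEGER PRIMARY KEY, HigherTaxon TEXT, "
    "Parent TEXT, DisplayOrder INTEGER)"
)
# First three rows of the listing above.
conn.executemany(
    "INSERT INTO L_HigherTaxa VALUES (?, ?, ?, ?)",
    [(1, "Life", "Life", 100), (5, "Plantae", "Life", 200), (3, "Metazoa", "Life", 300)],
)


def display_order_of(name):
    """Look up the DisplayOrder of an existing higher taxon (hypothetical helper)."""
    row = conn.execute(
        "SELECT DisplayOrder FROM L_HigherTaxa WHERE HigherTaxon = ?", (name,)
    ).fetchone()
    return row[0]


# Pick a value halfway between the two neighbours, so no existing row changes.
new_order = (display_order_of("Plantae") + display_order_of("Metazoa")) // 2

# Hypothetical new record; the ID and name are assumptions for illustration.
conn.execute(
    "INSERT INTO L_HigherTaxa VALUES (?, ?, ?, ?)",
    (18, "Angiospermae", "Plantae", new_order),
)

taxa_in_order = [
    r[0] for r in conn.execute("SELECT HigherTaxon FROM L_HigherTaxa ORDER BY DisplayOrder")
]
print(taxa_in_order)
```

With steps of 100, each gap can be halved several times before any renumbering would be needed.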

jimallman commented 9 years ago

I've modified the Browse Calibrations page to use the new order as its primary sort. The higher taxa are also listed in this order on the Edit Calibrations page, under section "2. Provide some basic information".
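The site's actual query is not shown in this thread, but a sort of this kind could look roughly like the sketch below, again using Python's sqlite3 module as a stand-in for MySQL. The `calibrations` table, its rows, and the join on the HigherTaxon name are assumptions; the DisplayOrder values are copied from the listing above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE L_HigherTaxa (HigherTaxon TEXT, DisplayOrder INTEGER)")
conn.execute(
    "CREATE TABLE calibrations (HigherTaxon TEXT, NodeName TEXT, ShortName TEXT)"
)
# DisplayOrder values as listed above.
conn.executemany(
    "INSERT INTO L_HigherTaxa VALUES (?, ?)",
    [("Plantae", 200), ("Amphibia", 1000), ("Aves", 1200), ("Mammalia", 1300)],
)
# Hypothetical calibration rows, for illustration only.
conn.executemany(
    "INSERT INTO calibrations VALUES (?, ?, ?)",
    [
        ("Mammalia", "Primates", "Author C"),
        ("Aves", "Neornithes", "Author B"),
        ("Plantae", "Angiospermae", "Author D"),
        ("Amphibia", "Anura", "Author E"),
        ("Aves", "Neornithes", "Author A"),
    ],
)

# Primary sort on DisplayOrder; NodeName and ShortName remain the tie-breakers.
browse_rows = conn.execute(
    "SELECT c.HigherTaxon, c.NodeName, c.ShortName "
    "FROM calibrations AS c "
    "JOIN L_HigherTaxa AS t ON t.HigherTaxon = c.HigherTaxon "
    "ORDER BY t.DisplayOrder, c.NodeName, c.ShortName"
).fetchall()

for row in browse_rows:
    print(row)
```

In this sketch the groups follow the curated order (Plantae, Amphibia, Aves, Mammalia) instead of the alphabetical one, while rows within a group are still sorted by node and short name.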

Ksepka commented 9 years ago

In Berlin - will look at all updates when I return, please ping me if I do not do so by mid-week.

jimallman commented 9 years ago

Will do, thanks.

Ksepka commented 9 years ago

I wanted to revisit this - I feel like the Chordate section is good, but anticipating future growth we may have gone a little sparse on Plants / Inverts. Jim A., is it easy to insert a few more or should we worry about that down the road?

jimallman commented 9 years ago

It's fairly straightforward to add and modify this list, but it's a direct operation on the database, so best handled on the production site. The table L_HigherTaxa holds this list, and the DisplayOrder column dictates the display order used, as described above.

If you can come up with a list of new records for this table, it should be easy for @dleehr to add them using the MySQL command-line tools. Note that the gaps in DisplayOrder values should allow you to add new taxa without modifying the old ones.

kcranston commented 9 years ago

Are we ok to launch with the sort order as is and worry about inserting new values down the road when we have additional data? @Ksepka

pdpolly commented 9 years ago

Ok with me (speaking from technical end). Mark Sutton, me, and possibly others are capable of dealing with additions or sorting.

Ksepka commented 9 years ago

The current order is fine for the taxon pool we have ready for launch. I envision we will need to tweak it by, for example, adding more plant sorters, but that won't be necessary for launch given the handful of plants we have at the moment.

On Mon, Jan 26, 2015 at 2:28 PM, P. David Polly notifications@github.com wrote:

Ok with me (speaking from technical end). Mark Sutton, me, and possibly others are capable of dealing with additions or sorting.


hlapp commented 9 years ago

@Ksepka - thanks for reviewing. Based on your and @pdpolly's feedback, I'll go ahead and close this.