OpenTreeOfLife / germinator

miscellaneous scripts and data for concerns that span more than one of the Open Tree code repositories: integration tests, system statistics, etc.
BSD 2-Clause "Simplified" License
21 stars 7 forks source link

Treatment of taxonomy in tree_of_life method results #70

Closed jar398 closed 8 years ago

jar398 commented 8 years ago

Currently tm-lite is returning this:

# conflict/support properties have:
"supported_by" : {
  "taxonomy" : "2.9draft12"
},
# source_id_map has:
  "taxonomy" : {
    "name" : "ott",
    "version" : "2.9draft12"
  },

A few problems with this:

Proposal:

# conflict/support properties have:
"supported_by" : {
  "ott2.9draft12" : "ott12345"
},
# source_id_map has:
  "ott2.9draft12" : {
    "taxonomy" : "ott2.9draft12"
  },

Explanation:

The only information tm-lite receives from propinquity about the taxonomy is something like this:

"taxonomy_version": "ott2.9draft12"

Currently tm-lite "taxonomy_version" to get the "name" and "version" properties, which it then passes on to clients. Maybe tm-lite shouldn't be in the business of parsing these things and the job should be left up to clients - but tell me if I'm wrong. This proposal simply passes the value of that field on to API clients.

mtholder commented 8 years ago

I agree with the idea of making the value in supported by a node identifier, and the "ott12345" idea seems like a fine way to go for the reasons that @jar398 gave.

I don't have a strong feeling about whether the key is called "taxonomy" or "ott2.9draft12". @jimallman's code needs to be able to tell whether the support is from taxonomy or not. But it can do this by using the key. or the presence of a taxonomy property in the entry in source_id_map These 2 implementation choices seem almost identical to me, so I don't really care which we use. In both cases the taxonomy is an exception (either because it has a special key, or because its source_id_map entry has a taxonomy property). If we want to support multiple taxonomies, that would argue for the second option. But I was not aware that we are planning for that.

josephwb commented 8 years ago

The proposal

# conflict/support properties have:
"supported_by" : {
  "ott2.9draft12" : "ott12345"
},
# source_id_map has:
  "ott2.9draft12" : {
    "taxonomy" : "ott2.9draft12"
  },

looks fine to me, fwiw.

kcranston commented 8 years ago

I am also ok with the proposal.

josephwb commented 8 years ago

Ok, I will try out this proposal on test. Will be up in ~10 minutes.

josephwb commented 8 years ago

This is what a result looks like:

curl -X POST http://localhost:7474/db/data/ext/tree_of_life_v3/graphdb/node_info -H "content-type:application/json" -d '{"ott_id":81461}'
{
  "partial_path_of" : {
    "pg_2573@tree5959" : "node1011373"
  },
  "supported_by" : {
    "2.9draft12" : "ott81461"
  },
  "source_id_map" : {
    "2.9draft12" : {
      "taxonomy" : "2.9draft12"
    },
    "pg_2573@tree5959" : {
      "git_sha" : "e163a9245c15aefbba79defb685df98d5a73cab3",
      "tree_id" : "tree5959",
      "study_id" : "pg_2573"
    },
    "pg_2822@tree6569" : {
      "git_sha" : "e163a9245c15aefbba79defb685df98d5a73cab3",
      "tree_id" : "tree6569",
      "study_id" : "pg_2822"
    }
  },
  "taxon" : {
    "tax_sources" : [ "ncbi:8782", "worms:1836", "gbif:212", "irmng:1142" ],
    "unique_name" : "Aves",
    "name" : "Aves",
    "rank" : "class",
    "ott_id" : 81461
  },
  "num_tips" : 14302,
  "terminal" : {
    "pg_2822@tree6569" : "ott240568"
  },
  "node_id" : "ott81461"
}
josephwb commented 8 years ago

DB deployed on test.

curl -X POST https://test.opentreeoflife.org/v3/tree_of_life/node_info -H "content-type:application/json" -d '{"ott_id":81461}'
jar398 commented 8 years ago

I was mistaken, propinquity has taxonomy_version: 2.9draft5, not taxonomy_version: ott2.9draft5. That yields a non-ncname. So somewhere along the line 'ott' will need to be prefixed to the key. Then whether the "taxonomy" property is ott2.9draft5 or not I'm not sure; it could be "name"

(eventually "ott" or some other taxonomy identification will have be passed along from propinquity to tm-lite; it would be provincial for the code to assume it's only going to be used with OTT. but we can fix that later, the API spec is the important question now.)

josephwb commented 8 years ago

I think a0076432161ff09e89d9016dfa5f36231011cba4 @josephwb josephwb committed 2 minutes ago satisfies things. Deployed on test:

curl -X POST https://test.opentreeoflife.org/v3/tree_of_life/node_info -H "content-type:application/json" -d '{"ott_id":81461}'
{
  "partial_path_of" : {
    "pg_2573@tree5959" : "node1011373"
  },
  "supported_by" : {
    "ott2.9draft12" : "ott81461"
  },
  "source_id_map" : {
    "ott2.9draft12" : {
      "taxonomy" : "ott2.9draft12"
    },
    "pg_2573@tree5959" : {
      "git_sha" : "e163a9245c15aefbba79defb685df98d5a73cab3",
      "tree_id" : "tree5959",
      "study_id" : "pg_2573"
    },
    "pg_2822@tree6569" : {
      "git_sha" : "e163a9245c15aefbba79defb685df98d5a73cab3",
      "tree_id" : "tree6569",
      "study_id" : "pg_2822"
    }
  },
  "taxon" : {
    "tax_sources" : [ "ncbi:8782", "worms:1836", "gbif:212", "irmng:1142" ],
    "unique_name" : "Aves",
    "name" : "Aves",
    "rank" : "class",
    "ott_id" : 81461
  },
  "num_tips" : 14302,
  "terminal" : {
    "pg_2822@tree6569" : "ott240568"
  },
  "node_id" : "ott81461"
}
jar398 commented 8 years ago

Looks great to me.

I will wait a little while before merging to v3 because copying over the db is a pain and who knows, we may have to do it again soon.