OpenTreeOfLife / taxomachine

taxonomy graphdb
Other
7 stars 4 forks source link

Display distinguishing features when there are homonyms #28

Closed jar398 closed 10 years ago

jar398 commented 10 years ago

E.g. in the opentree webapp if I enter 'Rhodotorula', all I see is 'Rhodotorula (Fungi)' repeated eleven times. They are different taxa but I have no way to tell which is which. Some distinguishing characteristic of each should be transmitted to the webapp so that the options can be displayed differently from one another. Worst but possible option: the webapp could show the OTT id. Not very helpful but better than nothing. Mediocre: show NCBI or GBIF id Better: show a distinguishing ancestor taxon (e.g. use the 'uniquename' field in the taxonomy, or one computed by taxomachine)

chinchliff commented 10 years ago

@jar398 @jimallman This change has been made and should appear automatically in the results of the autocomplete box (the "name" field in the results will contain the "uniquename" from the taxonomy if it is not empty) . Please confirm and close this issue if so.

jar398 commented 10 years ago

Not done. For example try "Neoptera". It shows "Neoptera (Insecta)" and "Neoptera (Insecta)". For a particularly perverse example try "Endoxyla".

chinchliff commented 10 years ago

That appears to be the old format. Now it should never show that format. It should be returning the uniqname if there is one, otherwise it just returns the name. Maybe the webservice you are calling is using an older taxomachine plugin? I am fairly sure I pushed those changes but I will look into this to verify.

On Monday, April 7, 2014, Jonathan A Rees notifications@github.com wrote:

Not done. For example try "Neoptera". It shows "Neoptera (Insecta)" and "Neoptera (Insecta)". For a particularly perverse example try "Endoxyla".

Reply to this email directly or view it on GitHubhttps://github.com/OpenTreeOfLife/taxomachine/issues/28#issuecomment-39745231 .

chinchliff commented 10 years ago

@jar398 where were you calling the service when you got the old format for the unique name?

jar398 commented 10 years ago

See the search box at http://dev.opentreeoflife.org/opentree/argus/otol.draft.22@215 .

On Wed, Apr 16, 2014 at 10:42 AM, Cody Hinchliff notifications@github.comwrote:

@jar398 https://github.com/jar398 where were you calling the service when you got the old format for the unique name?

— Reply to this email directly or view it on GitHubhttps://github.com/OpenTreeOfLife/taxomachine/issues/28#issuecomment-40606941 .

jimallman commented 10 years ago

The search box on dev is calling ot9, perhaps using a deprecated method? Here's the search for Endoxyla as a simplified curl request and the full response:

curl 'http://ec2-54-203-194-13.us-west-2.compute.amazonaws.com/taxomachine/ext/TNRS/graphdb/autocompleteBoxQuery' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-binary '{"queryString":"Endoxyla","contextName":"All life"}'
[ {
  "ottId" : 429888,
  "nodeId" : 149744,
  "exact" : false,
  "name" : "Endoxyla (Fungi)",
  "higher" : true
}, {
  "ottId" : 4054899,
  "nodeId" : 149672,
  "exact" : false,
  "name" : "Endoxyla (Fungi)",
  "higher" : true
}, {
  "ottId" : 1098009,
  "nodeId" : 1590418,
  "exact" : false,
  "name" : "Endoxyla (Insects)",
  "higher" : true
} ]
jimallman commented 10 years ago

Just confirmed: Cody's fix #30 is in place on ot9. Checking the called method now, in case I've got it wrong...

jimallman commented 10 years ago

It seems like we're using the new code -- at least I can follow the calls from autocompleteBoxQuery to getUniqueName -- but the results don't reflect this. Is it possible the neo4j data is old or incomplete, without the expected uniqname properties?

String uniqueName = (String) getMatchedNode().getProperty("uniqname");
if (uniqueName.length() > 0) {
    return uniqueName;
} else {
    return (String) getMatchedNode().getProperty("name");
}
chinchliff commented 10 years ago

If the service were using the updated code shown by Jim, then I don't see how it would be returning higher taxon names in parentheses. That is the old behavior (there was code that would get the parent and return its name in parentheses). Is ot9 actually calling taxomachine services on ot9, or are the calls pointing somewhere else?

On Wed, Apr 16, 2014 at 1:13 PM, Jim Allman notifications@github.comwrote:

It seems like we're using the new code -- at least I can follow the calls from autocompleteBoxQuery to getUniqueName -- but the results don't reflect this. Is it possible the neo4j data is old or incomplete, without the expected uniqname properties?

String uniqueName = (String) getMatchedNode().getProperty("uniqname");if (uniqueName.length() > 0) { return uniqueName;} else { return (String) getMatchedNode().getProperty("name");}

— Reply to this email directly or view it on GitHubhttps://github.com/OpenTreeOfLife/taxomachine/issues/28#issuecomment-40625748 .

jar398 commented 10 years ago

ot9 would only be updated to a new version of the taxomachine repo if someone did it. Do we know which commit fixed the problem? Jonathan

ot9 Linux ip-10-227-56-121 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64

The programs included with the Debian GNU/Linux system are free software; the exact distribution terms for each program are described in the individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent permitted by applicable law. Last login: Wed Apr 16 16:47:19 2014 from 184.3.219.202 (venv)opentree@ip-10-227-56-121:~$ cd repo/taxomachine (venv)opentree@ip-10-227-56-121:~/repo/taxomachine$ git log commit 9917b43d9d46a29422823e31c87f11e674ba9fae Merge: ef82b75 8872f70 Author: chinchliff cody.hinchliff@gmail.com Date: Fri Mar 28 21:28:32 2014 -0400

Merge pull request #32 from OpenTreeOfLife/gpl

Add GPL

commit 8872f70eee44ca5050939a5c480594f54b0dfe3a Author: Jonathan A Rees jar398@mumble.net Date: Fri Mar 28 20:57:39 2014 -0400

Fix LICENSE

commit 8bf2f7dd1abb2cedd690b0e0aa1f8ab5996f3e1c Author: Jonathan A Rees jar398@mumble.net Date: Fri Mar 28 17:29:48 2014 -0400

Add GPL

commit ef82b753a73cf99b8d762bbe2a04d13d2e9fbd56 Merge: 2462d23 f583b7f

On Wed, Apr 16, 2014 at 3:47 PM, Cody Hinchliff notifications@github.comwrote:

If the service were using the updated code shown by Jim, then I don't see how it would be returning higher taxon names in parentheses. That is the old behavior (there was code that would get the parent and return its name in parentheses). Is ot9 actually calling taxomachine services on ot9, or are they called from somewhere else?

On Wed, Apr 16, 2014 at 1:13 PM, Jim Allman notifications@github.comwrote:

It seems like we're using the new code -- at least I can follow the calls from autocompleteBoxQuery to getUniqueName -- but the results don't reflect this. Is it possible the neo4j data is old or incomplete, without the expected uniqname properties?

String uniqueName = (String) getMatchedNode().getProperty("uniqname");if (uniqueName.length() > 0) { return uniqueName;} else { return (String) getMatchedNode().getProperty("name");}

— Reply to this email directly or view it on GitHub< https://github.com/OpenTreeOfLife/taxomachine/issues/28#issuecomment-40625748>

.

— Reply to this email directly or view it on GitHubhttps://github.com/OpenTreeOfLife/taxomachine/issues/28#issuecomment-40643473 .

chinchliff commented 10 years ago

Should be ​this one:

https://github.com/OpenTreeOfLife/taxomachine/commit/2c69f810ff4048db88999368431f57849bca48d6

jimallman commented 10 years ago

Yes @chinchliff, I thought the same. And this commit is definitely in the log on ot9. Any chance it didn't compile and is not actually running? I'll push again in a moment, just to be sure.

jimallman commented 10 years ago

I pushed default components to ot9, including both treemachine and taxomachine. Treemachine needed a minor bump, but taxomachine was "already up to date".

Result: No change in the autocomplete results for Endoxyla.

jimallman commented 10 years ago

Update: I pushed the taxomachine repo on ot9 back by a revision via git reset HEAD~, just to force a rebuild of the taxomachine plugins. No change to the results for Endoxyla. The messages from Maven suggest that most/all classes were already up to date, so maybe I need to do something more drastic to be absolutely sure it recompiles...?

jimallman commented 10 years ago

Is ot9 actually calling taxomachine services on ot9, or are the calls pointing somewhere else?

The calls are coming from the main webapp on ot3. I see the autocomplete requests in the neo4j logs on ot9:

# ssh admin@0t9
$ cd /home/opentree/neo4j-taxomachine/data/log
$ tail -f console.log http.log
==> http.log <==
127.0.0.1 - - [16/Apr/2014:21:53:23 +0000] "POST /db/data/ext/TNRS/graphdb/getContextForNames HTTP/1.1" 200 682 "-" "Java/1.7.0_25"
127.0.0.1 - - [16/Apr/2014:21:53:24 +0000] "OPTIONS /db/data/ext/TNRS/graphdb/autocompleteBoxQuery HTTP/1.1" 204 0 "http://dev.opentreeoflife.org/opentree/argus/otol.draft.22@215" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.152 Safari/537.36"
127.0.0.1 - - [16/Apr/2014:21:53:24 +0000] "POST /db/data/ext/TNRS/graphdb/autocompleteBoxQuery HTTP/1.1" 200 458 "http://dev.opentreeoflife.org/opentree/argus/otol.draft.22@215" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.152 Safari/537.36"

I assume this is using its own (local) neo4j database, yes? Could that data be "stale"?

jar398 commented 10 years ago

Well I'm baffled... I looked at TNRSMatchSet.java on the server, verified that it had the right code in it, stopped neo4j, recompiled using mvn_serverplugins.sh, started neo4j... and still no change. Doesn't make any sense. Made sure the config file points to ot9.

On Wed, Apr 16, 2014 at 5:41 PM, Jim Allman notifications@github.comwrote:

I pushed default components to ot9, including both treemachine and taxomachine. Treemachine needed a minor bump, but taxomachine was "already up to date".

Result: No change in the autocomplete results for Endoxyla.

— Reply to this email directly or view it on GitHubhttps://github.com/OpenTreeOfLife/taxomachine/issues/28#issuecomment-40655999 .

mtholder commented 10 years ago

Just a wild guess here; is the behavior here determined by the version of the taxomachine code that was used when the taxonomy is read in to create the graph db (rather than the version of the code used to add web services to an existing db)?

chinchliff commented 10 years ago

I also uploaded a newly built taxonomy database using ott2.6 to ot9 today...

On Wed, Apr 16, 2014 at 6:18 PM, Mark T. Holder notifications@github.comwrote:

Just a wild guess here; is the behavior here determined by the version of the taxomachine code that was used when the taxonomy is read in to create the graph db (rather than the version of the code used to add web services to an existing db)?

— Reply to this email directly or view it on GitHubhttps://github.com/OpenTreeOfLife/taxomachine/issues/28#issuecomment-40659603 .

chinchliff commented 10 years ago

Could it be that the taxomachine plugin is somehow not getting replaced by the push script? I had done some ugly things with file permissions a few weeks ago, maybe some of those are still lingering?

Here is an example of the expected behavior. I just tested this again on my local machine using ott 2.6 and the current taxomachine master branch code.

cody@chromium:~/phylo/taxomachine$ curl -X POST http://localhost:7474/db/data/ext/TNRS/graphdb/autocompleteBoxQuery -H "content-type:application/json" -d '{"queryString":"Endoxyla","contextName":"All life"}' [ { "ottId" : 1098009, "nodeId" : 1590418, "exact" : false, "name" : "Endoxyla (genus in family Cossidae)", "higher" : true }, { "ottId" : 4054899, "nodeId" : 149672, "exact" : false, "name" : "Endoxyla (if:1813,ncbi:1289957,gbif:2572474)", "higher" : true }, { "ottId" : 429888, "nodeId" : 149744, "exact" : false, "name" : "Endoxyla (ncbi:255778)", "higher" : true } ]

On Wed, Apr 16, 2014 at 7:33 PM, Cody Hinchliff cody.hinchliff@gmail.comwrote:

I also uploaded a newly built taxonomy database using ott2.6 to ot9 today...

On Wed, Apr 16, 2014 at 6:18 PM, Mark T. Holder notifications@github.comwrote:

Just a wild guess here; is the behavior here determined by the version of the taxomachine code that was used when the taxonomy is read in to create the graph db (rather than the version of the code used to add web services to an existing db)?

— Reply to this email directly or view it on GitHubhttps://github.com/OpenTreeOfLife/taxomachine/issues/28#issuecomment-40659603 .

jimallman commented 10 years ago

Could it be that the taxomachine plugin is somehow not getting replaced by the push script? I had done some ugly things with file permissions a few weeks ago, maybe some of those are still lingering?

I suppose it's possible. As I said, Maven reported that all classes were up to date, so it bailed on the final plugin compilation. At least that's my understanding. Here's the output from attempting another push to taxomachine: https://gist.github.com/jimallman/10946715

And here's the output when I force the taxomachine on ot9 back to a previous version, then push again (once again, it stops compiling because classes are "up to date"): https://gist.github.com/jimallman/10946897

And one more time, after forcing it back to before the commit in question and clearing old .class files from /home/opentree/repo/taxomachine/target/classes/org/opentree/tnrs: https://gist.github.com/jimallman/10947236

jimallman commented 10 years ago

Note too the "sanity check" on the final .jar file in that last gist. As far as I can tell, we're running the one that just built, from the latest taxomachine code.

Hm... @jar398, are we caching these autocomplete results? Is it possible the code is right, but we're showing (really) old results?

UPDATE: Scratch that, I forgot these are POST requests.

jar398 commented 10 years ago

On Wed, Apr 16, 2014 at 9:38 PM, Jim Allman notifications@github.comwrote:

Could it be that the taxomachine plugin is somehow not getting replaced by the push script? I had done some ugly things with file permissions a few weeks ago, maybe some of those are still lingering?

I suppose it's possible. As I said, Maven reported that all classes were up to date, so it bailed on the final plugin compilation. At least that's my understanding.

Oh, I think I see the problem. I forced recompilation on ot9 by doing ./mvn_serviplugins.sh , but didn't copy the .jar file to the plugins directory.

Probably the setup script should be altered so that it creates a symbolic link, rather than a copy.

I don't understand how it got into this state, however. If the repo doesn't change, the .jar file doesn't need to be copied. If it does, the script that recompiles also copies the .jar file.

Oh, I get it! Taxomachine gets updated in two different places in two different ways - once as a library for the maven repository and once as a neo4j plugin. I don't know why. In any case updating for the first way causes the check for up to dateness in the second way to succeed when it ought to fail.

This will require a rewrite if we really need both forms. Cody, I think you added this to the script:

if git_refresh OpenTreeOfLife taxomachine || [ ! -d ~/.m2/repository/org/opentree/taxomachine ]; then (cd repo/taxomachine; sh mvn_install.sh) fi

Is this really necessary? What uses taxomachine as a library? If it's not necessary the fix is trivial, we just delete this command. Otherwise a more significant rewrite will be needed so as to combine the two taxomachine builds into a single step.

Jonathan

Here's the output from attempting another push to taxomachine: https://gist.github.com/jimallman/10946715

And here's the output when I force the taxomachine on ot9 back to a previous version, then push again (once again, it stops compiling because classes are "up to date"): https://gist.github.com/jimallman/10946897

And one more time, after forcing it back to before the commit in question and clearing old .class files from /home/opentree/repo/taxomachine/target/classes/org/opentree/tnrs: https://gist.github.com/jimallman/10947236

— Reply to this email directly or view it on GitHubhttps://github.com/OpenTreeOfLife/taxomachine/issues/28#issuecomment-40672197 .

jar398 commented 10 years ago

In any case it's working now, but the script will have to be fixed before any more changes to the taxomachine plugin can be mechanically deployed. So ths issue can be closed.

On Wed, Apr 16, 2014 at 9:58 PM, Jonathan A Rees rees@mumble.net wrote:

On Wed, Apr 16, 2014 at 9:38 PM, Jim Allman notifications@github.comwrote:

Could it be that the taxomachine plugin is somehow not getting replaced by the push script? I had done some ugly things with file permissions a few weeks ago, maybe some of those are still lingering?

I suppose it's possible. As I said, Maven reported that all classes were up to date, so it bailed on the final plugin compilation. At least that's my understanding.

Oh, I think I see the problem. I forced recompilation on ot9 by doing ./mvn_serviplugins.sh , but didn't copy the .jar file to the plugins directory.

Probably the setup script should be altered so that it creates a symbolic link, rather than a copy.

I don't understand how it got into this state, however. If the repo doesn't change, the .jar file doesn't need to be copied. If it does, the script that recompiles also copies the .jar file.

Oh, I get it! Taxomachine gets updated in two different places in two different ways - once as a library for the maven repository and once as a neo4j plugin. I don't know why. In any case updating for the first way causes the check for up to dateness in the second way to succeed when it ought to fail.

This will require a rewrite if we really need both forms. Cody, I think you added this to the script:

if git_refresh OpenTreeOfLife taxomachine || [ ! -d ~/.m2/repository/org/opentree/taxomachine ]; then (cd repo/taxomachine; sh mvn_install.sh) fi

Is this really necessary? What uses taxomachine as a library? If it's not necessary the fix is trivial, we just delete this command. Otherwise a more significant rewrite will be needed so as to combine the two taxomachine builds into a single step.

Jonathan

Here's the output from attempting another push to taxomachine: https://gist.github.com/jimallman/10946715

And here's the output when I force the taxomachine on ot9 back to a previous version, then push again (once again, it stops compiling because classes are "up to date"): https://gist.github.com/jimallman/10946897

And one more time, after forcing it back to before the commit in question and clearing old .class files from /home/opentree/repo/taxomachine/target/classes/org/opentree/tnrs: https://gist.github.com/jimallman/10947236

— Reply to this email directly or view it on GitHubhttps://github.com/OpenTreeOfLife/taxomachine/issues/28#issuecomment-40672197 .

jar398 commented 10 years ago

Closing. Follow up here: https://github.com/OpenTreeOfLife/opentree/issues/264