Closed saramsey closed 3 years ago
These are just the numbers as presented in the JSON. I would suggest that the GUI has no idea what precision is appropriate. But rather the code that is computing those numbers has the best chance of understanding what precision is appropriate and should round accordingly. I suspect if that code rounds appropriately and puts that in the JSON, it will automatically be displayed right on the screen.
If you push back, we can certainly do this in the presentation layer, but that doesn't seem like the best place to put it.
Are there times when both 0.94 and 0.9998 might be appropriate?
Yeah, these are just the numbers that @finnagin ML model spits out. IMHO, 5 significant digits is sufficient. Can push a fix if urgent.
@edeutsch wrote:
I would suggest that the GUI has no idea what precision is appropriate.
I respectfully disagree. In a GUI meant for a human, we (the users) probably want to see no more than three significant digits. Cases where I want to see more digits than that are relatively rare for this kind of statistical summarization/estimation. (Yes I am aware that in other disciplines there are measurements like speed of light where we care about precision to 10 digits). We can have the precision configurable in the UI if someone really wants more (Even better -- just default to three, and have a "click to show all digits".). I think it would be sub-optimal if the reasoning code pre-truncated its significance estimates based on an assumed number of desired significant digits.
I assume we are all on the same page that when I write "significant digits", I mean like this:
I realize that reasonable people can disagree on this. But having the reasoning layer pre-truncate digits is imposing information loss, which goes against a basic principle that we don't needlessly throw away information prematurely. The GUI is not the main consumer of these estimates. If we feel that we must convey precision (and that is a reasonable point of view), I think the appropriate thing to do is to provide a separate edge attribute that numerically conveys precision.
AFAIK, all the various R packages for statistical estimation do not truncate their estimates based on standard error. It's assumed that that will be done in the presentation layer.
okay, I'm happy to go along with this. @isbluis want to implement this?
Sure thing.
Is there a full list of entities for which we should do this? e.g. Jaccard index...
@isbluis edges that have attributes with the following names:
probability_drug_treats
jaccard_index
ngd
paired_concept_freq
observed_expected_ratio
chi_square
Great, thanks @dkoslicki
Can we just put the fix in devLM for the time being? I'm scared of something breaking in /beta, this close to the demo. (No disparagement implied; just asserting Murphy's Law).
Sure thing, though I have a medical appointment in a few minutes and won't be able to do this until tonight.
From my standpoint this is not urgent. LM, please take care of yourself.
Ok, this is ready in devLM. (
It only works on the edge_attributes that @dkoslicki listed. Do we also want to do the same for other values, such as weight, confidence, etc?
Edge attributes look good to me! https://arax.rtx.ai/devLM/?m=1552
And for other attributes, I would say yes e.g. here: https://arax.rtx.ai/devLM/?m=1553
Ok, I have added weight and confidence as well.
Let me know if it passes muster, and I'll push the changes (if desired)
3 significant figures it is! Looks fancy!
push it!
Ok, pushed to demo/
tested and rolled out to beta. Nothing like night-before code rollouts. Would be good to test the examples 1,2,3,4 again in the morning. Seems to be in good shape to me!
thanks!
Hi @dkoslicki , @saramsey ,
@isbluis edges that have attributes with the following names:
probability_drug_treats jaccard_index ngd paired_concept_freq observed_expected_ratio chi_square
It looks like ngd
has been renamed to normalized_google_distance
. Or perhaps a new metric? Either way, I suppose I should also add it to the list of values to truncate, but let me know otherwise. Sorry if I missed this.
Any others?
Closing this unless there are other attributes that require truncation.
nb. normalized_google_distance
was added to the list with commit e1251c83d92af16f808cbf391e4b298b2b9641b3