plazi / BLR-website

BLR website: author is confusing and needs explanation? #48

Open myrmoteras opened 4 years ago

myrmoteras commented 4 years ago

When I use the search option "author", I am getting mainly authorities (author of a new species) and not authors of an article, which most people would expect.

We need to explain the search concepts somewhere.

For example, when I search for Fisher using the "All" option, I get anything from an author or an authority to a string such as "Fisher Hill" in a location.

punkish commented 4 years ago

So I investigated this issue, and it does seem to be a confusion rooted partly in how the BLR website works, though not necessarily a bug (cc @teodorgeorgiev).

Part of the confusion comes from the terms author and authorityName being used interchangeably. Note that the database does not have any field called author. When you search author (the radio button you can choose under the search field on the BLR website), you are actually searching against authorityName. (I don't know who told me or when, but someone did make me believe that authorityName was the author of the treatment, so that is what I implemented; I didn't just come up with that on my own.)

image

This is confirmed by the query sent to Zenodeo http://z2.punkish.org/v2/treatments?facets=true&stats=true&authorityName=fisher which causes Zenodeo to search the database for all records where the authorityName starts with fisher

Indeed, 170 records are returned and displayed in the website.

A related confusion arises because the URL shown in the browser's address bar after the search is http://blr.uplaysandbox.website/?facets=true&q=fisher&resource=treatments&stats=true&type=authorityName which is different from the query sent to Zenodeo. I am assuming that q=fisher&type=authorityName is a BLR-website-specific convention that has nothing to do with the Zenodeo query, other than that the former is translated into the latter.

Of course, there is an option to search "All" on the website which, in turn, does a Zenodeo search using the q=fisher parameter, in other words, a full-text search (in Zenodeo-world, q is always a full-text search). This results in the browser bar showing http://blr.uplaysandbox.website/?facets=true&q=fisher&resource=treatments&stats=true&type=all which translates into http://z2.punkish.org/v2/treatments?facets=true&stats=true&q=fisher and returns 1937 records.
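The translation from the website's address-bar URL to the Zenodeo query can be sketched roughly as follows. This is only an illustration of the mapping described above, not the BLR website's actual code: the function name `website_to_zenodeo` is made up, and the base URL is simply taken from the example queries in this thread.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Assumed base URL, copied from the example queries in this thread.
ZENODEO_BASE = "http://z2.punkish.org/v2"


def website_to_zenodeo(website_url):
    """Translate a BLR website URL into the equivalent Zenodeo query URL.

    Hypothetical helper: it only illustrates the q/type convention
    discussed above.
    """
    params = dict(parse_qsl(urlsplit(website_url).query))
    resource = params.pop("resource", "treatments")
    term = params.pop("q", "")
    search_type = params.pop("type", "all")

    # type=all maps to Zenodeo's full-text parameter q; any other type is
    # used as the name of the field to search (e.g. authorityName).
    key = "q" if search_type == "all" else search_type

    # Pass the remaining flags (facets, stats, ...) through unchanged.
    query = dict(sorted(params.items()))
    query[key] = term
    return f"{ZENODEO_BASE}/{resource}?{urlencode(query)}"
```

Feeding it the two website URLs quoted above should give back the two Zenodeo URLs quoted above, one with authorityName=fisher and one with q=fisher.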

To @myrmoteras: the queries are performing as they should. They confuse you, the user, because the terms author and authorityName are similar but not the same. Nowhere in the database is there a field called author. We have authorityName and we have treatmentAuthor. If what you mean by author is actually the treatmentAuthor, then that specific search should be modified accordingly. That is, on choosing author in the website, a search like http://z2.punkish.org/v2/treatmentauthors?treatmentAuthor=fisher should be performed, which would return 785 records (paste that URL in your browser and take a look at the result).

On top of that, I would have to create a new SQL query that joins the treatments table with the treatmentAuthors table and then performs the search for treatments whose treatmentAuthors meet a given constraint. For example, when I perform such a query in the database, I do get back 785 records.

SELECT 
    treatments.treatmentId, 
    treatmentTitle,
    treatmentAuthorId,
    treatmentAuthor
FROM
    treatments 
    JOIN treatmentAuthors ON treatments.treatmentId = treatmentAuthors.treatmentId
WHERE 
    treatmentAuthor LIKE 'fisher%'
LIMIT 30
OFFSET 0;
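For anyone who wants to try the join locally, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and the sample rows are invented for illustration; only the table and column names come from the query above. The search term is passed as a bound parameter, and an ORDER BY is added (it is not in the query above) so that LIMIT/OFFSET paging is deterministic.

```python
import sqlite3


def treatments_by_author(conn, prefix, limit=30, offset=0):
    """Return treatments that have an author whose name starts with `prefix`."""
    sql = """
        SELECT
            treatments.treatmentId,
            treatmentTitle,
            treatmentAuthorId,
            treatmentAuthor
        FROM
            treatments
            JOIN treatmentAuthors ON treatments.treatmentId = treatmentAuthors.treatmentId
        WHERE
            treatmentAuthor LIKE ?
        ORDER BY treatments.treatmentId, treatmentAuthorId
        LIMIT ? OFFSET ?
    """
    # Appending % makes this a prefix match, as in LIKE 'fisher%'.
    return conn.execute(sql, (prefix + "%", limit, offset)).fetchall()


# Toy schema and data, made up for this example only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE treatments (treatmentId TEXT PRIMARY KEY, treatmentTitle TEXT);
    CREATE TABLE treatmentAuthors (
        treatmentAuthorId TEXT PRIMARY KEY,
        treatmentId TEXT,
        treatmentAuthor TEXT
    );
    INSERT INTO treatments VALUES ('T1', 'Title one'), ('T2', 'Title two');
    INSERT INTO treatmentAuthors VALUES
        ('A1', 'T1', 'Fisher, B. L.'),
        ('A2', 'T1', 'Smith, J.'),
        ('A3', 'T2', 'Jones, K.');
""")

rows = treatments_by_author(conn, "fisher")
```

Note that SQLite's LIKE is case-insensitive for ASCII characters by default, which is why the lowercase search term matches 'Fisher, B. L.' here.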

Implementing this, testing it, and putting it into production before May 14 is considerably more complicated, given that I have other tasks on my hands, especially updating the database. For now, I will let you decide how you want to proceed.

cc-ing @tcatapano for suggestions

punkish commented 4 years ago

a comment – this is just a personal view, but I believe, it is an informed one. So, here goes…

part of the problem stems from the divergence in expectations and understanding of what the tech can do. As a domain scientist, you @myrmoteras, want to be able to do all permutations and combinations of granular searches. However, with each combo, the website and the backend get increasingly more complex. The change is not linear, it is geometric. If you really want to be able to do all kinds of searches, perhaps a website is not for you. Perhaps you should download the entire database and do searches in SQL, to your heart's content. Or write custom programs.

Of course, one can make a website and a backend that does everything you want, but it will be very difficult and expensive and will take a very long time. Basically, it will be trying to functionally replicate a database.

One of the reasons Ocellus works so well (if I may say so) is because it does very little, but what it does, it does very well. Is there room for improvement there? Of course, tons. But is it ever going to do everything someone wants? Certainly not.

We have to (you have to) decide where to draw the boundaries and how to set expectations of users. Without that, we will never be close to releasing even v1 of the website.

punkish commented 4 years ago

OK, a very late-night update. Thanks to the new data dictionary, which describes the configuration and drives query creation, I was able to create a new queryable parameter called "author" for treatments. So now one will be able to do /treatments?author=fisher and get results like those in the screenshot below.

image

Keep in mind, this is not the same as /treatments?authorityName=fisher. It is what one would get if one were to take all the results of /treatmentauthors?treatmentAuthor=fisher, and then find all the treatments for each one of those treatmentAuthors.
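To make the difference concrete, here is a rough sketch of how a data dictionary could drive the two queries. It is not Zenodeo's actual configuration or code: the dictionary layout, the name `TREATMENT_PARAMS`, and the function `build_query` are all hypothetical, and it assumes authorityName is a column of the treatments table.

```python
# Hypothetical data dictionary: each queryable parameter of /treatments
# names the joins it needs and the column it constrains.
TREATMENT_PARAMS = {
    "authorityName": {
        "joins": [],
        "column": "treatments.authorityName",
    },
    "author": {
        "joins": [
            "JOIN treatmentAuthors "
            "ON treatments.treatmentId = treatmentAuthors.treatmentId"
        ],
        "column": "treatmentAuthors.treatmentAuthor",
    },
}


def build_query(param, value):
    """Build a parameterized prefix-search statement for one parameter.

    Returns (sql, args). Raises KeyError for parameters that are not
    listed in the dictionary, i.e. not queryable.
    """
    spec = TREATMENT_PARAMS[param]
    parts = [
        # DISTINCT keeps a treatment from appearing once per matching author.
        "SELECT DISTINCT treatments.treatmentId, treatments.treatmentTitle",
        "FROM treatments",
    ]
    parts += spec["joins"]
    parts.append(f"WHERE {spec['column']} LIKE ?")
    return "\n".join(parts), (value + "%",)
```

With this layout, build_query("author", "fisher") produces a statement that joins treatmentAuthors, while build_query("authorityName", "fisher") stays within the treatments table, which is why the two searches return different sets of treatments.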

Well, this is the good news. Now the bad news.

One, this is very quickly done, and on my laptop. I have to test it well here, then I have to merge this dev branch into the master and create a new test RC, push to z2, so you all can test it. Then, @teodorgeorgiev has to modify the BLR website so that when a user chooses the "author" radio box, the search is done for treatments against author, not authorityName. Then you all have to test that out. If we hustle, we should be able to do all this by the end of the week, giving me time to make sure this goes into production before May 14.

Two, this means my work on updating the data will very likely get delayed. I will try to do that as well, but I just don't want to promise both these tasks before May 14.

Three, it is now way past my bedtime

cc @myrmoteras @tcatapano @teodorgeorgiev

myrmoteras commented 4 years ago

@punkish I am not the dictator who says it has to be done now. I just observe that something is not delivering what I expect, because you asked me to do some testing.

These findings then need to be discussed among us to decide what the best solution is and whether it is worth the effort.

Right now, the important thing is that we get the BLR website up, on which I agree with @tcatapano, AND that we get the report done, which clearly will also take some time.

punkish commented 4 years ago

I just observe that something is not delivering what I expect, because you asked me to do some testing.

You are more than just an observer, and as I mentioned before, your testing is extremely good. It really does bring out the problems and issues in an application from a real user’s point of view. That is why I always take so much care to reply in detail, and then try and resolve the reported issue.

By the way, who is this Brian Fisher who has authored so many treatments?

teodorgeorgiev commented 4 years ago

@teodorgeorgiev has to modify the BLR website so that when a user chooses the "author" radio box, the search is done for treatments against author, not authorityName.

Sure, but I do not see such a possibility in the API documentation (i.e. /treatments?author=XXXX). @punkish, what query parameter am I supposed to use instead of authorityName?

punkish commented 4 years ago

@teodorgeorgiev has to modify the BLR website so that when a user chooses the "author" radio box, the search is done for treatments against author, not authorityName. ... sure ... but I do not see such possibility in the API documentation (i.e. /treatments?author=XXXX) @punkish what is the query parameter that I am supposed to use instead of authorityName ?

that is because it hasn't been implemented yet. You overlooked this part in my post above :)

One, this is very quickly done, and on my laptop. I have to test it well here, then I have to merge this dev branch into the master and create a new test RC, push to z2, so you all can test it. Then, @teodorgeorgiev has to modify the BLR website so that when a user chooses the "author" radio box, the search is done for treatments against author, not authorityName. Then you all have to test that out. If we hustle, we should be able to do all this by the end of the week, giving me time to make sure this goes into production before May 14.

It is somewhat working on my laptop, but I still have to troubleshoot a few things, and test for performance. Then I have to merge that dev branch into the master, create an RC, and push it to the test server. At that point you will modify your website code to point to treatments?author=xxx (and it will also show up in the documentation). If I am lucky, I could have this done in the next couple of days.