ncbo / bioportal-project

Serves to consolidate (in Zenhub) all public issues in BioPortal
BSD 2-Clause "Simplified" License

cacheing and other issues on new submissions #193

Open graybeal opened 3 years ago

graybeal commented 3 years ago

(This may be a duplicate; I wanted to capture current impressions.)

When a new submission is processed, there are a number of BioPortal behaviors that strongly impact the user experience.

One set of behaviors I think of as "cacheing delays and bugs in BioPortal". I have found that the fact that the new ontology has been successfully processed is not known to the UI, and sometimes not even to the API. This results in "ontology unknown" messages in the API, and "there is a problem with this ontology" in the classes UI tab (and no change at all in the ontology summary UI page). Sometimes the properties tab is updated; often it is not, or it also shows an error. If the user immediately hits reload on getting "there is a problem", the system often displays a 500 error (from that point forward).

These issues seem to be resolved by clearing the 4 caches using the Administrator page. Typically immediately after that the new content is visible in the user's UI; sometimes that cache also has to be refreshed before the new content appears.

Unfortunately the "there is a problem with this ontology" message is also what the system produces when there has been a parsing error. Often there is no indication that a parsing error has occurred, other than the summary page showing nothing but 'Uploaded' in the status of the submission, and there is no way to tell whether the problem is a simple cacheing delay or a permanent issue.

The other odd result is that the visibility of the changed data is all over the map, which may or may not be entirely caused by the cacheing issues. The data API often shows the new data first, often within a few minutes of parsing completion. The properties are often next, classes after that, and metrics and the parsing string entries on the summary page often much later.

Ideally the first thing to fix would be cacheing, such that caches (at least those related to the parsed ontology) are cleared as soon as the parsing process finishes, and ideally after each step of the parsing process finishes, so that the parsing results are immediately visible.
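The per-step clearing described above could look roughly like the sketch below. It is an illustration under stated assumptions, not BioPortal's actual code: `OntologyCache`, `process_submission`, and the step names in `PARSING_STEPS` are all hypothetical, and the real caches (UI, HTTP, and so on) would each need their own invalidation call.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical step names; the real parsing pipeline may use different ones.
PARSING_STEPS = ["rdf", "labels", "index", "metrics"]


class OntologyCache:
    """Toy in-memory cache whose entries are grouped by ontology acronym."""

    def __init__(self) -> None:
        self._entries: Dict[str, Dict[str, object]] = defaultdict(dict)

    def put(self, acronym: str, key: str, value: object) -> None:
        self._entries[acronym][key] = value

    def get(self, acronym: str, key: str):
        # .get() avoids creating an empty bucket as a side effect.
        return self._entries.get(acronym, {}).get(key)

    def invalidate(self, acronym: str) -> int:
        """Drop every entry for one ontology; return how many were removed."""
        return len(self._entries.pop(acronym, {}))


def process_submission(
    acronym: str,
    steps: List[Tuple[str, Callable[[str], None]]],
    caches: List[OntologyCache],
) -> List[str]:
    """Run each parsing step, clearing this ontology's cache entries after each one."""
    completed = []
    for name, step in steps:
        step(acronym)
        for cache in caches:
            cache.invalidate(acronym)  # only entries for this ontology are dropped
        completed.append(name)
    return completed
```

Scoping the invalidation to one acronym means a new submission does not flush cached views of unrelated ontologies.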

The second improvement would be for the parsing log to be visible to the submitter. This would save a lot of pain for them and back-and-forth help by us.

The third would be for any abnormal state in the classes or properties tab to report its cause, instead of "there's a problem". These displays should look at the parsing flags and report whether parsing is In Progress (show what steps succeeded) or Errored (show what steps succeeded), and provide the time of the last entry in the parsing log. A tab that shows the latest log would be welcome.
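As a rough sketch of that kind of reporting, the function below turns a set of parsing flags into a status line. Everything here is an assumption for illustration: the `ParsingState` structure, the `describe` function, and the step names in `STEPS` are hypothetical and do not reflect how BioPortal actually stores submission status.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

# Hypothetical step names, in processing order.
STEPS = ["UPLOADED", "RDF", "INDEXED", "METRICS"]


@dataclass
class ParsingState:
    succeeded: List[str]                       # steps that finished without error
    failed: Optional[str] = None               # step that errored, if any
    last_log_entry: Optional[datetime] = None  # time of the last parsing log entry


def describe(state: ParsingState) -> str:
    """Build a message that names the cause instead of a generic 'there is a problem'."""
    done = ", ".join(state.succeeded) or "none"
    when = state.last_log_entry.isoformat() if state.last_log_entry else "unknown"
    if state.failed is not None:
        status = f"Errored at step {state.failed}"
    elif all(step in state.succeeded for step in STEPS):
        status = "Complete"
    else:
        status = "In Progress"
    return f"{status}; steps succeeded: {done}; last log entry: {when}"
```

With a message like this, a submitter could tell a parsing failure from a submission that is simply still being processed.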

Finally, if it isn't fixed by the above changes, figure out how to make the summary page content update as soon as that content is available—e.g., show the metrics as soon as the metrics task is completed and its content available, and ditto for the submissions entry.

(Also make the log show the source of the ontology that is being parsed.)

graybeal commented 2 years ago

See also https://github.com/ncbo/bioportal-project/issues/62 and https://github.com/ncbo/ontologies_api_ruby_client/issues/2, and possibly https://github.com/ncbo/ontologies_api_ruby_client/pull/10

graybeal commented 2 years ago

A weird variation today with the HUSAP ontology was that using the internal 'Flush UI cache' would cause the new submission of the ontology to appear in the list when the user was logged out, but when the user logged in, it would be gone from the list. (The user had two tabs open, and I think they were interacting somehow.) Eventually we reached a state where, after Flush UI Cache, the logged-in user still couldn't see the new submission, even after a CMD-OPT-I forced reload of the page and flushing and syncing of HTTP caches! Meanwhile I could see the new submission just fine whether logged in or logged out. I wasn't sure where the logged-in session was getting its "submission-less" view after the caches were cleared; maybe from the GOO cache? (I didn't clear it.)

alexskr commented 10 months ago

The UI doesn't show recently added submissions for logged-in users. If that user logs out, the UI displays up-to-date information; if that user logs right back in, the UI starts displaying out-of-date information. The API seems to respond with correct data.

Clearing the HTTP cache resolves this inconsistency, so the HTTP cache seems to be the culprit.
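One mechanism that would produce this symptom, offered only as an illustration and not as a confirmed diagnosis, is a cache that keys responses by both the request path and the user's API key. The `KeyedHttpCache` class below is a hypothetical toy, not the actual BioPortal HTTP cache.

```python
from typing import Callable, Dict, List, Optional, Tuple


class KeyedHttpCache:
    """Toy response cache keyed by (path, api_key); None stands for an anonymous user."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, Optional[str]], List[int]] = {}

    def fetch(self, path: str, api_key: Optional[str], origin: Callable[[], List[int]]) -> List[int]:
        key = (path, api_key)
        if key not in self._store:
            self._store[key] = origin()  # cache miss: ask the backend
        return self._store[key]

    def purge(self, path: str, api_key: Optional[str]) -> None:
        self._store.pop((path, api_key), None)


submissions = [1]                      # submission ids known to the backend
cache = KeyedHttpCache()
path = "/ontologies/HUSAP/submissions"

def origin() -> List[int]:
    return list(submissions)           # snapshot of the backend state

cache.fetch(path, None, origin)        # anonymous view is cached
cache.fetch(path, "user-key", origin)  # logged-in view is cached separately

submissions.append(2)                  # a new submission is processed
cache.purge(path, None)                # only the anonymous entry is cleared

anonymous_view = cache.fetch(path, None, origin)
logged_in_view = cache.fetch(path, "user-key", origin)
```

In this toy setup the anonymous view includes the new submission while the logged-in view stays stale until its own entry is purged, which matches the behavior reported here. Whether the real cache works this way is an open question.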

syphax-bouazzouni commented 10 months ago

UI doesn't show recently added submissions for logged in users. If that user logs out then UI displays up-to-date information; if that user logs right back in then UI starts displaying out-of-date information. API seems to respond with correct data

Clearing the HTTP cache resolves this inconsistency, so the HTTP cache seems to be the culprit.

It's funny because we discussed this exact issue with @jonquet some days ago. It is a relatively high priority in our case. Does your team have any indication of how to resolve this?

And is there any documentation explaining the difference between these cache types? [image]

alexskr commented 10 months ago

This ongoing issue is becoming more troublesome and requires closer attention.

It would be beneficial to create documentation for cache types in the OntoPortal administrator's guide, as we currently lack consolidated documentation.