Open hsenag opened 11 years ago
I can't find another issue for this either, although I think we have discussed it. I think the problem is that the authority's and user's request lists are generated via the search index, which is only updated every 5 minutes. In general everything that is dependent on search will be subject to these delays. These lists could be generated straight from the database, which would avoid this problem (in fact there are comments in the code in both places pointing this out).
For what it's worth (and I know this has been said before :smiley:), this would be a good thing to do in any case, as it would move Alaveteli towards being a more conventional Rails application that relies more on the database and less on xapian for generating pages.
I think that in turn would help us in the long term in making use of the built-in support for caching.
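To make the difference concrete, here is a minimal sketch in plain Ruby of a list served from a periodically refreshed index versus one read straight from the store. Everything in it (`Request`, `RequestStore`, `PeriodicIndex`) is a hypothetical stand-in for illustration, not Alaveteli's actual models or its xapian integration.

```ruby
# Hypothetical in-memory stand-ins; not Alaveteli's real classes.
Request = Struct.new(:id, :body_id, :title, :created_at)

# Plays the role of the database: always reflects the latest writes.
class RequestStore
  attr_reader :rows

  def initialize
    @rows = []
  end

  def add(body_id, title, now)
    request = Request.new(@rows.size + 1, body_id, title, now)
    @rows << request
    request
  end

  # Newest first, read directly from the current rows.
  def for_body(body_id)
    @rows.select { |r| r.body_id == body_id }.sort_by { |r| -r.created_at }
  end
end

# Plays the role of a search index that is only rebuilt on a schedule.
class PeriodicIndex
  def initialize(store)
    @store = store
    @snapshot = []
  end

  # Stands in for the scheduled job (every 5 minutes in the description above).
  def refresh!
    @snapshot = @store.rows.dup
  end

  # Only sees requests that existed at the last refresh.
  def for_body(body_id)
    @snapshot.select { |r| r.body_id == body_id }
  end
end
```

In this sketch a request added after the last `refresh!` is returned by `RequestStore#for_body` immediately, but stays missing from `PeriodicIndex#for_body` until the next refresh runs.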
Just as a small devil's advocate - what is a list of requests to a body but a search with a query parameter of that body? Having the same search index generating both does seem sensible, because then it's much easier to go from there to e.g. filter to "requests to this body containing the word cat" or whatever, it's just adding a term to the search query. OTOH, as you say, having the initial lists from the db would mean you could run the site without search more easily. If the real issue is only updating the search index every 5 minutes, could that be switched to a more real-time queuing mechanism of some sort? (In Haystack, for example, I'd look at using one of the queued apps suggested on http://django-haystack.readthedocs.org/en/latest/other_apps.html , sorry I don't know if you have similar options here.)
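As a rough illustration of the queued alternative, the sketch below indexes each item as soon as it is enqueued instead of waiting for a scheduled run. It uses only Ruby's built-in `Queue`, `Thread` and `Mutex`; the `QueuedIndexer` class and its in-memory hash "index" are assumptions made for illustration, not an existing Alaveteli or xapian API.

```ruby
require "thread"

# Hypothetical sketch: a background worker that updates an in-memory
# "index" as jobs arrive, rather than on a fixed schedule.
class QueuedIndexer
  def initialize
    @queue = Queue.new
    @index = {}
    @lock = Mutex.new
    @worker = Thread.new { run }
  end

  # Called whenever a request is created or changed.
  def enqueue(id, text)
    @queue << [id, text]
  end

  # Ask the worker to stop once it has drained the jobs queued so far.
  def shutdown
    @queue << :stop
    @worker.join
  end

  # Returns the ids of indexed items whose text contains the term.
  def search(term)
    @lock.synchronize do
      @index.select { |_id, text| text.include?(term) }.keys
    end
  end

  private

  def run
    loop do
      job = @queue.pop # blocks until a job is available
      break if job == :stop

      id, text = job
      @lock.synchronize { @index[id] = text }
    end
  end
end
```

The delay then depends on how quickly the worker drains the queue rather than on the interval between scheduled runs.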
I agree, conceptually, it could come from either. There are a couple of practical problems with it coming from xapian - one is the delay in indexing, another is that paginated sets of results with a high offset seem very slow.
A user reports seeing a delay of (at least) several hours before a new annotation on an existing request appeared on the authority summary page - I believe it's supposed to be the most recent content that appears in the column on the right for each request. The user was able to see the annotation by doing a search on the authority for specific dates, which again points to some kind of caching issue.
I had a quick go at reproducing and my annotation appeared within 5-10 minutes as would be expected given the above description of caching.
Could there be another layer of issue with the HTTP headers being returned - e.g. promising too long an expiry for pages or similar? It just occurs to me that I instinctively Ctrl-Refresh when testing caching issues to try to rule that out, but of course that's not the normal user behaviour.
It's possible that the user had a version of the authority page cached from before they logged in (the anonymous browsing version of the page would have a long expiry header set on it - 24 hrs). Once they're logged in, the page cache header should be set to 'private', so it should not be cached.
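The scenario described above can be sketched with a toy browser cache. This is a simplified model under stated assumptions (a single `max_age` in seconds per response, no revalidation), and `BrowserCache` is a hypothetical class, not how any particular browser or Alaveteli itself behaves.

```ruby
# Hypothetical, simplified model of a browser cache keyed by URL.
class BrowserCache
  def initialize
    @entries = {}
  end

  # The block stands in for the server and returns [body, max_age_seconds].
  # A cached copy is reused until it expires, without asking the server.
  def fetch(url, now:)
    entry = @entries[url]
    return entry[:body] if entry && now < entry[:expires_at]

    body, max_age = yield
    @entries[url] = { body: body, expires_at: now + max_age } if max_age > 0
    body
  end
end

ONE_DAY = 24 * 60 * 60 # the 24 hr expiry mentioned for anonymous pages

cache = BrowserCache.new

# Anonymous visit: the page is stored with a 24 hour lifetime.
anonymous_view = cache.fetch("/body/example", now: 0) { ["anonymous page", ONE_DAY] }

# One hour later, after logging in: the stored copy is still fresh, so the
# server (which would now send an uncacheable page) is never consulted.
after_login_view = cache.fetch("/body/example", now: 3600) { ["logged-in page", 0] }
```

In this model `after_login_view` is still the anonymous page, which matches the suspicion that a copy cached before login could hide newer content until it expires or the user forces a refresh.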
At the time of writing, it appears that only requests made more than about six hours ago are appearing on user and authority pages on WhatDoTheyKnow.
Comment prompted by a user writing to us to note their request hadn't appeared.
Thanks for raising @RichardTaylor - have traced this back to a deployment problem - the cron jobs for the site weren't running - should be fixed now.
Comment from a WhatDoTheyKnow user:
I have submitted a request ... and it hasn't appeared on the site. Is there usually a delay, or should I send it again
By the time we came to deal with their email, the request had appeared as expected. If a new request doesn't appear on user and body pages instantly, it can be disconcerting for users.
Adding a +1 as a user contacted the WDTK admin team shortly after making their request because they could not find it on the authority/user page.
(I recall this from a few days ago, I can't find the thread at the moment).
+1
A WhatDoTheyKnow user writes:
Tried to check on my FOI requests ... but couldn't see how to access them?
This was sent to us shortly after they had made requests so I suspect it was a caching issue.
Just to note this is still a problem: newly made requests are not appearing instantly on user and body pages. If you've just made a request and then look at your user page or the relevant public body page and don't see your request listed, you might quite reasonably question whether something has gone wrong.
I suspect this issue could be contributing to the duplicate requests that we see. It wouldn't be unreasonable for a user to see their request missing from the user/body list, assume something has gone wrong, and send it again. Ticket on catching duplicate requests: https://github.com/mysociety/alaveteli/issues/4160
I'm currently looking at a user page from the perspective of a non-logged in user and a request made 2 hours ago is yet to appear.
I'm looking at a body page https://www.whatdotheyknow.com/admin/bodies/92583 from the perspective of a non logged in user and requests I made requester_only 19 hours ago are yet to disappear.
On WhatDoTheyKnow, when I submit a new request to a public body, it only appears on the authority's request list and the user's request list after a few minutes.
I guess this is some kind of caching issue but it's disconcerting as it seems the request may have been lost.
I'm sure I remember this kind of thing being discussed before, so apologies if this is a duplicate - I searched but couldn't find anything.