Closed davidjennings closed 12 years ago
The problem with (1) is that the speed issue is related to the host we are collecting the data from, and it can take a long time. That means doing what you suggest about "Immediate" simply defers the problem to how we write an HTML interface to represent a long-running transaction which might never terminate. So yes, "Add to queue" is perhaps more accurate, but it isn't actually accurate, in that the real problem here is that the system-to-system interface is potentially long-running and not particularly compatible with an HTML user session (given, for example, the OU course catalog).
I'll look at 2. Most of the issues around here seem to come from a perspective that harvesting a feed is an atomic transaction that can be completed almost instantly. That isn't the case. The most common case is that the harvest will take 1-2 mins, and the edge cases can run to hours.
Because the checking is a background task, we don't really know when the next harvest will be, so it's hard to say "Will be harvested in HH:MM:SS". This is because we have to run the feeds in series, and each feed takes an indeterminate time to run. We can say "The next harvest will be in 5 mins", but if a particular job is 5th in the queue, and jobs 1, 2, 3 and 4 take 5 hours in total, job 5 will run in 5 hours and 5 mins.
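To make the arithmetic concrete, here is a minimal sketch of how start times stack up when feeds run in series. The function name and the individual job durations are made up for illustration (the only figures taken from the discussion are the 5 minute wait and the 5 hour total for jobs 1-4), and in practice the durations are not known in advance, which is exactly the difficulty.

```python
from datetime import timedelta


def serial_start_offsets(first_start, durations):
    """Return how long from now each queued job would start if jobs
    run strictly one after another (hypothetical helper)."""
    offsets = []
    elapsed = first_start
    for duration in durations:
        offsets.append(elapsed)   # this job starts once all earlier jobs finish
        elapsed += duration       # the next job has to wait for this one too
    return offsets


# Assumed durations: jobs 1-4 add up to 5 hours, job 5 is a typical 2 minute harvest.
durations = [
    timedelta(hours=2),
    timedelta(hours=1),
    timedelta(hours=1, minutes=30),
    timedelta(minutes=30),
    timedelta(minutes=2),
]

offsets = serial_start_offsets(timedelta(minutes=5), durations)
print(offsets[0])  # 0:05:00 -> "the next harvest will be in 5 mins" holds for job 1
print(offsets[4])  # 5:05:00 -> job 5 waits 5 hours and 5 mins
```

The estimate for any job is only as good as the duration guesses for every job ahead of it in the queue.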
There are 2 potential solutions here: (1) change the hardware and networking requirements so we can run all jobs in parallel. This will let us tell users when jobs will start, although the end time is still entirely dependent upon the system we are harvesting from. (2) Figure out the right words to explain this to users.
Maybe there's a middle ground, but that's the problem as I see it from this end.
This is all helpful explanation. The main point seems to be that if "users will want to dip into this part of the system, check their data and move on", as I suggested, the hard truth is that they won't be able to for sound technical reasons. Fair enough.
Bearing in mind that uncertainties remain in solution (1), I suspect solution (2) may be the only really practical one.
If there are uncertainties such as the variation between 1-2 mins and 5+ hours, then the dialogue with the users has to prepare them for this, rather than just leaving them to find out the hard way. We need to be careful about expectations set by statements like "it should be picked up within the next 5 minutes"; as the old adage goes, it's better to under-promise and over-deliver. So, in this case, a warning message along the lines of "This step depends on systems and data that are not part of this service, so it can take several hours (though in most cases it is completed in a matter of minutes)".
Users need to be given the feedback and info to determine things like (a) whether there are errors that have prevented harvesting, (b) how and where they should check for and diagnose errors, and (c) what feedback is available about the kinds of errors.
The best way to communicate this is probably not via one long manual-style explanation, but through providing relevant prompts and guidance at each stage of the process...
Could use some specific suggestions here. Not sure how to improve things.
I think we're agreed that the only way forward in the short term is to improve the feedback to users. I suggest that the message displayed to users after the 'collect' button has been pressed AND the message displayed when the feed interval is changed should read:
"Feed x has been added to the queue for collection. This step depends on systems and data that are not part of the aggregation service, so it can take several hours (though in most cases it is completed in a matter of minutes)."
Yes, I'd agree with that, with first a minor alteration: "Feed [or data feed if that is the term being used elsewhere] x has been added to the queue for collection. This step depends on systems and data that are not part of the Aggregator [for consistency and clarity - some users might not know what "aggregation service" is, but it says Aggregator in big letters at the top of the page], so it can take several hours (though in most cases it is completed in a matter of minutes)."
But, second, could we add a link to a pop-up FAQ or similar, e.g.:
How frequently are data feeds collected? We run collection requests in the order we receive them. When several users are making requests at once, this may mean that you have to wait until other users' collection requests have been completed. However, in many cases your request may be initiated almost immediately.
How long does collection take? Because this depends on factors that we can't anticipate in advance of your request (such as the amount of data and data transfer speeds) we cannot give a firm estimate of this. Exceptionally it may take hours, but more often it will be a matter of a few minutes.
How will I know when an attempt has been made to collect my data feed? The time (and date) of the most recent collection of your data feed will be shown on the home page under "Last Check".
How will I know when collection of my data feed has been successful? [I'm not sure what the correct answer to this is]
Any thoughts about changing the "collect" button to "request collection" or "add to queue for collection"? As with all these points, in the absence of being able to observe or talk to real users trying out the aggregator, I am not wedded to a particular solution. I'm just trying to anticipate areas where they might get confused or stuck, and this seemed like one such area to me - my suggestions are attempts to make the workings more transparent and thereby give users more agency in understanding what's going on and what they can do...
At the moment there seems to be a high chance of this dialogue leaving users stuck and without the necessary information to enable them to get themselves unstuck. I recognise that the above account may contain some misunderstandings of what the actual process is -- but if that's the case, I think the misunderstandings are a symptom of how confusing the dialogue is.