mysociety / alaveteli

Provide a Freedom of Information request system for your jurisdiction
https://alaveteli.org

Message to users considering making very long follow ups #3387

Open RichardTaylor opened 8 years ago

RichardTaylor commented 8 years ago

On WhatDoTheyKnow, the problems which take up administrator time are often not with request/response content but with extraneous material users include in correspondence.

To encourage users to focus requests on simply describing the information sought, how about a message to those whose request is over x words, urging them to keep it focused?

A stronger action would be a cap on the length of requests (and perhaps follow up messages).

Focusing requests might help with the overall reputation of WhatDoTheyKnow and other Alaveteli sites.

If in the future there were an option to moderate requests, excessive length could be a factor for putting a request into a moderation queue.

RichardTaylor commented 6 years ago

In May 2018 a member of the WhatDoTheyKnow team attended the Scottish Information Commissioner's conference.

They fed back:

the most problematic requests from a FOI officer's perspective are long, rambling ones, often on irrelevant matters, with some poorly-formulated request in the middle somewhere. They say that when they get a request in bullet points, they are really happy, cos it helps concentrate the request into bits of information they actually want.

One made an interesting suggestion: that if the request text goes over a certain length, Alaveteli could put up a confirmation page pointing out that requests should be as specific and concise as possible, and only include material required in order for the public body to identify and produce the information requested, i.e. our "focussed" help text. It should allow the request to be sent, however, because some requests do need to be that length.
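A minimal Ruby sketch of the confirmation step suggested above. The method name `next_step`, the `threshold` parameter and its default value are hypothetical and not part of Alaveteli; the thread does not fix a length. The only point illustrated is that a long request triggers one extra confirmation and is never blocked.

```ruby
# Decide what to do with a draft request body.
#
# Returns :confirm when the user should first be shown the "focussed" help
# text, or :send when the request can go straight out. Once the user has
# confirmed, the request is always sent, however long it is.
#
# threshold is a hypothetical placeholder, measured in characters.
def next_step(body, confirmed: false, threshold: 2_000)
  return :send if confirmed

  body.to_s.length > threshold ? :confirm : :send
end

short_request = 'Please send me the minutes of the March board meeting.'
long_request  = 'x' * 2_500

next_step(short_request)                 # => :send
next_step(long_request)                  # => :confirm
next_step(long_request, confirmed: true) # => :send
```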

RichardTaylor commented 5 years ago

Another option would be to require admin approval before a user can make a request of over x words.

This might increase admin workload a little in authorisations; but might help prevent misuse of the service.

garethrees commented 5 years ago

The first step is identifying long requests, so we may as well do the initial idea first. Requiring admin approval requires #75.

garethrees commented 5 years ago

I think the first thing we should do here is write a script to get a baseline on outgoing correspondence length. Probably average, median, 95th and 99th percentile, like New Relic.

Then, an easy first step would be to have some JavaScript that monitors the length of correspondence and adds a warning at "long" and "very long" intervals:

[Two screenshots, dated 2018-10-31 (images not preserved)]

I haven't thought about the exact messages yet, but we can figure that out when we come to implement.
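The proposal is for client-side JavaScript; the sketch below only illustrates the "long" / "very long" threshold logic, written in Ruby to match the other code in this thread. The method name `length_warning_level` and both default thresholds are hypothetical placeholders, not values agreed here.

```ruby
# Classify a message body by its length in characters.
#
# Returns :ok, :long or :very_long. The default thresholds are hypothetical;
# real values would come from the baseline statistics.
def length_warning_level(body, long: 2_000, very_long: 5_000)
  length = body.to_s.length

  if length >= very_long
    :very_long
  elsif length >= long
    :long
  else
    :ok
  end
end

length_warning_level('a' * 500)   # => :ok
length_warning_level('a' * 2_000) # => :long
length_warning_level('a' * 6_000) # => :very_long
```

Keeping the thresholds as parameters makes it easy to revisit them once baseline figures are available.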

After we've added these we should revisit in 6 months to check against the baseline data.

I don't think we should be worried about a non-JS version of this, as it's not vital to functionality.

I don't think we should prevent long requests altogether, as there could well be legitimate reasons for being detailed.

I don't think we should seek approval, as we don't really have the ability to hold requests as pending (though we could store them as drafts) and it would increase admin workload.

It hasn't been mentioned here, but I don't think we should worry yet about using a reputation score that allows greater length, though it could be a next step.

garethrees commented 5 years ago

Stats:

# https://github.com/bkoski/array_stats
Float.class_eval do
  # Returns true if a float has a fractional part; i.e. <tt>f != f.to_i</tt>
  def fractional_part?
    fractional_part != 0.0
  end

  # Returns the fractional part of a float. For example, <tt>(6.67).fractional_part == 0.67</tt>
  def fractional_part
    (self - self.truncate).abs
  end
end

Array.class_eval do
  # Returns the sum of all elements in the array; 0 if array is empty
  def total_sum
    self.inject(0) { |sum, sample| sum + sample }
  end

  # Returns the mean of all elements in array (integer division when all
  # elements are integers); nil if array is empty
  def mean
    if self.length == 0
      nil
    else
      self.total_sum / self.length
    end
  end

  # Returns the median for the array; nil if array is empty
  def median
    percentile(50)
  end

  # https://github.com/bkoski/array_stats/blob/6cc1ba4a6cd2903d6c632589713c73db7cd7cd8b/lib/array_stats/array_stats.rb
  def percentile(p)
    sorted_array = self.sort
    rank = (p.to_f / 100) * (self.length + 1)

    return nil if self.length == 0

    if rank.truncate > 0 && rank.truncate < self.length
      sample_0 = sorted_array[rank.truncate - 1]
      sample_1 = sorted_array[rank.truncate]

      (rank.fractional_part * (sample_1 - sample_0)) + sample_0
    elsif rank.truncate == 0
      sorted_array.first.to_f
    elsif rank.truncate == self.length
      sorted_array.last.to_f
    end
  end
end

correspondence_counts = OutgoingMessage.pluck(:body).map(&:length).sort

correspondence_counts.mean
# => 800

correspondence_counts.median
# => 506.0

correspondence_counts.percentile(95)
# => 2257.0

correspondence_counts.percentile(99)
# => 4856.0

garethrees commented 5 years ago

Reopening as we only dealt with new requests in https://github.com/mysociety/alaveteli/pull/4987.

We'll let this sit for a few months and review whether it's had an effect. If it has, then we can apply it to followups too.

garethrees commented 2 years ago

Running the stats again today (same time period - 1 year, with a bit of minor overlap for ease):

correspondence_counts =
  OutgoingMessage.where(created_at: Time.parse('2018-01-01')..Time.parse('2019-01-01')).pluck(:body).map(&:length).sort

correspondence_counts.mean
# => 894
correspondence_counts.median
# => 581.0
correspondence_counts.percentile(95)
# => 2414.0
correspondence_counts.percentile(99)
# => 4857.0

correspondence_counts =
  OutgoingMessage.where(created_at: Time.parse('2019-01-01')..Time.parse('2020-01-01')).pluck(:body).map(&:length).sort

correspondence_counts.mean
# => 882
correspondence_counts.median
# => 603.0
correspondence_counts.percentile(95)
# => 2275.0
correspondence_counts.percentile(99)
# => 4935.0

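Looks like there has been a small reduction in the mean (894 to 882) and the 95th percentile (2414 to 2275), while the median and 99th percentile rose slightly, but it's all pretty similar.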
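To put rough numbers on the comparison, the sketch below computes the percentage change in each statistic between the two runs. The figures are copied from the output above; the names `stats_2018`, `stats_2019` and `percent_change` are only illustrative.

```ruby
# Figures copied from the two runs above (message body length in characters).
stats_2018 = { mean: 894, median: 581.0, p95: 2414.0, p99: 4857.0 }
stats_2019 = { mean: 882, median: 603.0, p95: 2275.0, p99: 4935.0 }

# Percentage change from the first value to the second, to one decimal place.
percent_change = lambda do |before, after|
  ((after - before) * 100.0 / before).round(1)
end

# Build a hash of statistic name => percentage change between the two years.
changes = stats_2018.to_h do |name, before|
  [name, percent_change.call(before, stats_2019[name])]
end

changes[:mean]   # => -1.3
changes[:median] # => 3.8
changes[:p95]    # => -5.8
changes[:p99]    # => 1.6
```

On these figures every change is under 6% in either direction.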