IATI / ckanext-iati

CKAN extension for the IATI Registry
http://iatiregistry.org

Download Error · due to Content-length #457

Open siwhitehouse opened 3 weeks ago

siwhitehouse commented 3 weeks ago

As a publisher, I want to be informed via email when an activity file exceeds the maximum content-length so that I can resolve the problem and have my activities appear in IATI products.

Acceptance criteria: When the Registry identifies that an activity file is too big and displays an error message on the data set page in the form:

Download Error · (_date_)
Content-length (_figure_) exceeds maximum allowed value 60000000

then it should also send an email to the contact email address to inform them. This email should only be sent when an update takes an activity file from below the maximum file size to above it.

That email should be similar to:

Dear {user_name}

The latest update of the {dataset_name} for {organisation} exceeds the maximum allowed size for IATI Activity files. The maximum size is 60000000 whereas your file is {file_size}.

This means your latest activity file will not appear in IATI products, such as the IATI Datastore. It also shows an error on its IATI Registry webpage.

To resolve this you should split your IATI publication into multiple activity files.

Kind regards,
IATI Support

where {dataset_name}, {organisation}, {file_size} and {URL} are placeholder variables.
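As a rough illustration of how the placeholders might be filled in, here is a minimal Python sketch using `str.format`. The function name `render_size_warning` and the example values are hypothetical, and this is not how the extension currently builds its emails; it only shows the substitution step for the draft text above.

```python
# Draft email text from this issue, with the placeholders named as above.
EMAIL_TEMPLATE = (
    "Dear {user_name}\n\n"
    "The latest update of the {dataset_name} for {organisation} exceeds the "
    "maximum allowed size for IATI Activity files. The maximum size is "
    "60000000 whereas your file is {file_size}.\n\n"
    "This means your latest activity file will not appear in IATI products, "
    "such as the IATI Datastore. It also shows an error on its IATI Registry "
    "webpage.\n\n"
    "To resolve this you should split your IATI publication into multiple "
    "activity files.\n\n"
    "Kind regards,\n"
    "IATI Support\n"
)


def render_size_warning(user_name, dataset_name, organisation, file_size):
    """Return the email body with every placeholder substituted (hypothetical helper)."""
    return EMAIL_TEMPLATE.format(
        user_name=user_name,
        dataset_name=dataset_name,
        organisation=organisation,
        file_size=file_size,
    )


if __name__ == "__main__":
    # Example values are invented for illustration only.
    print(render_size_warning("Alex", "example-activities", "Example Org", "61234567"))
```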

We need the following to be in a position to raise a pull request, please:

  1. @cormachallinanderilinx to identify where the trigger for this needs to be
  2. @cormachallinanderilinx to replace the placeholder variables
  3. @Bjwebb to review the text of the email, with specific reference to the second paragraph
  4. I will check for guidance on splitting activity files and include a link or more information as relevant.

Also cc-ing @dan-odsc and @robredpath for info and comment

robredpath commented 3 weeks ago

Thanks for this @siwhitehouse .

We'll want to make sure that this email only gets sent once - we wouldn't want to email someone every day about this. Ideally, it would be a bit more nuanced than that, but we're looking at how we contact people more broadly within our work at ODS so I don't think it's worth building out any sort of complex system in the Registry.

The maximum size is 60000000

I think this should just say 60MB, and {file_size} should also be expressed in MB. A suitable algorithm might be to always round up to the next tenth of an MB, so that it's always a bigger number than the maximum size.
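A minimal sketch of that rounding, assuming "MB" means 1,000,000 bytes (so that the 60000000 limit reads as exactly 60MB). The function name `size_in_mb_rounded_up` is hypothetical. It uses integer ceiling division so that any file over the limit is reported as at least 60.1MB.

```python
BYTES_PER_MB = 1000000  # assumption: decimal megabytes, so 60000000 bytes == 60MB


def size_in_mb_rounded_up(size_bytes):
    """Format a byte count in MB, always rounding up to the next tenth of an MB."""
    bytes_per_tenth = BYTES_PER_MB // 10
    # Ceiling division on integers avoids floating-point rounding surprises.
    tenths = -(-size_bytes // bytes_per_tenth)
    return "{}.{}MB".format(tenths // 10, tenths % 10)


if __name__ == "__main__":
    for size in (60000000, 60000001, 61234567):
        print(size, "->", size_in_mb_rounded_up(size))
```

With this approach a file of 60000001 bytes is shown as 60.1MB rather than 60.0MB, which keeps the reported figure visibly above the stated maximum.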

To resolve this you should split your IATI publication into multiple activity files.

I think that "guidance on splitting activity files" is essential here - I had a conversation just yesterday about this and I don't think it's clear how this should be carried out. I may well have missed something, mind!

siwhitehouse commented 2 weeks ago

Thanks @robredpath

I've added some additional text to the specification:

This email should only be sent when an update takes an activity file from below the maximum file size to above it.

This means that if an organisation amends an activity file to below the maximum file size they should again be notified if it subsequently goes back over, but they shouldn't get regular emails once they have breached.
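To make the intended behaviour concrete, here is a small Python sketch that walks through a history of file sizes and reports which updates would trigger an email. The function name and the example sizes are hypothetical, and it assumes a file that is already over the limit on its first recorded update counts as a breach.

```python
MAX_CONTENT_LENGTH = 60000000  # the limit quoted in the Registry error message


def updates_that_notify(sizes):
    """Return the indexes of updates that move a file from within the limit to over it."""
    notify = []
    was_over = False  # assumption: before the first update the file is not over the limit
    for index, size in enumerate(sizes):
        is_over = size > MAX_CONTENT_LENGTH
        if is_over and not was_over:
            notify.append(index)
        was_over = is_over
    return notify


if __name__ == "__main__":
    history = [50000000, 61000000, 62000000, 55000000, 63000000]
    print(updates_that_notify(history))
```

In this example the publisher would be emailed on the second update (first breach) and again on the fifth (breach after dropping back under), but not on the third, where the file was already over the limit.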

@cormachallinanderilinx can you propose how this email is triggered based on this specification, please?

Searching the IATI Standard website led me to How To Create Your IATI Data Files, which only includes:

"Please ensure your IATI XML files are less than 40MB. Larger IATI activity files can be split down into multiple sub-files e.g. split by country, region or date, with each activity contained in only one file."

Which is not the specificity of advice I'd like us to be able to point to.

Bjwebb commented 2 weeks ago

The maximum size is 60000000

This is less than the 60MiB, or 62914560B listed in the draft IATI data policy ("60MB" is ambiguous and could be used to mean either).

But it's quite a bit more than the 40MB listed on https://iatistandard.org/en/guidance/publishing-data/creating-files/how-to-create-your-iati-data-files/
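For reference, the three figures can be compared directly. This is just arithmetic, assuming the "40MB" in the guidance means decimal megabytes:

```python
REGISTRY_LIMIT = 60000000          # value shown in the Registry error message
SIXTY_MIB = 60 * 1024 * 1024       # 60MiB, as listed in the draft data policy
FORTY_MB = 40 * 1000 * 1000        # assumption: the guidance means decimal MB

if __name__ == "__main__":
    print("60MiB in bytes:", SIXTY_MIB)
    print("60MiB minus Registry limit:", SIXTY_MIB - REGISTRY_LIMIT)
    print("Registry limit minus 40MB:", REGISTRY_LIMIT - FORTY_MB)
```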

Bjwebb commented 2 weeks ago

This means your latest activity file will not appear in IATI products, such as the IATI Datastore. It also shows an error on its IATI Registry webpage.

This is correct for the datastore, I'm not sure about D-Portal.

cormachallinanderilinx commented 2 weeks ago

@siwhitehouse should be able to tell from the package dict if there is already a content-length error. If there is already an error, we ignore it. If there isn't one, we send the email.
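A rough Python sketch of that check. The key `download_error_message` is a hypothetical name for wherever the previous error is stored in the package dict, and both function names are invented for illustration; the real field names in the extension would need to be substituted.

```python
# Fragment of the error text shown on the data set page.
ERROR_MARKER = "exceeds maximum allowed value"


def has_content_length_error(package_dict):
    """True if the stored error on the package is already a content-length error."""
    # "download_error_message" is a hypothetical key, not a confirmed field name.
    message = package_dict.get("download_error_message") or ""
    return message.startswith("Content-length") and ERROR_MARKER in message


def should_send_email(package_dict, new_error_message):
    """Send only when the new error is a size error and no size error was stored before."""
    message = new_error_message or ""
    new_is_size_error = message.startswith("Content-length") and ERROR_MARKER in message
    return new_is_size_error and not has_content_length_error(package_dict)


if __name__ == "__main__":
    error = "Content-length 61234567 exceeds maximum allowed value 60000000"
    print(should_send_email({}, error))                                  # no previous error
    print(should_send_email({"download_error_message": error}, error))  # already flagged
```

The check has to run before the package dict is updated with the new error, otherwise the previous state is lost and the email would never be sent.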