codeforIATI / iati-data-bugtracker

🐛 A public log of issues with IATI data and metadata
https://bugtracker.codeforiati.org

USAID activity has 131,072 transactions #4

Closed: andylolz closed this 4 years ago

andylolz commented 4 years ago

Originally reported here: https://github.com/devinit/D-Portal/issues/533

Dataset usaid-multiple-3 includes an activity with (at time of writing) 131,072 transactions (IATI identifier US-GOV-1-720000000000.0). This activity is causing memory problems for a number of systems, including d-portal, IATI Canary, and possibly others.

It would be good to get the views of others on the solution here. Is it up to the publisher to resolve this, or is this valid data that systems should be able to cope with? Tagging @markbrough @davidmegginson @notshi @xriss.

notshi commented 4 years ago

A hard limit would be nice: this particular activity contains 80 MB worth of transactions. I suppose it all boils down to how a publisher structures their data for others to use.

In this case, we resolved it by increasing the memory available to Node during import. It is now 4 GB (previously 1 GB), but that doesn't mean it can handle an activity four times the size.

We are unsure how large an activity the current stack size can accommodate; we'll have to wait and see if it falls over again.
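For systems that want to cope with activities this large without simply raising memory limits, one option is to stream the XML rather than load it whole. The following is a minimal Python sketch, not how d-portal or IATI Canary actually import data. It assumes the standard IATI layout (`iati-activity` elements containing an `iati-identifier` and `transaction` children, with no XML namespace); the function name `count_transactions` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def count_transactions(source):
    """Yield (iati_identifier, transaction_count) for each activity.

    `source` is a filename or a binary file-like object. The file is read
    incrementally, and each element's contents are discarded once handled.
    """
    identifier, count = None, 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "transaction":
            count += 1
            # Drop the transaction's children and text; only an empty
            # shell stays attached to the parent activity.
            elem.clear()
        elif elem.tag == "iati-identifier":
            identifier = (elem.text or "").strip()
        elif elem.tag == "iati-activity":
            yield identifier, count
            identifier, count = None, 0
            # Release everything that belonged to the finished activity.
            elem.clear()
```

Calling `list(count_transactions("usaid-multiple-3.xml"))` on a downloaded copy of the dataset (the local filename is an assumption) would give one pair per activity. Memory use still grows slowly within a single huge activity, because the cleared elements remain as empty shells until their parent is cleared, but it is far below the cost of holding every transaction in full.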

andylolz commented 4 years ago

Looks like this activity is now gone? I don’t see the iati-identifier anymore. So I guess this is resolved (in some sense).

notshi commented 4 years ago

I suppose this could be considered resolved, but we still think it's a matter of 'best practice' and of considering limits:

Could the Registry / Validator flag large uploads? Could the Standard recommend better publishing etiquette? etc.
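To make the flagging idea a little more concrete, here is a rough Python sketch of the kind of check a Registry or Validator could run. It is only an illustration: both thresholds are hypothetical numbers, not anything the IATI Standard specifies, and the function name `upload_warnings` is invented. It assumes the caller already knows the dataset size in bytes and has a transaction count per activity, however those were obtained.

```python
# Hypothetical thresholds for illustration; neither is an IATI rule.
MAX_TRANSACTIONS_PER_ACTIVITY = 10_000
MAX_DATASET_BYTES = 40 * 1024 * 1024  # 40 MiB


def upload_warnings(dataset_bytes, activity_counts,
                    max_transactions=MAX_TRANSACTIONS_PER_ACTIVITY,
                    max_bytes=MAX_DATASET_BYTES):
    """Return human-readable warnings for an unusually large upload.

    `activity_counts` is an iterable of (iati_identifier, transaction_count)
    pairs. An empty list means nothing was flagged.
    """
    warnings = []
    if dataset_bytes > max_bytes:
        warnings.append(
            f"dataset is {dataset_bytes / 1024**2:.1f} MiB "
            f"(limit {max_bytes / 1024**2:.1f} MiB)"
        )
    for identifier, count in activity_counts:
        if count > max_transactions:
            warnings.append(
                f"activity {identifier} has {count:,} transactions "
                f"(limit {max_transactions:,})"
            )
    return warnings
```

A check like this only warns; whether an oversized activity should be rejected, or merely flagged to the publisher, is the open question in this thread.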

The dataset (https://iatiregistry.org/dataset/usaid-multiple-3) is still there and, although it is now a slimmed-down 37.4 MB, the IATI Previewer still falls over when viewing it.