OpenDataServices / flatten-tool

Tools for generating CSV and other flat versions of the structured data
http://flatten-tool.readthedocs.io/en/latest/
MIT License

Conversions currently happen by loading all files into memory #36

Closed Bjwebb closed 8 years ago

Bjwebb commented 9 years ago

I decided to do this because it's easier, the data people normally have or want in spreadsheets fits into memory, and I didn't want to be guilty of premature optimisation. It might bite us down the line, though, if we get some particularly large files.
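
As a rough illustration of the trade-off described above (this is not flatten-tool's actual code), the sketch below contrasts reading a whole CSV into memory with streaming it row by row using Python's standard `csv` module. The function names `load_all_rows` and `iter_rows` and the sample data are hypothetical.

```python
import csv
import io


def load_all_rows(fileobj):
    """Read every row into a list: simple, but memory use grows with file size."""
    return list(csv.DictReader(fileobj))


def iter_rows(fileobj):
    """Yield one row at a time, so only the current row is held in memory."""
    for row in csv.DictReader(fileobj):
        yield row


# Hypothetical sample data, standing in for a real spreadsheet export.
sample = "ocid,title\n1,First grant\n2,Second grant\n"

all_rows = load_all_rows(io.StringIO(sample))
streamed_titles = [row["title"] for row in iter_rows(io.StringIO(sample))]
```

The in-memory version is easier to work with when later steps need to look back at earlier rows. The streaming version only helps if each row can be processed independently.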

practicalparticipation commented 9 years ago

To see how the script copes with large files, I've been testing the tool using the CSV file from http://lin-360giving.aptivate.org/ and the header row:

ocid,title,description,funder,fundingOrganization/id,fundingOrganization/name,recipient,recipientOrganization/id,recipientOrganization/name,currency,amountAppliedFor,amountAwarded,recipientOrganization/charityNumber,recipientOrganization/companyNumber,applicationDate/startDate,awardDate/startDate,plannedDates/startDate,plannedDates/endDate,beneficiary/country,beneficiary/location/lat,beneficiary/location/long,fromOpenCall,relatedActivity,lastModified,url,recipientOrganization/address

The script took around 25 minutes to run on a MacBook Air (2 GHz, 8 GB memory).

We might need to add a warning for users who select large files, telling them that processing could take a while.
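
One possible shape for such a warning, as a minimal sketch only: check the file size before conversion and warn above a threshold. The function name `warn_if_large` and the 10 MB threshold are assumptions for illustration; the thread does not specify either, and this is not part of flatten-tool.

```python
import os
import warnings

# Hypothetical threshold; the thread does not say what counts as "large".
LARGE_FILE_BYTES = 10 * 1024 * 1024


def warn_if_large(path, threshold=LARGE_FILE_BYTES):
    """Warn and return True when the file at `path` is larger than `threshold` bytes."""
    size = os.path.getsize(path)
    if size > threshold:
        warnings.warn(
            f"{path} is {size} bytes; conversion may take a while."
        )
        return True
    return False
```

A caller could run `warn_if_large(path)` just before starting the conversion, so the user sees the message before the long wait begins.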

Bjwebb commented 8 years ago

This issue has been superseded by https://github.com/OpenDataServices/cove/issues/418