Open TimG1964 opened 1 week ago
@quinnj What's interesting is that the error doesn't happen when passing `ntasks=1` to `CSV.read`.
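For anyone wanting to try the same thing, here is a minimal sketch of the single-task call. The helper name `read_single_task` is only illustrative and not part of CSV.jl; `ntasks` is the CSV.jl keyword that controls how many tasks the file is split across.

```julia
using CSV, DataFrames

# Hypothetical helper: read a CSV file without splitting it into chunks.
# With ntasks=1 the whole file is parsed by a single task, so no chunk
# boundary can fall inside a quoted, multi-line field.
function read_single_task(path::AbstractString)
    return CSV.read(path, DataFrame; ntasks=1)
end
```

This trades parsing speed for a simpler code path, so it is a workaround rather than a fix.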
Is this the same as #1139?
Possibly, but hard to tell without having seen the files and/or identified the root cause.
The files are public, from the UK Department for Culture, Media and Sport, here, or via an `HTTP.get` call to https://nationallottery.dcms.gov.uk/api/v1/grants/csv-export/. The export is typically just over 300MB, but growing. Updates are relatively frequent as new grant records are added.
At least one field, `Description`, is a quoted text field that sometimes contains newlines and can be quite lengthy. Unlike the file in #1139, though, only a small proportion of the 700,000 records contain newlines. This may be the reason the problem is intermittent and depends on sort order.
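To illustrate the shape of the data, here is a rough sketch that writes a small synthetic file in which only a fraction of the quoted `Description` values contain a newline. The function `write_sample`, the column names other than `Description`, and the row counts are all made up for illustration. A file this small may not be split across tasks at all, so this shows the data layout rather than reproducing the error.

```julia
using CSV, DataFrames

# Hypothetical generator: every `every`-th record gets a quoted Description
# that spans two physical lines; all other records are single-line.
function write_sample(path::AbstractString; nrows::Int=1_000, every::Int=50)
    open(path, "w") do io
        println(io, "Id,Description")
        for i in 1:nrows
            desc = i % every == 0 ? "line one\nline two of record $i" : "record $i"
            println(io, i, ",\"", desc, "\"")
        end
    end
    return path
end

path = write_sample(tempname())

# Single-task read as a baseline; the row count should match the number of
# records written, not the number of physical lines in the file.
df = CSV.read(path, DataFrame; ntasks=1)
println(nrow(df))
```

Comparing the row count from a default read of the real export against this kind of single-task baseline is one way to check whether records are being split or merged.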
Ah, sorry, I hadn't noticed that #1139 includes code to generate the file.
Refer to this discussion on the Julialang Discourse:
The error described there is