TL;DR: Yes, imports could be faster; however, stopping at the first bad record
wouldn't make that happen.
So there are two issues here:
* The actual time spent processing records is a tiny fraction of the total job time you see; much more is spent copying data and letting a job progress through the various parts of our system. Stopping at the first bad record wouldn't change the total job time in any appreciable way.
* We purposely go past max_bad_records before returning the error_stream to you, up to some bound (which I believe is 100 when max_bad_records is 0). This is by design -- if you have a file with, say, 3 bad records, it's annoying to have to go through a correct/upload/resubmit/wait cycle just to surface each one, when we could have given you all of that information on the first pass. As mentioned, this doesn't noticeably change the runtime, so it seems like a win. (See the sketch below for the setting in question.)
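For illustration, here is a minimal sketch of where max_bad_records sits in a load job, using the modern google-cloud-bigquery Python client (which postdates this thread); the project, bucket, dataset, and table names are hypothetical placeholders. The point of the bounded error batch is that a single failed run surfaces many fixable errors at once.

```python
# Minimal sketch, assuming the google-cloud-bigquery Python client.
# All names (project, bucket, dataset, table) are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    max_bad_records=0,  # fail the load if any record can't be parsed
)

load_job = client.load_table_from_uri(
    "gs://my-bucket/data.csv",          # hypothetical source file
    "my-project.my_dataset.my_table",   # hypothetical destination
    job_config=job_config,
)

try:
    load_job.result()  # block until the job completes
except Exception:
    # Even with max_bad_records=0, the failed job carries a batch of
    # parse errors (bounded, per the comment above), not just the first.
    for err in load_job.errors or []:
        print(err)
```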
I'm going to close this as "working as intended" -- feel free to reopen or
transfer to bigquery-discuss if you want to know more.
Original comment by craigcitro@google.com on 6 Apr 2012 at 6:03
Original issue reported on code.google.com by net.equ...@gmail.com on 6 Apr 2012 at 5:13