runt18 / google-bigquery

Automatically exported from code.google.com/p/google-bigquery

Performance degradation x15 since Friday 26th Feb #453

Closed. GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
Hi, we're facing a severe performance degradation on all our projects that 
started on 26 Feb 2016.

Before Friday, almost all of our queries completed in under 5 seconds. Since 
Friday, the same queries on the same datasets have been running for 300+ 
seconds, which is unacceptable for our service.

We made NO changes to our data or to any project settings, so we suspect the 
problem is on the BQ side.

Details: 
- Dataset 1, Query 1, 1.6 GB processed: before Friday, 5-10 seconds; after 
Friday, 120 seconds
- Dataset 2, Query 1, 4.3 GB processed: before Friday, <20 seconds; after 
Friday, 300 seconds

According to the query "explanation", all of the time is spent in the execution 
phase.

Could you please let us know what happened to BQ performance and when it will 
recover?

Original issue reported on code.google.com by drazumov...@gmail.com on 29 Feb 2016 at 3:16

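(Editor's note: the "execution phase" observation above comes from the query plan that BigQuery attaches to each job. A minimal sketch of how to spot the dominant phase per stage is below; the stage data is made up for illustration, but the ratio field names match `statistics.query.queryPlan` in the BigQuery `jobs.get` API, as returned by e.g. `bq show --format=prettyjson -j <job_id>`.)

```python
# Sketch: find the dominant phase of each query-plan stage.
# The stages below are invented sample data; real plans come from
# statistics.query.queryPlan in the jobs.get response.

plan = [
    {"name": "Stage 1", "waitRatioAvg": 0.02, "readRatioAvg": 0.10,
     "computeRatioAvg": 0.95, "writeRatioAvg": 0.05},
    {"name": "Stage 2", "waitRatioAvg": 0.01, "readRatioAvg": 0.30,
     "computeRatioAvg": 1.00, "writeRatioAvg": 0.02},
]

def dominant_phase(stage):
    """Return the phase key with the largest average ratio for a stage."""
    keys = ("waitRatioAvg", "readRatioAvg", "computeRatioAvg", "writeRatioAvg")
    return max(keys, key=lambda k: stage[k])

for stage in plan:
    print(stage["name"], "->", dominant_phase(stage))
    # prints "Stage 1 -> computeRatioAvg" and "Stage 2 -> computeRatioAvg"
```

A plan where every stage is dominated by `computeRatioAvg` matches the reporter's description of time being spent in execution rather than in reads or waits.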

GoogleCodeExporter commented 8 years ago
Issue 452 has been merged into this issue.

Original comment by thomasp...@google.com on 29 Feb 2016 at 5:59

GoogleCodeExporter commented 8 years ago
Hi, what are the affected project IDs? I'll take a look.

If you happen to have some "matched pair" job_ids that are querying the same 
input but run slow/fast, that would speed up investigation.

Original comment by e...@google.com on 29 Feb 2016 at 6:01
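(Editor's note: a "matched pair" like the one requested above can be compared directly from job statistics. A minimal sketch follows; the timestamps are invented to mirror the ~15x slowdown reported, but the real values would come from `statistics.startTime` and `statistics.endTime` in the `jobs.get` response, which are epoch milliseconds.)

```python
# Sketch: compute the slowdown between a fast and a slow run of the
# same query. Timestamps here are made up; real ones come from
# statistics.startTime / statistics.endTime (epoch milliseconds).

fast_job = {"startTime": 1456500000000, "endTime": 1456500020000}  # ~20 s
slow_job = {"startTime": 1456700000000, "endTime": 1456700300000}  # ~300 s

def runtime_seconds(stats):
    """Job runtime in seconds from millisecond timestamps."""
    return (stats["endTime"] - stats["startTime"]) / 1000.0

slowdown = runtime_seconds(slow_job) / runtime_seconds(fast_job)
print(f"slowdown: {slowdown:.1f}x")  # prints "slowdown: 15.0x"
```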

GoogleCodeExporter commented 8 years ago
Hi, the project ID is teva-pk001 (793717300423).

I cannot provide matching pairs, as the most recent query I can see in Query 
History is from today, while the issue has persisted since last Friday.

Thank you!

Original comment by drazumov...@gmail.com on 29 Feb 2016 at 6:36

GoogleCodeExporter commented 8 years ago
Any feedback on that? Our work is almost totally paralyzed...

Original comment by drazumov...@gmail.com on 1 Mar 2016 at 11:27

GoogleCodeExporter commented 8 years ago
Could you please provide a specific job ID for a query that you expected to be 
much faster? Even without a "matched pair", a job ID for a slow query will help 
debug.

Original comment by jcon...@google.com on 1 Mar 2016 at 5:57

GoogleCodeExporter commented 8 years ago
JobID: teva-pk001:job_XD9L9fFN1jAGZTa_32vX5ZzkB9A

I suspect the problem is with the JOINs, but before the 26th those joins were 
running in seconds.

Original comment by drazumov...@gmail.com on 1 Mar 2016 at 6:24

GoogleCodeExporter commented 8 years ago
Could you reimport the data for the table PharmacyChains.Turnover and try again?

Original comment by nitsh...@google.com on 1 Mar 2016 at 11:17

GoogleCodeExporter commented 8 years ago
Re-import is done every morning by a batch job.

Original comment by drazumov...@gmail.com on 1 Mar 2016 at 11:43

GoogleCodeExporter commented 8 years ago
Gotcha. We believe we've identified the issue, and we've made a configuration 
change that will likely improve performance once the data is reloaded.

Original comment by jcon...@google.com on 1 Mar 2016 at 11:54

GoogleCodeExporter commented 8 years ago
Magic! )
Do you think this issue could come back in the future? Should we stop using 
JOINs entirely?

Original comment by drazumov...@gmail.com on 2 Mar 2016 at 11:10

GoogleCodeExporter commented 8 years ago
Actually, the joins were not the issue at all; the joins in your query can be 
computed quite efficiently.

The problem was a change to our load jobs that caused your data to be 
partitioned in a suboptimal manner. We've reverted the change for the time 
being while we address the problem.

Please let us know if you notice other problems in the future!

Original comment by jcon...@google.com on 2 Mar 2016 at 10:34

GoogleCodeExporter commented 8 years ago
Got it! 
Thank you for the prompt resolution of the issue!

Original comment by drazumov...@gmail.com on 3 Mar 2016 at 12:47