Voronchuk / elixir_google_spreadsheets

Elixir library to read and write data of Google Spreadsheets.
MIT License
79 stars 37 forks source link

Timeout #14

Closed vpfaulkner closed 6 years ago

vpfaulkner commented 6 years ago

Hello,

I'm getting the following timeout error:

erlang error: {:timeout, {GenServer, :call, [#PID<0.12345.10>, {:read_rows, 2, 250, [column_to: 26]}, 5000]}}

The read rows call I'm using is:

GSS.Spreadsheet.read_rows(pid, 2, 250 column_to: 26)

Looking in the logs, the request looks like:

https://sheets.googleapis.com/v4/spreadsheets/REDACTED/values:batchGet?majorDimension=ROWS&valueRenderOption=FORMATTED_VALUE&dateTimeRenderOption=FORMATTED_STRING&ranges=A2:Z2&ranges=A3:Z3&ranges=A4:Z4&ranges=A5:Z5&ranges=A6:Z6&ranges=A7:Z7&ranges=A8:Z8&ranges=A9:Z9&ranges=A10:Z10&ranges=A11:Z11&ranges=A12:Z12&ranges=A13:Z13&ranges=A14:Z14&ranges=A15:Z15&ranges=A16:Z16&ranges=A17:Z17&ranges=A18:Z18&ranges=A19:Z19&ranges=A20:Z20&ranges=A21:Z21&ranges=A22:Z22&ranges=A23:Z23&ranges=A24:Z24&ranges=A25:Z25&ranges=A26:Z26&ranges=A27:Z27&ranges=A28:Z28&ranges=A29:Z29&ranges=A30:Z30&ranges=A31:Z31&ranges=A32:Z32&ranges=A33:Z33&ranges=A34:Z34&ranges=A35:Z35&ranges=A36:Z36&ranges=A37:Z37&ranges=A38:Z38&ranges=A39:Z39&ranges=A40:Z40&ranges=A41:Z41&ranges=A42:Z42&ranges=A43:Z43&ranges=A44:Z44&ranges=A45:Z45&ranges=A46:Z46&ranges=A47:Z47&ranges=A48:Z48&ranges=A49:Z49&ranges=A50:Z50&ranges=A51:Z51&ranges=A52:Z52&ranges=A53:Z53&ranges=A54:Z54&ranges=A55:Z55&ranges=A56:Z56&ranges=A57:Z57&ranges=A58:Z58&ranges=A59:Z59&ranges=A60:Z60&ranges=A61:Z61&ranges=A62:Z62&ranges=A63:Z63&ranges=A64:Z64&ranges=A65:Z65&ranges=A66:Z66&ranges=A67:Z67&ranges=A68:Z68&ranges=A69:Z69&ranges=A70:Z70&ranges=A71:Z71&ranges=A72:Z72&ranges=A73:Z73&ranges=A74:Z74&ranges=A75:Z75&ranges=A76:Z76&ranges=A77:Z77&ranges=A78:Z78&ranges=A79:Z79&ranges=A80:Z80&ranges=A81:Z81&ranges=A82:Z82&ranges=A83:Z83&ranges=A84:Z84&ranges=A85:Z85&ranges=A86:Z86&ranges=A87:Z87&ranges=A88:Z88&ranges=A89:Z89&ranges=A90:Z90&ranges=A91:Z91&ranges=A92:Z92&ranges=A93:Z93&ranges=A94:Z94&ranges=A95:Z95&ranges=A96:Z96&ranges=A97:Z97&ranges=A98:Z98&ranges=A99:Z99&ranges=A100:Z100&ranges=A101:Z101&ranges=A102:Z102&ranges=A103:Z103&ranges=A104:Z104&ranges=A105:Z105&ranges=A106:Z106&ranges=A107:Z107&ranges=A108:Z108&ranges=A109:Z109&ranges=A110:Z110&ranges=A111:Z111&ranges=A112:Z112&ranges=A113:Z113&ranges=A114:Z114&ranges=A115:Z115&ranges=A116:Z116&ranges=A117:Z117&ranges=A118:Z118&ranges=A119:Z119&ranges=A120:Z120&ranges=A121:Z121&ranges=A122:Z122&ranges=A123:Z123&ranges=A124:Z124&ranges=A125:Z125&ranges=A126:Z126&ranges=A127:Z127&ranges=A128:Z128&ranges=A129:Z129&ranges=A130:Z130&ranges=A131:Z131&ranges=A132:Z132&ranges=A133:Z133&ranges=A134:Z134&ranges=A135:Z135&ranges=A136:Z136&ranges=A137:Z137&ranges=A138:Z138&ranges=A139:Z139&ranges=A140:Z140&ranges=A141:Z141&ranges=A142:Z142&ranges=A143:Z143&ranges=A144:Z144&ranges=A145:Z145&ranges=A146:Z146&ranges=A147:Z147&ranges=A148:Z148&ranges=A149:Z149&ranges=A150:Z150&ranges=A151:Z151&ranges=A152:Z152&ranges=A153:Z153&ranges=A154:Z154&ranges=A155:Z155&ranges=A156:Z156&ranges=A157:Z157&ranges=A158:Z158&ranges=A159:Z159&ranges=A160:Z160&ranges=A161:Z161&ranges=A162:Z162&ranges=A163:Z163&ranges=A164:Z164&ranges=A165:Z165&ranges=A166:Z166&ranges=A167:Z167&ranges=A168:Z168&ranges=A169:Z169&ranges=A170:Z170&ranges=A171:Z171&ranges=A172:Z172&ranges=A173:Z173&ranges=A174:Z174&ranges=A175:Z175&ranges=A176:Z176&ranges=A177:Z177&ranges=A178:Z178&ranges=A179:Z179&ranges=A180:Z180&ranges=A181:Z181&ranges=A182:Z182&ranges=A183:Z183&ranges=A184:Z184&ranges=A185:Z185&ranges=A186:Z186&ranges=A187:Z187&ranges=A188:Z188&ranges=A189:Z189&ranges=A190:Z190&ranges=A191:Z191&ranges=A192:Z192&ranges=A193:Z193&ranges=A194:Z194&ranges=A195:Z195&ranges=A196:Z196&ranges=A197:Z197&ranges=A198:Z198&ranges=A199:Z199&ranges=A200:Z200&ranges=A201:Z201&ranges=A202:Z202&ranges=A203:Z203&ranges=A204:Z204&ranges=A205:Z205&ranges=A206:Z206&ranges=A207:Z207&ranges=A208:Z208&ranges=A209:Z209&ranges=A210:Z210&ranges=A211:Z211&ranges=A212:Z212&ranges=A213:Z213&ranges=A214:Z214&ranges=A215:Z215&ranges=A216:Z216&ranges=A217:Z217&ranges=A218:Z218&ranges=A219:Z219&ranges=A220:Z220&ranges=A221:Z221&ranges=A222:Z222&ranges=A223:Z223&ranges=A224:Z224&ranges=A225:Z225&ranges=A226:Z226&ranges=A227:Z227&ranges=A228:Z228&ranges=A229:Z229&ranges=A230:Z230&ranges=A231:Z231&ranges=A232:Z232&ranges=A233:Z233&ranges=A234:Z234&ranges=A235:Z235&ranges=A236:Z236&ranges=A237:Z237&ranges=A238:Z238&ranges=A239:Z239&ranges=A240:Z240&ranges=A241:Z241&ranges=A242:Z242&ranges=A243:Z243&ranges=A244:Z244&ranges=A245:Z245&ranges=A246:Z246&ranges=A247:Z247&ranges=A248:Z248&ranges=A249:Z249&ranges=A250:Z250"

Is there anyway to either 1) up the timeout on the GenServer call (defaults to 5 seconds) or 2) perform this query more efficiently so as to not hit the timeout?

Thanks!

Voronchuk commented 6 years ago

I guess both can be useful, I'll see what can be done about it.

vpfaulkner commented 6 years ago

@Voronchuk any ideas / updates on this? As a fallback I might just catch / retry but certainly not ideal solution.

nekifirus commented 6 years ago

Hello! Any ideas?

Voronchuk commented 6 years ago

@vpfaulkner @nekifirus try to install the master version {:elixir_google_spreadsheets, github: "Voronchuk/elixir_google_spreadsheets"} make read_rows with batch_range: true option (it's disabled by default).

Also I added support for request config options, defaults are:

request_opts: [
        ssl: [{:versions, [:'tlsv1.2']}],
        timeout: :timer.seconds(8),
        recv_timeout: :timer.seconds(5)
    ]
nekifirus commented 6 years ago

Yeah! In our case

-  max_interval: :timer.minutes(100),
-  interval: :timer.minutes(100)
+  max_interval: :timer.seconds(5),
+  interval: 100
Voronchuk commented 6 years ago

@vpfaulkner have you tried batch_range: true?

vpfaulkner commented 6 years ago

@Voronchuk let me try that...

seddy commented 6 years ago

@vpfaulkner - did batch_range: true work for you?

I'm wondering because we're hitting a similar issue on write_rows, even though we've tried upping the recv_timeout.

I'm not sure I understand how having a longer timeout on the http requests is going to help if the GenServer.call still times out at 5s still @Voronchuk; won't we always hit that 5s timeout regardless of http timeout? I may well be misunderstanding how GenServer behaves though, I'm assuming it would timeout if the http takes longer than its timeout.

vpfaulkner commented 6 years ago

@seddy and @Voronchuk we are still running into the same issue even after using batch_range: true.

For full context here is our call:

   GSS.Spreadsheet.read_rows(pid, @legend_row, @ending_row, column_to: @ending_column, batch_range: true)

Once more, after we made an update to the config we started to get the following error in staging:

no function clause matches

gen.erl:150 :in `call`
lib/gen_server.ex:734 :in `call`
lib/elixir_google_spreadsheets/spreadsheet.ex:450 :in `spreadsheet_query`
lib/elixir_google_spreadsheets/spreadsheet.ex:353 :in `handle_call`
gen_server.erl:615 :in `try_handle_call`
gen_server.erl:647 :in `handle_msg`
proc_lib.erl:247 :in `init_p_do_apply`

Here is the dependency:

      {:elixir_google_spreadsheets, github: "Voronchuk/elixir_google_spreadsheets"},

Here is the config:

config :elixir_google_spreadsheets, :client,
  request_workers: 50,
  max_demand: 100,
  max_interval: :timer.minutes(1),
  interval: 100,
  request_opts: [
    timeout: :timer.seconds(60),
    recv_timeout: :timer.seconds(30)
  ]

Concerning your comment @seddy, I naively assumed that this changed the GenServer timeout if recv_timeout was set: https://hexdocs.pm/elixir/GenServer.html#call/3. If that time out is not being changed, then it would seem that having a higher http timeout would not matter.

Voronchuk commented 6 years ago

I'll take a look on weekend, thanks for an update.

vpfaulkner commented 6 years ago

@Voronchuk any update?

Voronchuk commented 6 years ago

sorry, I'm currently busy with other stuff, will try to look on Friday / weekend.

seddy commented 6 years ago

Sorry to pester @Voronchuk - any update? This is causing a reasonable number of errors for us so I'm happy to do a PR which will allow us to set the Genserver timeout appropriately if that helps?

vpfaulkner commented 6 years ago

@seddy same here. If you already have an idea for what the fix might look like, would you mind opening that PR? We might need to use a forked version that has a fix in it if this doesn't get resolved soon.

Voronchuk commented 6 years ago

batch_range: true is an experimental feature which works only for read_rows, it's not related to writes in any way.

@vpfaulkner can you show Google API url which was generated for your request? In my env batch_range: true works for read_rows.

Voronchuk commented 6 years ago

I @vpfaulkner @seddy have added test for this case and got the following query: 14:11:44.712 [debug] send_request https://sheets.googleapis.com/v4/spreadsheets/XXXX/values:batchGet?majorDimension=ROWS&valueRenderOption=FORMATTED_VALUE&dateTimeRenderOption=FORMATTED_STRING&ranges=A2%3AZ250

It works as intended, request looks like GSS.Spreadsheet.read_rows(pid, 2, 250, column_to: 26, batch_range: true)

Voronchuk commented 6 years ago

As for the custom timeout, they are already supported by :result_timeout setting.

case options[:result_timeout] || config(:result_timeout) do
          nil ->
            GenStage.call(__MODULE__, {:request, request})
          result_timeout ->
            GenStage.call(__MODULE__, {:request, request}, result_timeout)
        end
Voronchuk commented 6 years ago

Also, keep in mind the new default config setting max_rows_per_request:

config :elixir_google_spreadsheets,
    max_rows_per_request: 301
vpfaulkner commented 6 years ago

@Voronchuk sorry for the late reply. I think the batch_range: true is working as expected from what I can tell in the URL. I also have result_timeout set: result_timeout: :timer.minutes(10). However, the error I keep seeing (see below) continues to occur 5 seconds after the job begins and shows the GenServer default timeout of 5 seconds. I'm confused how result_timeout could increase the GenServer timeout and that appears to be the timeout causing this issue. It appears that the GenServer call is here: https://github.com/Voronchuk/elixir_google_spreadsheets/blob/master/lib/elixir_google_spreadsheets/spreadsheet.ex#L123 and no third argument is passed for the timeout.

Apr 16 11:00:00 ingredient-planner-production web-10.53.44.9: 11:00:00.348 [info] Importing protein assignments... 
Apr 16 11:00:05 ingredient-planner-production web-10.53.44.9: 11:00:05.351 [error] Task #PID<0.31038.5> started from Quantum terminating 
Apr 16 11:00:05 ingredient-planner-production web-10.53.44.9: ** (EXIT) time out 
Apr 16 11:00:05 ingredient-planner-production web-10.53.44.9: ** (stop) exited in: GenServer.call(#PID<0.31040.5>, {:read_rows, 2, 250, [column_to: 26, batch_range: true]}, 5000) 
Apr 16 11:00:05 ingredient-planner-production web-10.53.44.9: (elixir) lib/gen_server.ex:737: GenServer.call/3 
Apr 16 11:00:05 ingredient-planner-production web-10.53.44.9: (ingredient_planner) lib/ingredient_planner/import_protein_assignments.ex:21: IngredientPlanner.ImportProteinAssignments.run/0 
Apr 16 11:00:05 ingredient-planner-production web-10.53.44.9: (elixir) lib/task/supervised.ex:85: Task.Supervised.do_apply/2 
Apr 16 11:00:05 ingredient-planner-production web-10.53.44.9: (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3 
Apr 16 11:00:05 ingredient-planner-production web-10.53.44.9: (elixir) lib/task/supervised.ex:36: Task.Supervised.reply/5 
Apr 16 11:00:05 ingredient-planner-production web-10.53.44.9: Function: &Quantum.Executor.execute/2 
Apr 16 11:00:05 ingredient-planner-production web-10.53.44.9: Args: [{"0 * * * *", {IngredientPlanner.ImportProteinAssignments, :run}, [], :utc}, %{d: {2018, 4, 16}, h: 11, jobs: [import_proteins: %Quantum.Job{args: [], name: :import_proteins, nodes: [:nonode@nohost], overlap: true, pid: #PID<0.31027.5>, schedule: "0 * * * *", state: :active, task: {IngredientPlanner.ImportProteinAssignments, :run}, timezone: :utc}, send_daily_digest: %Quantum.Job{args: [], name: :send_daily_digest, nodes: [:nonode@nohost], overlap: true, pid: #PID<0.31028.5>, schedule: "0 9 * * *", state: :active, task: {IngredientPlanner.SendDailyDigestJob, :send_daily_digest}, timezone: :utc}, update_recipe_forecasts: %Quantum.Job{args: [], name: :update_recipe_forecasts, nodes: [:nonode@nohost], overlap: true, pid: #PID<0.31029.5>, schedule: "0 7 * * *", state: :active, task: {IngredientPlanner.UpdateRecipeForecasts, :update}, timezone: :utc}], m: 0, r: 0, w: 1}] 
Voronchuk commented 6 years ago

Finally I got your issue @vpfaulkner I think it should be solved not with GenServer timeout increase but with async Task response, some work for this was done in scope of request_async but was never finished, I'll think how to implement this in a better way.

Voronchuk commented 6 years ago

As ad-hoc solution, you can try timeout option in scope of #24

vpfaulkner commented 6 years ago

@Voronchuk testing the timeout option in staging...

vpfaulkner commented 6 years ago

@Voronchuk sorry, forgot to reply, but it works. Thanks!