zendesk / zendesk_api_client_rb

Official Ruby Zendesk API Client
http://developer.zendesk.com/
Apache License 2.0
388 stars 183 forks source link

Exceptions due to unexpected Faraday::Multipart::CompositeReadIO instances #513

Closed jeremyjaybaker closed 1 year ago

jeremyjaybaker commented 1 year ago

Hello, we just recently updated from 1.26 to 1.38.0.rc1 due to our other dependencies having a hard requirement of Faraday >=2.0. This seemed to go fine for the most part, but we're now getting the following exception for around <10% of our tickets and has not been reproducible outside of production.

NoMethodError: undefined method `key?' for #<Faraday::Multipart::CompositeReadIO:0x000055c1a8da7248>
  from zendesk_api/middleware/request/upload.rb:25:in `set_file'
  from zendesk_api/middleware/request/upload.rb:12:in `call'
  from zendesk_api/middleware/request/etag_cache.rb:22:in `call'
  from faraday/middleware.rb:17:in `call'
  from faraday/middleware.rb:17:in `call'
  from faraday/middleware.rb:17:in `call'
  from faraday/middleware.rb:17:in `call'
  from faraday/middleware.rb:17:in `call'
  from zendesk_api/middleware/response/parse_iso_dates.rb:11:in `call'
  from zendesk_api/middleware/response/logger.rb:22:in `call'
  from zendesk_api/middleware/response/callback.rb:14:in `call'
  from faraday/middleware.rb:17:in `call'
  from zendesk_api/middleware/response/raise_error.rb:8:in `call'
  from faraday/multipart/middleware.rb:28:in `call'
  from faraday/rack_builder.rb:153:in `build_response'
  from faraday/connection.rb:445:in `run_request'
  from faraday/connection.rb:281:in `post'
  from zendesk_api/actions.rb:32:in `save!'
  from zendesk_api/actions.rb:153:in `block in create!'
  from zendesk_api/actions.rb:152:in `tap'
  from zendesk_api/actions.rb:152:in `create!'
  from zendesk_api/collection.rb:64:in `block (2 levels) in <class:Collection>'
  from app/services/integrations/zendesk/zendesk_service.rb:74:in `create_ticket'

Our ZendeskService#create_ticket mentioned in the call stack calls the Zendesk API client directly like so:

client.tickets.create!(
  external_id: "12345foo",
  requester: #<ZendeskAPI::User>,
  organization: #<ZendeskAPI::Organization>,
  subject: "ACTION REQUIRED: Unable to process payment for Order #XXXXXX",
  description: "Hi Jane Doe,\r\n\r\nThank you for your purchase! Our system was unable to process payment for Order #XXXXXX, as your Bank ending in XXXX has been removed. \r\n\r\nAt your earliest convenience, kindly update your credit card on file via your Dashboard at Settings &gt; Payment &gt; Add a credit card. From there, we will be able to proceed with your order as quickly as possible.\r\n\r\nPlease respond to this email once this has been completed or if you have any questions!\r\n\r\nSincerely,\r\nTeam Order\r\n",
  custom_fields: [
    {:id=>"1234", :value=>"14"}, 
    {:id=>"1234", :value=>""}, 
    {:id=>"1234", :value=>"order_edit__minimum_not_met"}, 
    {:id=>nil, :value=>"ACTION REQUIRED: Order #XXXXXX substitutions"}, 
    {:id=>"1234", :value=>"Music & Books (MO)"}, 
    {:id=>"1234", :value=>"Demo Org"}
  ],
  tags: ["ops_email"],
  email_ccs: []
) 

As you can see, we're not explicitly using multipart or even referencing files/uploads so we're not sure why a percentage of our tickets would fail like this. Any idea why we're seeing this exception?

ecoologic commented 1 year ago

Thank you @jeremyjaybaker for using the rc version and contributing with your findings.

I suspect this is not related to the size of the request, but rather the size of the response. That set_file you see in the stacktrace checks if there is any file in the response, it runs either ways, even for small payloads.

I will raise this with the team and try to get it addressed promptly.

  1. Could you please clarify which exact version of Faraday you are using, from your Gemfile.lock?
  2. Do you have any reason to think a large payload would be returned by your ticket creation? Can you exclude that categorically?
jeremyjaybaker commented 1 year ago

Thanks for taking a look at this so quickly. I've gathered some further exception data over the weekend that I'll be using to try to reproduce the issue today. I'll send you the ::create! parameters if I can make this break consistently.

jeremyjaybaker commented 1 year ago
  1. Faraday 2.7.2
  2. I believe yes, we can exclude that. I don't see any large payloads/files being either sent or received. If this is actually what's happening, it's under the hood and not intentional on our part.

So we logged the captured ::create! parameters that led us to the exception, but when we plug them back into our code in a local environment, the exception doesn't trigger. We do however know that there's some degree of repeatability here since our ticket creation runs on delayed job. These jobs get retried and can fail up to 5 times before retries stop and we're seeing 5 failed attempts on these tickets, all with the error I initially posted.

jeremyjaybaker commented 1 year ago

Actually we were just able to repro it and it looks like it's a problem with the specific Zendesk user being referenced in the request. If I create an arbitrary user the ticket gets created, but if I use one of our existing customer accounts it breaks. I'm going to look into this more tomorrow but at least that's a lead

jeremyjaybaker commented 1 year ago

There's a workaround for now. Instead of calling

client.tickets.create!(
  external_id: "12345foo",
  requester: #<ZendeskAPI::User>,
  organization: #<ZendeskAPI::Organization>,
  ...

you can just use ids directly (probably what I should've done in the first place), ie

client.tickets.create!(
  external_id: "12345foo",
  requester_id: z_user_id,
  organization_id: z_org_id,
  ...

As far as we can tell, this exception is so flaky because it only occurs for certain users. The users that trigger the exception have a larger response payload than other users that include data like thumbnail references. This is basically what you said initially, I'm just confirming. The use of Faraday Multipart here makes sense due to a larger payload, it just needs to be supported? Either way, using ids seems to also cause a smaller response payload to be returned and the exception does not trigger.

Feel free to leave this open or close it as you see fit, but I'd definitely like to implement any updates on this as soon as you have them.

ecoologic commented 1 year ago

Hi @jeremyjaybaker:

The users that trigger the exception have a larger response payload than other users that include data like thumbnail references.

would you please help me reproduce a user that have a big payload?

PS: IT is interesting that using _id works, because from the API perspective, I don't think there's a difference in payloads between using ids or full objects, we should always send ids only. My best guess is that the large objects are only in local memory and the the multipart is potentially un-necessary.

jeremyjaybaker commented 1 year ago

Sure, I'll try to get you some VCR cassettes today for a failed request. Not sure what else I can do beyond that given that you'd need our keys to access some of our users with real requests.

ecoologic commented 1 year ago

Thank you @jeremyjaybaker - just verifying and understanding how the big payload is composed, eg: you mentioned a thumbnail, but I'm not sure where that fits in your objects.

Thanks

ecoologic commented 1 year ago

@jeremyjaybaker - did you have any chance to find these examples for us? Thank you.

jeremyjaybaker commented 1 year ago

Apologies for the delay. The specific user accounts that were causing the exception tended to have extra data attached to them like thumbnail images, yes. The smaller the user object, the more of a chance that it would not trigger the exception it seemed.

I was running into some issues with getting you the requests, but I'll give that another shot today.

jeremyjaybaker commented 1 year ago

@ecoologic Here's a repro template. It contains no sensitive data so feel free to share. Run it with ruby zendesk_spec.rb. It'll give you a passing test case where IDs are used and a failing test case where whole objects are used.

zendesk_bug_repro.zip

fbvilela commented 1 year ago

Thank you, @jeremyjaybaker for reporting the issue and providing the code to reproduce it. We have pushed a build of the rc2 version and this should be resolved now. We also encourage the use of _ids when you have existing records, since this gem is not exactly like ActiveRecord. Thanks again!