airbytehq / airbyte


Source Okta endlessly loops when syncing `logs` stream #4447

Closed by sherifnada 3 years ago

sherifnada commented 3 years ago

Current Behavior

The Okta source loops endlessly when syncing the `logs` stream.

Expected Behavior

The connector does not loop endlessly when running

Logs

Mentioned in Slack: https://airbytehq.slack.com/archives/C01MFR03D5W/p1625077614268000


Steps to Reproduce

akoshterek commented 3 years ago

logs-361-0.txt

curl -v -X GET -H "Accept: application/json" -H "Content-Type: application/json" -H "Authorization: SSWS $TOKEN" "https://$DOMAIN/api/v1/logs?since=2021-06-30T18:00:00.000Z"
Note: Unnecessary use of -X or --request, GET is already inferred.
*   Trying 99.80.88.151...
* TCP_NODELAY set
* Connected to everon-space.okta.com (99.80.88.151) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=Okta, Inc.; CN=*.okta.com
*  start date: Apr  1 00:00:00 2021 GMT
*  expire date: May  2 23:59:59 2022 GMT
*  subjectAltName: host "everon-space.okta.com" matched cert's "*.okta.com"
*  issuer: C=US; O=DigiCert Inc; CN=DigiCert TLS RSA SHA256 2020 CA1
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7fc5b180d600)
> GET /api/v1/logs?since=2021-06-30T18:00:00.000Z HTTP/2
> Host: everon-space.okta.com
> User-Agent: curl/7.64.1
> Accept: application/json
> Content-Type: application/json
> Authorization: SSWS $TOKEN
> 
* Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
< HTTP/2 200 
< date: Wed, 30 Jun 2021 18:16:58 GMT
< content-type: application/json
< content-length: 2
< server: nginx
< public-key-pins-report-only: pin-sha256="r5EfzZxQVvQpKo3AgYRaT7X2bDO/kj3ACwmxfdT2zt8="; pin-sha256="MaqlcUgk2mvY/RFSGeSwBRkI+rZ6/dxe/DuQfBT/vnQ="; pin-sha256="72G5IEvDEWn+EThf3qjR7/bQSWaS2ZSLqolhnO6iyJI="; pin-sha256="rrV6CLCCvqnk89gWibYT0JO6fNQ8cCit7GGoiVTjCOg="; max-age=60; report-uri="https://okta.report-uri.com/r/default/hpkp/reportOnly"
< x-okta-request-id: YNy1GpUqhQIZNn@xPyDf5AAAB-4
< x-xss-protection: 0
< p3p: CP="HONK"
< x-rate-limit-limit: 120
< x-rate-limit-remaining: 119
< x-rate-limit-reset: 1625077078
< cache-control: no-cache, no-store
< pragma: no-cache
< expires: 0
< content-security-policy: default-src 'self' everon-space.okta.com *.oktacdn.com; connect-src 'self' everon-space.okta.com everon-space-admin.okta.com *.oktacdn.com *.mixpanel.com *.mapbox.com app.pendo.io data.pendo.io pendo-static-5634101834153984.storage.googleapis.com everon-space.kerberos.okta.com https://oinmanager.okta.com data:; script-src 'unsafe-inline' 'unsafe-eval' 'self' everon-space.okta.com *.oktacdn.com; style-src 'unsafe-inline' 'self' everon-space.okta.com *.oktacdn.com app.pendo.io cdn.pendo.io pendo-static-5634101834153984.storage.googleapis.com; frame-src 'self' everon-space.okta.com everon-space-admin.okta.com login.okta.com; img-src 'self' everon-space.okta.com *.oktacdn.com *.tiles.mapbox.com *.mapbox.com app.pendo.io data.pendo.io cdn.pendo.io pendo-static-5634101834153984.storage.googleapis.com data: blob:; font-src 'self' everon-space.okta.com data: *.oktacdn.com fonts.gstatic.com; report-uri https://okta.report-uri.com/r/d/csp/enforce; report-to csp-enforce
< report-to: {"group":"csp-enforce","max_age":31536000,"endpoints":[{"url":"https://okta.report-uri.com/r/d/csp/enforce"}],"include_subdomains":true}
< expect-ct: report-uri="https://oktaexpectct.report-uri.com/r/t/ct/reportOnly", max-age=0
< link: <https://$DOMAIN/api/v1/logs?since=2021-06-30T18%3A00%3A00.000Z>; rel="self"
< link: <https://$DOMAIN/api/v1/logs?since=2021-06-30T18%3A00%3A00.000Z>; rel="next"
< x-content-type-options: nosniff
< strict-transport-security: max-age=315360000; includeSubDomains
< set-cookie: sid=""; Expires=Thu, 01-Jan-1970 00:00:10 GMT; Path=/
< set-cookie: JSESSIONID=79348F8FC961548632FCD4AE63D0D4FE; Path=/; Secure; HttpOnly
< 
* Connection #0 to host $DOMAIN left intact
[]* Closing connection 0

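The last page of the `logs` stream still carries a `next` link, and that link is identical to the `self` link. In the trace above the response body is an empty array (`[]`, `content-length: 2`), yet `rel="next"` points at the same URL that was just requested. The Okta source does not seem to recognize this situation, so it keeps following `next` links endlessly.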
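One possible guard, sketched in Python under stated assumptions: treat pagination as finished when there is no `next` link, when the page returned no records, or when `next` equals `self`. The helper names `parse_links` and `next_page_url` are hypothetical and are not taken from the connector or from the fix in #4456; the sketch only illustrates the stop condition described above.

```python
import re
from typing import Dict, Iterable, Optional

# Matches one `<url>; rel="name"` entry of an HTTP Link header.
_LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="?([^",;]+)"?')


def parse_links(link_headers: Iterable[str]) -> Dict[str, str]:
    """Map each rel value (e.g. "self", "next") to its URL."""
    links: Dict[str, str] = {}
    for header in link_headers:
        for url, rel in _LINK_RE.findall(header):
            links[rel] = url
    return links


def next_page_url(link_headers: Iterable[str], record_count: int) -> Optional[str]:
    """Return the URL of the next page, or None when pagination should stop.

    Hypothetical helper: stops when there is no next link, when the page is
    empty, or when next points back at the page that was just fetched.
    """
    links = parse_links(link_headers)
    next_url = links.get("next")
    if not next_url or record_count == 0 or next_url == links.get("self"):
        return None
    return next_url
```

With the two `link` headers from the trace above and zero records in the body, `next_page_url` returns `None`, so a sync loop built on it would terminate instead of requesting the same URL again. Whether an empty page alone should end the sync is a design choice; Okta's System Log can also be polled continuously, in which case only the `next == self` check may be wanted.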

sherifnada commented 3 years ago

Fixed in #4456.