redpanda-data / connect

Fancy stream processing made operationally mundane
https://docs.redpanda.com/redpanda-connect/about/

Interpolating strings containing URL encoded double quotes #2754

Closed: brknstrngz closed this issue 13 hours ago

brknstrngz commented 1 month ago

Hello,

I am consuming a paginated API whose JSON response has a cursor property containing URL-encoded double quotes (%22).

Logging the property string as-is (root.next) using the log processor displays it correctly: /endpoint?token=mytoken&q=%22first+second%22&sort=date.

Unfortunately, interpolating the same property as '${! json("next") }' yields /endpoint?token=mytoken&q=%!n(MISSING)irst+second%!&(MISSING)sort=date.

Am I missing the obvious? Thanks!

mihaitodor commented 1 month ago

Hey @brknstrngz 👋 Thanks for reporting this issue! I also bumped into it recently but I couldn't figure out how to reproduce it. I created a small config which should help narrow it down:

input:
  generate:
    count: 1
    mapping: |
      root.foo = "/endpoint?token=mytoken&q=%22first+second%22&sort=date"

  processors:
    - log:
        message: ${! json("foo") }

I think the issue is in the log processor, which seems to be treating the log message as a format string. I'll do some digging to see if I can figure it out.
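For context, `%!verb(MISSING)` is the marker Go's `fmt` package emits when a format verb has no matching argument, which fits the format-string theory. Below is a minimal Go sketch of that failure mode. It is not taken from the Redpanda Connect source: `formatUnsafe` and `formatSafe` are hypothetical helpers written only to illustrate the difference between using a message as the format string and passing it as an argument.

```go
package main

import "fmt"

// formatUnsafe is a hypothetical stand-in for a logger that uses the
// message itself as the format string, with no arguments to fill its verbs.
func formatUnsafe(msg string) string {
	var noArgs []any // deliberately empty: nothing to substitute
	return fmt.Sprintf(msg, noArgs...)
}

// formatSafe passes the message as an argument to a constant "%s" format,
// so any percent signs in the message are copied through verbatim.
func formatSafe(msg string) string {
	return fmt.Sprintf("%s", msg)
}

func main() {
	cursor := "/endpoint?token=mytoken&q=%22first+second%22&sort=date"
	fmt.Println("unsafe:", formatUnsafe(cursor))
	fmt.Println("safe:  ", formatSafe(cursor))
}
```

In the unsafe variant, each `%22` and the character after it are parsed as a width and a verb, so the result contains `(MISSING)` markers. The safe variant returns the cursor unchanged.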

brknstrngz commented 1 month ago

Thank you @mihaitodor, I can confirm that it is indeed the log processor that misbehaves. Adding that property to a cache and retrieving it leaves it intact, so this is not an interpolation problem.