estuary / flow

🌊 Continuously synchronize the systems where your data lives, to the systems where you _want_ it to live, with Estuary Flow. 🌊
https://estuary.dev

fix: `flowctl catalog` does need pagination after all #1626

Closed: jshearer closed this 2 weeks ago

jshearer commented 2 weeks ago

Description: I hit a weird issue where `flowctl catalog pull-specs --captures=true --collections=true` wasn't returning any captures, but `flowctl catalog pull-specs --captures=true --collections=false` was. I then realized that the first command was returning 1001 specs, which is suspiciously close to PostgREST's max row limit. It turns out that even though we're paginating the inputs, we still need to paginate the queries themselves.
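To illustrate the failure mode, here is a minimal sketch of query-level pagination. It is not the code changed in this PR: `fetch_all` and `capped_server` are hypothetical names, and `capped_server` is only a stand-in for a backend that silently truncates each response at a maximum row count. The loop keeps requesting pages by offset until one comes back shorter than the requested page size.

```rust
/// Hypothetical helper (not flowctl's actual code): calls
/// `fetch_page(offset, limit)` repeatedly and collects every row.
/// It stops when a page is shorter than `page_size`, so `page_size`
/// must not exceed the server's own row cap.
fn fetch_all<T, E>(
    page_size: usize,
    mut fetch_page: impl FnMut(usize, usize) -> Result<Vec<T>, E>,
) -> Result<Vec<T>, E> {
    assert!(page_size > 0, "page_size must be non-zero");
    let mut rows = Vec::new();
    loop {
        // The offset is simply the number of rows collected so far.
        let page = fetch_page(rows.len(), page_size)?;
        let got = page.len();
        rows.extend(page);
        if got < page_size {
            // A short (or empty) page means there is nothing left to fetch.
            return Ok(rows);
        }
    }
}

/// Stand-in for a server holding `total` rows that silently caps
/// every response at `max_rows`, regardless of the requested limit.
fn capped_server(total: usize, max_rows: usize, offset: usize, limit: usize) -> Vec<usize> {
    let end = total.min(offset + limit.min(max_rows));
    (offset.min(end)..end).collect()
}
```

Against a simulated server with 1732 rows and a 1000-row cap, a single unbounded request returns only the capped prefix, while `fetch_all(500, ...)` returns all 1732 rows. The short-page stop condition costs one extra empty request when the total is an exact multiple of the page size, which keeps the loop independent of any row count reported by the server.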

Existing flowctl

$ flowctl catalog pull-specs --captures=true --collections=true --prefix <...>
Wrote 1001 specifications under file:///Users/js/Documents/estuary/automatic-backfills/test/flow.yaml.

Updated flowctl

$ /usr/local/bin/flowctl catalog pull-specs --captures=true --collections=true --prefix <...>
Wrote 1732 specifications under file:///Users/js/Documents/estuary/automatic-backfills/test/flow.yaml.


jshearer commented 2 weeks ago

Updated to remove incorrect comment:

// No need for pagination because we're paginating the inputs.