getindata / flink-http-connector

Http Connector for Apache Flink. Provides sources and sinks for DataStream, Table and SQL APIs.
Apache License 2.0

Could you provide an example of Http Scan Source (not lookup source)? #41

Open ChenShuai1981 opened 1 year ago

ChenShuai1981 commented 1 year ago

Could you provide an example of an Http periodic scan source (not a lookup source)? Does it support renewing the access token after it expires?
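For illustration only, the token renewal asked about here could be sketched as a small cache that requests a new token once the old one has expired. This is not part of the connector: `TokenCache`, `requestToken` and the fixed five-minute lifetime are hypothetical names and assumptions, and the auth call is replaced by an in-memory stub.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

/** Sketch of an access-token cache that renews the token once it has expired. */
class TokenCache {
    private final Supplier<String> requestToken; // hypothetical call to the auth endpoint
    private final Duration lifetime;             // assumed fixed token lifetime
    private String token;
    private Instant expiresAt = Instant.MIN;     // forces a request on first use

    TokenCache(Supplier<String> requestToken, Duration lifetime) {
        this.requestToken = requestToken;
        this.lifetime = lifetime;
    }

    /** Returns the cached token, requesting a new one if it has expired at time {@code now}. */
    synchronized String get(Instant now) {
        if (!now.isBefore(expiresAt)) {
            token = requestToken.get();
            expiresAt = now.plus(lifetime);
        }
        return token;
    }

    public static void main(String[] args) {
        int[] issued = {0};
        // Stub auth endpoint that hands out token-1, token-2, ...
        TokenCache cache = new TokenCache(() -> "token-" + (++issued[0]), Duration.ofMinutes(5));
        Instant start = Instant.parse("2024-01-01T00:00:00Z");
        System.out.println(cache.get(start));                  // token-1
        System.out.println(cache.get(start.plusSeconds(60)));  // token-1, still valid
        System.out.println(cache.get(start.plusSeconds(301))); // token-2, renewed
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic; a real client would more likely read the expiry from the auth server's response.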

kristoffSC commented 1 year ago

Hi @ChenShuai1981, a scan source is currently not supported by this connector, hence no example is available :) For now we have only the lookup source. This would be a great feature, though. Would you like to contribute? :)

The proper Flink interfaces would have to be implemented.

It would be a nice feature, although it would be very "client specific".

davidradl commented 3 months ago

@ChenShuai1981 the lookup support that exists currently ends up issuing GET, PUT or POST requests for single records. For a scan to work, I suspect we would need to issue searches and page through the results. This could really hurt the performance of a scan, as we could end up effectively doing full table scans unless we could do predicate pushdown.
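To make the paging concern concrete, here is a minimal sketch of what "paging the results" could look like. It is not connector code: `PagedScan`, `scanAll` and `fetchPage` are hypothetical names, the remote endpoint is replaced by an in-memory list, and the sketch assumes the endpoint signals the end of the data with an empty page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

/** Sketch of a paged "scan": keep requesting pages until the endpoint returns an empty one. */
class PagedScan {

    /**
     * fetchPage is a hypothetical stand-in for one HTTP search request;
     * it receives a zero-based page number and returns that page's records.
     * maxPages is a safety limit in case the endpoint never returns an empty page.
     */
    static <T> List<T> scanAll(IntFunction<List<T>> fetchPage, int maxPages) {
        List<T> all = new ArrayList<>();
        for (int page = 0; page < maxPages; page++) {
            List<T> records = fetchPage.apply(page);
            if (records.isEmpty()) {
                break; // an empty page is taken to mean "no more data"
            }
            all.addAll(records);
        }
        return all;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a remote table of 5 rows, served 2 rows per page.
        List<String> remote = List.of("a", "b", "c", "d", "e");
        int pageSize = 2;
        IntFunction<List<String>> fetchPage = page -> {
            int from = Math.min(page * pageSize, remote.size());
            int to = Math.min(from + pageSize, remote.size());
            return remote.subList(from, to);
        };
        System.out.println(scanAll(fetchPage, 100)); // prints [a, b, c, d, e]
    }
}
```

Without a filter pushed down to the endpoint, every scan walks all pages, which is the full-table-scan cost mentioned above; here that is four requests for five rows.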

ChenShuai1981 commented 3 months ago

Yes, you are right. Because the content provider updates its information irregularly, we have to periodically send GET/POST requests to fetch it and sync it into our database. Typical scenarios are web crawlers and system integration. Generally speaking, if the result is too large, the provider returns a streaming response.
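A rough sketch of that periodic fetch-and-sync loop, under stated assumptions: `PeriodicPoller`, `pollOnce` and `fetch` are hypothetical names, the provider is faked with in-memory snapshots, records are plain strings, and "sync" is reduced to printing the records that were not seen in earlier rounds.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Sketch of periodic polling: each round fetches a snapshot and forwards only unseen records. */
class PeriodicPoller {
    private final Supplier<List<String>> fetch;       // hypothetical stand-in for one GET/POST call
    private final Set<String> seen = new HashSet<>(); // grows without bound in this sketch

    PeriodicPoller(Supplier<List<String>> fetch) {
        this.fetch = fetch;
    }

    /** Runs one polling round and returns the records not seen in earlier rounds. */
    List<String> pollOnce() {
        List<String> fresh = new ArrayList<>();
        for (String record : fetch.get()) {
            if (seen.add(record)) { // add() returns false for records already seen
                fresh.add(record);
            }
        }
        return fresh;
    }

    public static void main(String[] args) throws InterruptedException {
        // Fake provider whose content changes irregularly between polls.
        List<List<String>> snapshots = List.of(List.of("a"), List.of("a", "b"), List.of("a", "b"));
        int[] round = {0};
        PeriodicPoller poller = new PeriodicPoller(
            () -> snapshots.get(Math.min(round[0]++, snapshots.size() - 1)));

        // Poll every 100 ms on a single scheduler thread, then stop after a few rounds.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
            () -> System.out.println("new records: " + poller.pollOnce()),
            0, 100, TimeUnit.MILLISECONDS);
        Thread.sleep(350);
        scheduler.shutdownNow();
    }
}
```

The in-memory `seen` set is the simplest possible deduplication; a streaming response or a very large result would need incremental reading rather than a full list per round.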