quackscience / duckdb-extension-httpclient

DuckDB HTTP GET/POST Client in a Community Extension
MIT License

Catching HTTP errors #3

Open StephanGeorg opened 3 hours ago

StephanGeorg commented 3 hours ago

Hi @lmangani. Hi @ahuarte47.

First of all, thank you for this extension. I know it is at a very early stage, but I've been dreaming of a feature like this for a long time because it enables and improves a lot of workflows and pipelines.

I'm not sure if you're open to feedback yet, but catching errors like this would be great:

Invalid Error: HTTP POST error: 404 - Not Found

Cheers.

lmangani commented 2 hours ago

Hey @StephanGeorg thanks for giving this an early try, we really appreciate it!

Could you document the change proposal, i.e. an example query, the current output, and the desired output (or error, etc.)?

StephanGeorg commented 57 minutes ago

I'm using a macro to make the requests:

CREATE OR REPLACE MACRO call_endpoint(a) AS (
  SELECT
    http_post(
      'https://domain.lol/api/endpoint',
      headers => MAP {
        'accept': 'application/json'
      },
      params => a
    ) AS data
);

and then apply it to every row of a table:

SELECT call_endpoint(MAP { 
  'countryCode': country,
  'street': street,
  'houseNumber': house_number,
  'postCode': postcode,
  'locality': city
}) AS output_data
FROM '/path/to/input.csv';

For some rows, the endpoint returns HTTP status code 404 because the input does not match any result. Instead of throwing

Invalid Error: HTTP POST error: 404 - Not Found

for all results, the function should catch the error for each unsuccessful request and still return the results of the successful ones. Alternatively, you could wrap the result in an object that contains the HTTP status code and the payload.

ahuarte47 commented 34 minutes ago

Hello everyone, I love your idea of changing the return type of the functions (from LogicalType::VARCHAR to LogicalType::JSON()). The result could then contain all the information returned by the request, including the payload.

ahuarte47 commented 16 minutes ago

Maybe we could return some attributes from the Response object, e.g. { "body": "...", "status": 200, "reason": "..." }, instead of throwing an error if the status code is not 200. What do you think?
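
To illustrate, here is a minimal sketch of how a result in that shape could be consumed on the SQL side. It does not call the extension at all: the responses CTE hard-codes two made-up results in the proposed { body, status, reason } shape, and the names row_id, output_data, and payload are hypothetical, not part of the extension's actual output. It only assumes DuckDB's built-in JSON function json_extract_string.

```sql
-- Hypothetical stand-in for per-row http_post results in the proposed shape.
-- No HTTP request is made; both rows contain invented example data.
WITH responses(row_id, output_data) AS (
  VALUES
    (1, '{"body": "{\"match\": true}", "status": 200, "reason": "OK"}'),
    (2, '{"body": "", "status": 404, "reason": "Not Found"}')
)
SELECT
  row_id,
  CAST(json_extract_string(output_data, '$.status') AS INTEGER) AS status,
  json_extract_string(output_data, '$.reason') AS reason,
  -- Keep the payload only for successful requests; failed rows yield NULL
  -- instead of aborting the whole query.
  CASE
    WHEN CAST(json_extract_string(output_data, '$.status') AS INTEGER) = 200
    THEN json_extract_string(output_data, '$.body')
  END AS payload
FROM responses
ORDER BY row_id;
```

With a result like this, a 404 on one row would show up as a status value to filter on, and the rows that succeeded would still be returned.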