jina-ai / reader

Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/
https://jina.ai/reader
Apache License 2.0
7.02k stars 554 forks source link

Cannot post to s.jina.ai/search #110

Closed cly closed 2 months ago

cly commented 2 months ago

The docs (https://s.jina.ai/docs) don't seem explain where one can pass in the "search query" in a POST call.

I tried encoding it in the "url" parameter but it didn't work.

Furthermore, the openapi docs generated from the /docs page only has a /search with post interface. In the homepage, there is a GET to / that seems to also work. this is the typescript interface generated:

/**
 * This file was auto-generated by openapi-typescript.
 * Do not make direct changes to the file.
 */

export interface paths {
    "/search": {
        parameters: {
            query?: never;
            header?: never;
            path?: never;
            cookie?: never;
        };
        get?: never;
        put?: never;
        /** search */
        post: {
            parameters: {
                query?: {
                    count?: number;
                    url?: string;
                    html?: string;
                    respondWith?: string;
                    withGeneratedAlt?: boolean;
                    withLinksSummary?: boolean;
                    withImagesSummary?: boolean;
                    noCache?: boolean;
                    cacheTolerance?: number;
                    targetSelector?: string;
                    waitForSelector?: string;
                    removeSelector?: string;
                    keepImgDataUrl?: boolean;
                    withIframe?: boolean;
                    setCookies?: string;
                    proxyUrl?: string;
                    userAgent?: string;
                    timeout?: number;
                    ext?: string;
                    filetype?: string;
                    inbody?: string;
                    intitle?: string;
                    inpage?: string;
                    lang?: string;
                    loc?: string;
                    site?: string;
                };
                header?: {
                    /** @description Jina Token for authentication.
                     *
                     *     - Member of <JinaEmbeddingsAuthDTO>
                     *
                     *     - Authorization: Bearer {YOUR_JINA_TOKEN} */
                    Authorization?: string;
                    /** @description Specifies your preference for the response format.
                     *
                     *     Supported formats:
                     *     - text/event-stream
                     *     - application/json or text/json
                     *     - text/plain */
                    Accept?: string;
                    /** @description Sets internal cache tolerance in seconds if this header is specified with a integer. */
                    "X-Cache-Tolerance"?: string;
                    /** @description Ignores internal cache if this header is specified with a value.
                     *
                     *     Equivalent to X-Cache-Tolerance: 0 */
                    "X-No-Cache"?: string;
                    /** @description Specifies the (non-default) form factor of the crawled data you prefer.
                     *
                     *     Supported formats:
                     *     - markdown
                     *     - html
                     *     - text
                     *     - pageshot
                     *     - screenshot
                     *      */
                    "X-Respond-With"?: string;
                    /** @description Specifies a CSS selector to wait for the appearance of such an element before returning.
                     *
                     *     Example: `X-Wait-For-Selector: .content-block`
                     *      */
                    "X-Wait-For-Selector"?: string;
                    /** @description Specifies a CSS selector for return target instead of the full html.
                     *
                     *     Implies `X-Wait-For-Selector: (same selector)` */
                    "X-Target-Selector"?: string;
                    /** @description Specifies a CSS selector to remove elements from the full html.
                     *
                     *     Example `X-Remove-Selector: nav` */
                    "X-Remove-Selector"?: string;
                    /** @description Keep data-url as it instead of transforming them to object-url. (Only applicable when targeting markdown format)
                     *
                     *     Example `X-Keep-Img-Data-Url: true` */
                    "X-Keep-Img-Data-Url"?: string;
                    /** @description Specifies your custom proxy if you prefer to use one.
                     *
                     *     Supported protocols:
                     *     - http
                     *     - https
                     *     - socks4
                     *     - socks5
                     *
                     *     For authentication, https://user:pass@host:port */
                    "X-Proxy-Url"?: string;
                    /** @description Sets cookie(s) to the headless browser for your request.
                     *
                     *     Syntax is the same with standard Set-Cookie */
                    "X-Set-Cookie"?: string;
                    /** @description Enable automatic alt-text generating for images without an meaningful alt-text.
                     *
                     *     Note: Does not work when `X-Respond-With` is specified */
                    "X-With-Generated-Alt"?: string;
                    /** @description Enable dedicated summary section for images on the page. */
                    "X-With-Images-Summary"?: string;
                    /** @description Enable dedicated summary section for hyper links on the page. */
                    "X-With-links-Summary"?: string;
                    /** @description Override User-Agent. */
                    "X-User-Agent"?: string;
                    /** @description Specify timeout in seconds. Max 180. */
                    "X-Timeout"?: string;
                };
                path?: never;
                cookie?: never;
            };
            requestBody?: {
                content: {
                    "application/json": components["schemas"]["JinaEmbeddingsAuthDTO@0x09"] & components["schemas"]["CrawlerOptions@0x0a"] & components["schemas"]["BraveSearchExplicitOperatorsDto@0x0b"] & {
                        /**
                         * @description - Cast to: < **number** >
                         *
                         *     - Endpoint parameter of *search*
                         *
                         *     - Some validators will be applied on value(s): (v) => v >= 3 && v <= 10
                         *
                         *     - Defaults to: 5
                         * @default 5
                         */
                        count?: number;
                    };
                    "multipart/form-data": components["schemas"]["JinaEmbeddingsAuthDTO@0x09"] & components["schemas"]["CrawlerOptions@0x0a"] & components["schemas"]["BraveSearchExplicitOperatorsDto@0x0b"] & {
                        /**
                         * @description - Cast to: < **number** >
                         *
                         *     - Endpoint parameter of *search*
                         *
                         *     - Some validators will be applied on value(s): (v) => v >= 3 && v <= 10
                         *
                         *     - Defaults to: 5
                         * @default 5
                         */
                        count?: number;
                    };
                    "application/x-www-form-urlencoded": {
                        /**
                         * @description - Cast to: < **number** >
                         *
                         *     - Endpoint parameter of *search*
                         *
                         *     - Some validators will be applied on value(s): (v) => v >= 3 && v <= 10
                         *
                         *     - Defaults to: 5
                         * @default 5
                         */
                        count?: number;
                        /** @description - Cast to: < **string** >
                         *
                         *     - Member of <CrawlerOptions> */
                        url?: string;
                        /** @description - Cast to: < **string** >
                         *
                         *     - Member of <CrawlerOptions> */
                        html?: string;
                        /**
                         * @description - Cast to: < **string** >
                         *
                         *     - Member of <CrawlerOptions>
                         *
                         *     - Defaults to: 'default'
                         * @default default
                         */
                        respondWith?: string;
                        /** @description - Cast to: < **boolean** >
                         *
                         *     - Member of <CrawlerOptions> */
                        withGeneratedAlt?: boolean;
                        /** @description - Cast to: < **boolean** >
                         *
                         *     - Member of <CrawlerOptions> */
                        withLinksSummary?: boolean;
                        /** @description - Cast to: < **boolean** >
                         *
                         *     - Member of <CrawlerOptions> */
                        withImagesSummary?: boolean;
                        /** @description - Cast to: < **boolean** >
                         *
                         *     - Member of <CrawlerOptions> */
                        noCache?: boolean;
                        /** @description - Cast to: < **number** >
                         *
                         *     - Member of <CrawlerOptions> */
                        cacheTolerance?: number;
                        /** @description - Cast to: < **Array<string>** >
                         *
                         *     - Member of <CrawlerOptions> */
                        targetSelector?: string;
                        /** @description - Cast to: < **Array<string>** >
                         *
                         *     - Member of <CrawlerOptions> */
                        waitForSelector?: string;
                        /** @description - Cast to: < **Array<string>** >
                         *
                         *     - Member of <CrawlerOptions> */
                        removeSelector?: string;
                        /** @description - Cast to: < **boolean** >
                         *
                         *     - Member of <CrawlerOptions> */
                        keepImgDataUrl?: boolean;
                        /** @description - Cast to: < **boolean** >
                         *
                         *     - Member of <CrawlerOptions> */
                        withIframe?: boolean;
                        /** @description - Cast to: < **Array<string>** >
                         *
                         *     - Member of <CrawlerOptions> */
                        setCookies?: string;
                        /** @description - Cast to: < **string** >
                         *
                         *     - Member of <CrawlerOptions> */
                        proxyUrl?: string;
                        /** @description - Cast to: < **string** >
                         *
                         *     - Member of <CrawlerOptions> */
                        userAgent?: string;
                        /** @description - Cast to: < **number** >
                         *
                         *     - Member of <CrawlerOptions>
                         *
                         *     - Some validators will be applied on value(s): (v) => v > 0 && v <= 180 */
                        timeout?: number;
                        /** @description Returns web pages with a specific file extension. Example: to find the Honda GX120 Owner’s manual in PDF, type “Honda GX120 ownners manual ext:pdf”.
                         *
                         *     - Cast to: < **Array<string>** >
                         *
                         *     - Member of <BraveSearchExplicitOperatorsDto> */
                        ext?: string;
                        /** @description Returns web pages created in the specified file type. Example: to find a web page created in PDF format about the evaluation of age-related cognitive changes, type “evaluation of age cognitive changes filetype:pdf”.
                         *
                         *     - Cast to: < **Array<string>** >
                         *
                         *     - Member of <BraveSearchExplicitOperatorsDto> */
                        filetype?: string;
                        /** @description Returns web pages containing the specified term in the body of the page. Example: to find information about the Nvidia GeForce GTX 1080 Ti, making sure the page contains the keywords “founders edition” in the body, type “nvidia 1080 ti inbody:“founders edition””.
                         *
                         *     - Cast to: < **Array<string>** >
                         *
                         *     - Member of <BraveSearchExplicitOperatorsDto> */
                        inbody?: string;
                        /** @description Returns webpages containing the specified term in the title of the page. Example: to find pages about SEO conferences making sure the results contain 2023 in the title, type “seo conference intitle:2023”.
                         *
                         *     - Cast to: < **Array<string>** >
                         *
                         *     - Member of <BraveSearchExplicitOperatorsDto> */
                        intitle?: string;
                        /** @description Returns webpages containing the specified term either in the title or in the body of the page. Example: to find pages about the 2024 Oscars containing the keywords “best costume design” in the page, type “oscars 2024 inpage:“best costume design””.
                         *
                         *     - Cast to: < **Array<string>** >
                         *
                         *     - Member of <BraveSearchExplicitOperatorsDto> */
                        inpage?: string;
                        /** @description Returns web pages written in the specified language. The language code must be in the ISO 639-1 two-letter code format. Example: to find information on visas only in Spanish, type “visas lang:es”.
                         *
                         *     - Cast to: < **Array<string>** >
                         *
                         *     - Member of <BraveSearchExplicitOperatorsDto> */
                        lang?: string;
                        /** @description Returns web pages written in the specified language. The language code must be in the ISO 639-1 two-letter code format. Example: to find information on visas only in Spanish, type “visas lang:es”.
                         *
                         *     - Cast to: < **Array<string>** >
                         *
                         *     - Member of <BraveSearchExplicitOperatorsDto> */
                        loc?: string;
                        /** @description Returns web pages coming only from a specific web site. Example: to find information about Goggles only on Brave pages, type “goggles site:brave.com”.
                         *
                         *     - Cast to: < **Array<string>** >
                         *
                         *     - Member of <BraveSearchExplicitOperatorsDto> */
                        site?: string;
                    };
                };
            };
            responses: {
                /** @description OK */
                200: {
                    headers: {
                        [name: string]: unknown;
                    };
                    content: {
                        "application/json": {
                            /**
                             * @description [REQUIRED]
                             *
                             *     Envelope code.
                             *
                             *     Mirror of HTTP status code
                             *
                             *     - Cast to: < **number** >
                             *
                             *     - Member of <IntegrityEnvelope>
                             *
                             *     - Defaults to: 200
                             * @default 200
                             */
                            code: number;
                            /**
                             * @description [REQUIRED]
                             *
                             *     Envelope status.
                             *
                             *     In extension to HTTP status code
                             *
                             *     - Cast to: < **number** >
                             *
                             *     - Member of <IntegrityEnvelope>
                             *
                             *     - Defaults to: 20000
                             * @default 20000
                             */
                            status: number;
                            /** @description The result payload you expect
                             *
                             *     - Cast to: < **string** >
                             *
                             *     - Member of <IntegrityEnvelope> */
                            data?: string;
                            /** @description The metadata that the payload sometimes came with
                             *
                             *     - Cast to: < **object** >
                             *
                             *     - Member of <IntegrityEnvelope> */
                            meta?: unknown;
                        };
                        "text/event-stream": components["schemas"]["OutputServerEventStream@0x0c"];
                    };
                };
            };
        };
        delete?: never;
        options?: never;
        head?: never;
        patch?: never;
        trace?: never;
    };
}
export type webhooks = Record<string, never>;
export interface components {
    schemas: {
        /** JinaEmbeddingsAuthDTO */
        "JinaEmbeddingsAuthDTO@0x09": unknown;
        /** CrawlerOptions */
        "CrawlerOptions@0x0a": {
            /** @description - Cast to: &lt; **string** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt; */
            url?: string;
            /** @description - Cast to: &lt; **string** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt; */
            html?: string;
            /**
             * @description - Cast to: &lt; **string** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt;
             *
             *     - Defaults to: &#39;default&#39;
             * @default default
             */
            respondWith: string;
            /** @description - Cast to: &lt; **boolean** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt; */
            withGeneratedAlt?: boolean;
            /** @description - Cast to: &lt; **boolean** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt; */
            withLinksSummary?: boolean;
            /** @description - Cast to: &lt; **boolean** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt; */
            withImagesSummary?: boolean;
            /** @description - Cast to: &lt; **boolean** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt; */
            noCache?: boolean;
            /** @description - Cast to: &lt; **number** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt; */
            cacheTolerance?: number;
            /** @description - Cast to: &lt; **Array&lt;string&gt;** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt; */
            targetSelector?: string[];
            /** @description - Cast to: &lt; **Array&lt;string&gt;** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt; */
            waitForSelector?: string[];
            /** @description - Cast to: &lt; **Array&lt;string&gt;** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt; */
            removeSelector?: string[];
            /** @description - Cast to: &lt; **boolean** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt; */
            keepImgDataUrl?: boolean;
            /** @description - Cast to: &lt; **boolean** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt; */
            withIframe?: boolean;
            /** @description - Cast to: &lt; **Array&lt;string&gt;** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt; */
            setCookies?: string[];
            /** @description - Cast to: &lt; **string** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt; */
            proxyUrl?: string;
            /** @description - Cast to: &lt; **string** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt; */
            userAgent?: string;
            /** @description - Cast to: &lt; **number** &gt;
             *
             *     - Member of &lt;CrawlerOptions&gt;
             *
             *     - Some validators will be applied on value(s): (v) =&gt; v &gt; 0 &amp;&amp; v &lt;= 180 */
            timeout?: number;
        };
        /** BraveSearchExplicitOperatorsDto */
        "BraveSearchExplicitOperatorsDto@0x0b": {
            /** @description Returns web pages with a specific file extension. Example: to find the Honda GX120 Owner’s manual in PDF, type “Honda GX120 ownners manual ext:pdf”.
             *
             *     - Cast to: &lt; **Array&lt;string&gt;** &gt;
             *
             *     - Member of &lt;BraveSearchExplicitOperatorsDto&gt; */
            ext?: string[];
            /** @description Returns web pages created in the specified file type. Example: to find a web page created in PDF format about the evaluation of age-related cognitive changes, type “evaluation of age cognitive changes filetype:pdf”.
             *
             *     - Cast to: &lt; **Array&lt;string&gt;** &gt;
             *
             *     - Member of &lt;BraveSearchExplicitOperatorsDto&gt; */
            filetype?: string[];
            /** @description Returns web pages containing the specified term in the body of the page. Example: to find information about the Nvidia GeForce GTX 1080 Ti, making sure the page contains the keywords “founders edition” in the body, type “nvidia 1080 ti inbody:“founders edition””.
             *
             *     - Cast to: &lt; **Array&lt;string&gt;** &gt;
             *
             *     - Member of &lt;BraveSearchExplicitOperatorsDto&gt; */
            inbody?: string[];
            /** @description Returns webpages containing the specified term in the title of the page. Example: to find pages about SEO conferences making sure the results contain 2023 in the title, type “seo conference intitle:2023”.
             *
             *     - Cast to: &lt; **Array&lt;string&gt;** &gt;
             *
             *     - Member of &lt;BraveSearchExplicitOperatorsDto&gt; */
            intitle?: string[];
            /** @description Returns webpages containing the specified term either in the title or in the body of the page. Example: to find pages about the 2024 Oscars containing the keywords “best costume design” in the page, type “oscars 2024 inpage:“best costume design””.
             *
             *     - Cast to: &lt; **Array&lt;string&gt;** &gt;
             *
             *     - Member of &lt;BraveSearchExplicitOperatorsDto&gt; */
            inpage?: string[];
            /** @description Returns web pages written in the specified language. The language code must be in the ISO 639-1 two-letter code format. Example: to find information on visas only in Spanish, type “visas lang:es”.
             *
             *     - Cast to: &lt; **Array&lt;string&gt;** &gt;
             *
             *     - Member of &lt;BraveSearchExplicitOperatorsDto&gt; */
            lang?: string[];
            /** @description Returns web pages written in the specified language. The language code must be in the ISO 639-1 two-letter code format. Example: to find information on visas only in Spanish, type “visas lang:es”.
             *
             *     - Cast to: &lt; **Array&lt;string&gt;** &gt;
             *
             *     - Member of &lt;BraveSearchExplicitOperatorsDto&gt; */
            loc?: string[];
            /** @description Returns web pages coming only from a specific web site. Example: to find information about Goggles only on Brave pages, type “goggles site:brave.com”.
             *
             *     - Cast to: &lt; **Array&lt;string&gt;** &gt;
             *
             *     - Member of &lt;BraveSearchExplicitOperatorsDto&gt; */
            site?: string[];
        };
        /**
         * Format: binary
         * @description Binary data (stream)
         */
        "OutputServerEventStream@0x0c": string;
    };
    responses: never;
    parameters: never;
    requestBodies: never;
    headers: never;
    pathItems: never;
}
export type $defs = Record<string, never>;
export type operations = Record<string, never>;
mapleeit commented 2 months ago

Hi @cly

It doesn't support passing search query in POST right now. Can you elaborate on your use case (why must POST)?

mapleeit commented 2 months ago

This is now supported.

curl -v -X POST https://s.jina.ai \
  -H 'Content-Type: application/json' \
  -d '{"q": "how is the weather today?"}'