immersive-translate / immersive-translate

Immersive bilingual web page translation extension, supporting input-box translation, mouse-hover translation, and translation of PDF, EPUB, subtitle, and TXT files - Immersive Dual Web Page Translation Extension
https://immersivetranslate.com

[Bug]: "Maximum text length per request" and "Maximum paragraphs per request" seem to have no effect #1531

Closed jiangying000 closed 6 months ago

jiangying000 commented 7 months ago

Version

1.2.4

Platform

Windows

Browsers

Chrome

Extension Type

Browser Extension

Describe the bug

I have set:

Maximum text length per request: 2000; Maximum paragraphs per request: 9

But sometimes a request carries only a tiny amount of content, which wastes a large number of tokens.

For example, when translating https://openai.com/pricing:

Pricing

Show prices per 1K tokens

Each of these two extremely short texts was sent as its own separate request, so roughly 200 tokens of the request were the instruction prompt and fewer than 10 tokens were the actual text to translate.
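
To put numbers on the waste, a rough back-of-the-envelope sketch (the 200-token prompt size and the 10-token payload are my estimates from the observed requests, not measured values):

    // Back-of-the-envelope overhead estimate for a short text sent alone.
    // Illustrative numbers only, not measured from the extension.
    const promptTokens = 200;  // fixed instruction prompt per request (estimate)
    const payloadTokens = 10;  // the actual text, e.g. "Show prices per 1K tokens"

    const overhead = promptTokens / (promptTokens + payloadTokens);
    console.log(`~${Math.round(overhead * 100)}% of the request is prompt overhead`); // ~95%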

URL

https://openai.com/pricing

To Reproduce

Translate https://openai.com/pricing

Additional context

No response

cssmagic commented 7 months ago

I'm a user too. Following this.

Could it be that "carpooling" (batching) failed, so each sentence had to ride alone?

cssmagic commented 7 months ago

I looked at the request contents the extension sent on the openai.com/pricing page and recorded them below (*** is my custom delimiter):


1.

    Show prices per 1K tokens

2.

    Language models
    ***
    Multiple models, each with different capabilities and price points. Prices can be viewed in units of either per 1M or 1K tokens. You can think of tokens as pieces of words, where 1,000 tokens is about 750 words. This paragraph is 35 tokens. 
    ***
    GPT-4 Turbo
    ***
    With 128k context, fresher knowledge and the broadest set of capabilities, GPT-4 Turbo is more powerful than GPT-4 and offered at a lower price. 
    ***
    Learn about GPT-4 Turbo

3.

    Model
    ***
    Input
    ***
    Output
    ***
    gpt-4-0125-preview
    ***
    $10.00 / 1M tokens

4.

    $30.00 / 1M tokens
    ***
    gpt-4-1106-preview
    ***
    $10.00 / 1M tokens
    ***
    $30.00 / 1M tokens
    ***
    gpt-4-1106-vision-preview

5.

    $10.00 / 1M tokens
    ***
    $30.00 / 1M tokens
    ***
    Vision pricing calculator
    ***
    Low resolution
    ***
    GPT-4

6.

    With broad general knowledge and domain expertise, GPT-4 can follow complex instructions in natural language and solve difficult problems with accuracy.
    ***
    Learn about GPT-4
    ***
    Model
    ***
    Input
    ***
    Output

7.

    $30.00 / 1M tokens
    ***
    $60.00 / 1M tokens
    ***
    gpt-4-32k
    ***
    $60.00 / 1M tokens
    ***
    $120.00 / 1M tokens

8.

    GPT-3.5 Turbo
    ***
    GPT-3.5 Turbo models are capable and cost-effective.
    ***
    Pricing
    ***
    Simple and flexible. Only pay for what you use.
    ***
    Contact sales

9.

    <b0></b0> is the flagship model of this family, supports a 16K context window and is optimized for dialog.

10.

    <b0></b0> is an Instruct model and only supports a 4K context window.

11.

    Learn about GPT-3.5 Turbo

12.

    $0.50 / 1M tokens

13.

    $1.50 / 1M tokens
    ***
    gpt-3.5-turbo-0125
    ***
    $1.50 / 1M tokens
    ***
    $2.00 / 1M tokens
    ***
    gpt-3.5-turbo-instruct

14.

    Assistants API

15.

    Assistants API and tools (retrieval, code interpreter) make it easy for developers to build AI assistants within their own applications. Each assistant incurs its own retrieval file storage fee based on the files passed to that assistant. The retrieval tool chunks and indexes your files content in our vector database.

cssmagic commented 7 months ago

A quick analysis:

cssmagic commented 7 months ago

@jiangying000 Overall, nothing looks particularly abnormal. If you can, it would help if you also recorded your observations in detail.

jiangying000 commented 7 months ago

(Quoting the request log recorded by @cssmagic above.)

Requests 1, 9, 10, 11, 12, 14, and 15 here are each just one short sentence, yet each one occupied its own request.

My understanding is that if I set:

Maximum text length per request: 2000; Maximum paragraphs per request: 9

then the roughly 40 paragraphs behind those 15 requests could have been handled in about 5 requests, not the 15 seen here. That is clearly too many; the short sentences were not merged.
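
For illustration, a minimal sketch of the greedy packing I would expect under those two limits (the function and its inputs are hypothetical, not the extension's actual code); with roughly 40 short paragraphs and a cap of 9 per request it produces 5 batches:

    // Greedy packing: fill each request up to maxChars / maxParagraphs,
    // then start a new batch. ~40 short paragraphs -> ceil(40 / 9) = 5 requests.
    function packParagraphs(paragraphs: string[], maxChars = 2000, maxParagraphs = 9): string[][] {
      const batches: string[][] = [];
      let current: string[] = [];
      let currentChars = 0;

      for (const p of paragraphs) {
        const fits = current.length < maxParagraphs && currentChars + p.length <= maxChars;
        if (!fits && current.length > 0) {
          batches.push(current);
          current = [];
          currentChars = 0;
        }
        current.push(p);
        currentChars += p.length;
      }
      if (current.length > 0) batches.push(current);
      return batches;
    }

    // With ~40 short paragraphs this yields 5 batches instead of 15 requests.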

cssmagic commented 7 months ago

Let's put ourselves in the scenario and imagine it:

Carpooling has a time window. If nobody else joins the carpool, the car still has to leave when the time is up. If English text is already visible on the page, even just one line, the API has to be called to translate it; the page can't sit blank waiting for more passengers.

theowenyoung commented 7 months ago

Grouping can be affected by several factors. One is what @cssmagic mentioned: every xx seconds, if there are sentences in the queue, a request is always sent.

The other is that grouping requires the source sentences to be in the same language before they can share a group, but when the extension detects languages locally it sometimes cannot determine a sentence's language (for example when the sentence is very short), and such sentences end up as a group of their own.
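
A minimal sketch of how those two rules could look in code (the class name, the interval, and the language representation are illustrative assumptions, not the extension's actual implementation):

    // Hypothetical sketch: queued sentences are flushed on a fixed timer, and
    // only sentences with the same detected language share a batch; sentences
    // whose language could not be detected go out as a group of their own.
    type Sentence = { text: string; lang: string | null }; // null = detection failed (e.g. very short text)

    class TranslateQueue {
      private queue: Sentence[] = [];

      constructor(
        private send: (batch: Sentence[]) => void,
        flushIntervalMs = 3000, // "every xx seconds" -- assumed value
      ) {
        setInterval(() => this.flush(), flushIntervalMs);
      }

      enqueue(s: Sentence) {
        this.queue.push(s);
      }

      private flush() {
        if (this.queue.length === 0) return;
        const groups = new Map<string, Sentence[]>();
        for (const s of this.queue) {
          const key = s.lang ?? `unknown:${s.text}`; // undetected language -> its own group
          if (!groups.has(key)) groups.set(key, []);
          groups.get(key)!.push(s);
        }
        this.queue = [];
        // Even a single short sentence is sent once the timer fires.
        for (const batch of groups.values()) this.send(batch);
      }
    }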

theowenyoung commented 7 months ago

  • The page header seems to be skipped entirely. Possibly an optimization strategy of the extension.

Yes, by default only the main content is translated.

  • The 1st request is Show prices per 1K tokens, but that is not the topmost element on the page. I haven't figured this one out. (Asking @theowenyoung.)

Yes. For OpenAI, since it is relatively slow, the extension translates the main body first.

  • Apart from the page header, the topmost text on the page (Pricing / Simple and flexible. Only pay for what you use. / Contact sales) is not sent until the 8th request. I'm not sure whether the extension schedules requests by content priority. (Asking @theowenyoung.)

Yes. For OpenAI, since it is relatively slow, the extension translates the main body first.

  • Requests 9, 10, 11, 14, and 15 each contain only one paragraph. That may be because those elements entered the viewport one after another as the page scrolled, so separate requests would be normal.

Yes, this is the extension's throttling strategy: as long as there are sentences in the queue, a request is sent every xxx seconds.
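
Putting the two answers together, a rough sketch of why an element that scrolls into view late ends up in a request of its own (IntersectionObserver is a standard DOM API; everything else here is an illustrative assumption, not the extension's code):

    // Sentences are only queued once their element enters the viewport, so a
    // heading that scrolls in just after a flush has nothing to share a batch
    // with and goes out alone on the next tick.
    const queue: string[] = [];

    // Throttled flush: every few seconds, send whatever is queued, even one line.
    setInterval(() => {
      if (queue.length === 0) return;
      const batch = queue.splice(0, queue.length);
      console.log(`sending ${batch.length} sentence(s)`); // stand-in for the translation request
    }, 3000);

    const observer = new IntersectionObserver((entries) => {
      for (const entry of entries) {
        if (!entry.isIntersecting) continue;
        const text = (entry.target as HTMLElement).innerText.trim();
        if (text) queue.push(text);
        observer.unobserve(entry.target); // translate each element only once
      }
    });

    document.querySelectorAll("p, h1, h2, h3, li").forEach((el) => observer.observe(el));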

jiangying000 commented 6 months ago

I think I roughly understand now. The causes are:

  1. Every xx seconds, if there are sentences in the queue, a request is always sent.
  2. When detecting language locally, the extension sometimes cannot determine a sentence's language (for example when the sentence is very short), and such sentences are grouped on their own.

(Regarding point 2: since my prompt has no sourceLanguage setting and is simply "please translate into Chinese", the language-detection step could perhaps be optimized so that it doesn't affect how sentences are batched.)
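
To illustrate the optimization hinted at in that last parenthesis, a hypothetical grouping step (the promptUsesSourceLanguage flag and the function are my own illustration, not the extension's API): when the prompt never references the source language, sentences with an undetected language would not need to be split off into their own group.

    // Hypothetical: only split by detected language when the prompt actually
    // interpolates a per-language sourceLanguage value.
    type Sentence = { text: string; lang: string | null };

    function groupForRequest(sentences: Sentence[], promptUsesSourceLanguage: boolean): Sentence[][] {
      if (!promptUsesSourceLanguage) {
        // Prompt is e.g. just "please translate into Chinese", so failed
        // language detection need not force short sentences into their own group.
        return [sentences];
      }
      const byLang = new Map<string, Sentence[]>();
      for (const s of sentences) {
        const key = s.lang ?? `unknown:${s.text}`; // undetected -> its own group
        if (!byLang.has(key)) byLang.set(key, []);
        byLang.get(key)!.push(s);
      }
      return [...byLang.values()];
    }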

jiangying000 commented 6 months ago

My issue is resolved. Thanks to both of you.