espressif / esp-box

The ESP-BOX is a new generation AIoT development platform released by Espressif Systems.
Apache License 2.0

gpt-demo: getaddrinfo() returns 202, addrinfo=0x0, caused by incorrect DNS resolution (AEGHB-807) #164

Open lilian-lilifox opened 2 months ago

lilian-lilifox commented 2 months ago

How often does this bug occur?

always

Expected behavior

Screenshot_20240910_172339

I set base_url to https://api.chatanywhere.tech, which I tested successfully on lobechat.com with the API key.

Actual behavior (suspected bug)

The device connects to Wi-Fi successfully, but when I ask a question it falls back to invalid_request_error, and I found that the DNS lookup is incorrect. As the log shows (dns_enqueue: "api.chatanywhere.techaudio": use DNS entry 0, dns_recv: "api.chatanywhere.techaudio": error in flags), an unexpected "audio" has been appended to the hostname.
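
The hostname in the log is consistent with the base URL and a relative path such as audio/speech being joined with no separator, which a later comment in this thread supports. The following is a minimal sketch of that failure mode, assuming the client builds its request URL by plain string concatenation (the component's actual code is not shown in this thread). Both `url_host` and `naive_concat` are hypothetical helpers written for this illustration, not part of esp-box or the OpenAI component.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, for illustration only: copies the host part of `url`
 * (the text between "://" and the next '/', ':', '?' or '#') into `host`.
 * Returns 0 on success, -1 if there is no host or it does not fit. */
static int url_host(const char *url, char *host, size_t host_len)
{
    const char *start = strstr(url, "://");
    start = start ? start + 3 : url;

    size_t n = strcspn(start, "/:?#");
    if (n == 0 || n >= host_len) {
        return -1;
    }
    memcpy(host, start, n);
    host[n] = '\0';
    return 0;
}

/* Assumed behavior: base URL and path are glued together with no separator.
 * Returns 0 on success, -1 if the result does not fit in `out`. */
static int naive_concat(const char *base, const char *path,
                        char *out, size_t out_len)
{
    int n = snprintf(out, out_len, "%s%s", base, path);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

With the base `https://api.chatanywhere.tech` and the path `audio/speech`, `naive_concat` produces `https://api.chatanywhere.techaudio/speech`, and `url_host` then extracts `api.chatanywhere.techaudio`, the same name that appears in the dns_enqueue lines.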

Error logs or terminal output

W (19126) wifi:exceed max band, 2g, ngroup:3
W (20355) wifi:exceed max band, 2g, ngroup:3
W (21584) wifi:exceed max band, 2g, ngroup:3
I (22506) app_sr: wakeword detected
I (22506) app_audio: ### record Start
I (22506) ui_ctrl: Swich to panel[1]
I (22530) app_audio: frame_rate= 16000, ch=2, width=16
I (22533) I2S_IF: Pending out channel for in channel running
E (22548) i2s_common: i2s_channel_disable(1121): the channel has not been enabled yet
I (22548) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (22553) I2S_IF: STD Mode 1 bits:16/16 channel:2 sample_rate:16000 mask:3
I (22601) app_sr: AFE_FETCH_CHANNEL_VERIFIED, channel index: 2

I (22646) Adev_Codec: Open codec device OK
E (22646) i2s_common: i2s_channel_disable(1121): the channel has not been enabled yet
I (22648) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (22654) I2S_IF: STD Mode 0 bits:16/16 channel:2 sample_rate:16000 mask:3
I (22666) ES7210: Bits 16
I (22674) ES7210: Enable ES7210_INPUT_MIC1
I (22675) ES7210: Enable ES7210_INPUT_MIC2
I (22678) ES7210: Unmuted
I (22678) Adev_Codec: Open codec device OK
W (22812) wifi:exceed max band, 2g, ngroup:3
W (22915) wifi:exceed max band, 2g, ngroup:3
W (24144) wifi:exceed max band, 2g, ngroup:3
W (25372) wifi:exceed max band, 2g, ngroup:3
I (26335) app_audio: ESP_MN_STATE_TIMEOUT
I (26335) app_audio: ### record Stop, 65472 63K
I (26378) OpenAI: OpenAI create, version: 1.0.0
I (26379) ui_ctrl: Swich to panel[2]
I (26378) app_audio: Player PLAYING
dns_enqueue: "api.chatanywhere.techaudio": use DNS entry 0
udp_bind(ipaddr = 0.0.0.0, port = 37872)
udp_bind: bound to 0.0.0.0, port 37872)
dns_enqueue: "api.chatanywhere.techaudio": use DNS pcb 0
dns_send: dns_servers[0] "api.chatanywhere.techaudio": request
sending DNS request ID 26982 for name "api.chatanywhere.techaudio" to server 0
udp_send
udp_send: added header in given pbuf 0x3c3f48c4
udp_send: sending datagram of length 52
udp_send: UDP packet length 52
udp_send: UDP checksum 0x8698
udp_send: ip_output_if (,,,,0x11,)
udp_input: received datagram of length 52
UDP header:
+-------------------------------+
|        53     |     37872     | (src port, dest port)
+-------------------------------+
|        52     |     0x1803    | (len, chksum)
+-------------------------------+
udp (192.168.230.84, 37872) <-- (192.168.230.207, 53)
pcb (0.0.0.0, 37872) <-- (0.0.0.0, 0)
I (26459) I2S_IF: Pending out channel for in channel running
pcb (0.0.0.0, 68) <-- (0.0.0.0, 67)
udp_input: calculating checksum
dns_recv: "api.chatanywhere.techaudio": error in flags
E (26477) esp-tls: couldn't get hostname for :api.chatanywhere.techaudio: getaddrinfo() returns 202, addrinfo=0x0
E (26488) esp-tls: Failed to open new connection
E (26493) transport_base: Failed to open a new connection
E (26499) HTTP_CLIENT: Connection failed, sock < 0
E (26506) OpenAI: ./managed_components/espressif__openai/OpenAI.c:2390 (OpenAI_Request):Failed to open client!
E (26513) i2s_common: i2s_channel_disable(1121): the channel has not been enabled yet
I (26524) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (26531) I2S_IF: STD Mode 1 bits:16/16 channel:2 sample_rate:16000 mask:3
E (26521) OpenAI: ./managed_components/espressif__openai/OpenAI.c:2080 (OpenAI_AudioTranscriptionFile):Empty result!
I (26550) ui_ctrl: update listen speak
E (26555) app_main: start_openai(85): [audioTranscription]: invalid url
W (26601) wifi:exceed max band, 2g, ngroup:3
I (26644) Adev_Codec: Open codec device OK
E (26644) i2s_common: i2s_channel_disable(1121): the channel has not been enabled yet
I (26646) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (26653) I2S_IF: STD Mode 0 bits:16/16 channel:2 sample_rate:16000 mask:3
I (26666) ES7210: Bits 16
I (26670) ES7210: Enable ES7210_INPUT_MIC1
I (26671) ES7210: Enable ES7210_INPUT_MIC2
I (26676) ES7210: Unmuted
I (26676) Adev_Codec: Open codec device OK
W (26704) wifi:exceed max band, 2g, ngroup:3
dns_tmr: dns_check_entries
W (27932) wifi:exceed max band, 2g, ngroup:3
W (29161) wifi:exceed max band, 2g, ngroup:3
I (30108) app_audio: Player IDLE
I (30111) I2S_IF: Pending out channel for in channel running
E (30138) i2s_common: i2s_channel_disable(1121): the channel has not been enabled yet
I (30138) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (30143) I2S_IF: STD Mode 1 bits:16/16 channel:2 sample_rate:16000 mask:3
I (30241) Adev_Codec: Open codec device OK
E (30241) i2s_common: i2s_channel_disable(1121): the channel has not been enabled yet
I (30243) I2S_IF: channel mode 0 bits:16/16 channel:2 mask:3
I (30249) I2S_IF: STD Mode 0 bits:16/16 channel:2 sample_rate:16000 mask:3
I (30257) ES7210: Bits 16
I (30263) ES7210: Enable ES7210_INPUT_MIC1
I (30266) ES7210: Enable ES7210_INPUT_MIC2
I (30273) ES7210: Unmuted
I (30273) Adev_Codec: Open codec device OK
I (30278) app_main: replay audio end
W (30390) wifi:exceed max band, 2g, ngroup:3
W (31619) wifi:exceed max band, 2g, ngroup:3
W (32848) wifi:exceed max band, 2g, ngroup:3


Steps to reproduce the behavior

#

Project release version

latest esp-idf, with ESP32-S3-BOX-3

System architecture

Intel/AMD 64-bit (modern PC, older Mac)

Operating system

Linux

Operating system version

Archlinux

Shell

Fish

Additional context

No response

lilian-lilifox commented 2 months ago

Screenshot_20240910_192603 Screenshot_20240910_194044

Before calling OpenAIChangeBaseURL() it seems to be correct:

Screenshot_20240910_193851

lilian-lilifox commented 2 months ago

Screenshot_20240910_212114

It's not exactly here... the problem must be somewhere further down...

Screenshot_20240910_212435

lilian-lilifox commented 2 months ago

Screenshot_20240910_215542

The base URL stored in OpenAI_t openai is not corrupted. I think the mistake might be in the functions that handle DNS requests.

lilian-lilifox commented 2 months ago

I found that I need to add a slash at the end of the URL, like this: https://api.chatanywhere.tech/. The demo then requests https://api.chatanywhere.tech/audio/speech/, and unfortunately my source doesn't support this request.
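
One way to make the request URL independent of whether the configured base URL ends in a slash is to normalize the join. The sketch below is an assumption about how such a fix could look, not the component's actual code; `join_url` is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the esp-box or OpenAI component API):
 * joins `base` and `path` with exactly one '/' between them, regardless of
 * trailing slashes on `base` or leading slashes on `path`.
 * Returns 0 on success, -1 if the result does not fit in `out`. */
static int join_url(const char *base, const char *path,
                    char *out, size_t out_len)
{
    size_t base_len = strlen(base);

    /* Ignore any trailing slashes on the base URL. */
    while (base_len > 0 && base[base_len - 1] == '/') {
        base_len--;
    }
    /* Skip any leading slashes on the path. */
    while (*path == '/') {
        path++;
    }

    int n = snprintf(out, out_len, "%.*s/%s", (int)base_len, base, path);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Under this sketch, both `https://api.chatanywhere.tech` and `https://api.chatanywhere.tech/` joined with `audio/speech` give `https://api.chatanywhere.tech/audio/speech`, so the hostname handed to DNS stays `api.chatanywhere.tech`. It does not help when the server itself rejects the audio endpoint.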

Horion0415 commented 2 months ago

So, the example https://api.chatanywhere.tech/ requires a / to work, but your source does not support adding a /, correct? Is my understanding accurate? Do you have to use this source? Maybe you could try a different source, and I will investigate why this issue occurred on my end.

lilian-lilifox commented 2 months ago

@Horion0415 This ChatGPT demo sends our voice directly to OpenAI. GPT processes the audio and responds with text, and the response text is then converted to speech by TTS. To use OpenAI's audio service, the demo requests https://baseurl/ (set by us) + audio/speech/ (implied by the OpenAI API). Some third-party sources don't support the audio service. I think I will try a different source.

Horion0415 commented 2 months ago

Okay, I understand. In fact, I have found that there are often issues when using ChatAnywhere, and we will try to resolve this problem later. In the meantime, you can try using other sources for testing.

lilian-lilifox commented 2 months ago

Because of the region policy, I cannot use the official service. Another solution is STT + GPT (text) + TTS: run speech-to-text first, then send the request to GPT as text. This is a more general approach, but it may lead to longer response times and lower accuracy.
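
The STT + GPT (text) + TTS idea can be sketched as a chain of independent stages, so that each stage could point at a different provider and a failure can be traced to a single stage. This is only a structural sketch: the three `stub_*` stages are hypothetical stand-ins for real network calls, and strings stand in for audio buffers.

```c
#include <stdio.h>
#include <string.h>

#define BUF_LEN 256

/* A stage maps an input string to an output string; returns 0 on success. */
typedef int (*stage_fn)(const char *in, char *out, size_t out_len);

/* Hypothetical stand-ins for speech-to-text, text chat and text-to-speech. */
static int stub_stt(const char *audio, char *out, size_t out_len)
{
    (void)audio; /* a real stage would upload the recording */
    int n = snprintf(out, out_len, "hello");
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

static int stub_chat(const char *text, char *out, size_t out_len)
{
    int n = snprintf(out, out_len, "reply to: %s", text);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

static int stub_tts(const char *text, char *out, size_t out_len)
{
    int n = snprintf(out, out_len, "<audio for: %s>", text);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Runs the stages in order, feeding each output into the next stage.
 * Returns 0 on success, the 1-based index of the first failing stage,
 * or -1 if the input or final output does not fit its buffer. */
static int run_pipeline(const stage_fn *stages, size_t count,
                        const char *input, char *out, size_t out_len)
{
    char cur[BUF_LEN];
    char next[BUF_LEN];

    int n = snprintf(cur, sizeof cur, "%s", input);
    if (n < 0 || (size_t)n >= sizeof cur) {
        return -1;
    }
    for (size_t i = 0; i < count; i++) {
        if (stages[i](cur, next, sizeof next) != 0) {
            return (int)i + 1;
        }
        memcpy(cur, next, sizeof cur);
    }
    n = snprintf(out, out_len, "%s", cur);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Running the three stubs in the order STT, chat, TTS on any input yields `<audio for: reply to: hello>`. Each extra stage is a separate round trip in a real deployment, which is where the longer response time mentioned above would come from.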