acheong08 / ChatGPTProxy

Simple Cloudflare bypass for ChatGPT
The Unlicense
1.32k stars 326 forks

reduce message latency by introducing flushing operation #60

Closed wangjiyang closed 1 year ago

wangjiyang commented 1 year ago

The previous implementation uses gin.Stream to copy messages to the proxy client. This can leave messages buffered on the server side, which adds latency: the proxy client receives a batch of messages all at once, giving a worse experience than OpenAI's official chatbot. With this commit, output on the proxy client becomes smoother.

wangjiyang commented 1 year ago

Please check whether this commit improves the user experience in streaming mode. Thanks.

acheong08 commented 1 year ago

The speedup is minimal but it does make it slightly faster

acheong08 commented 1 year ago

Actually, the chunk sometimes makes it slower

acheong08 commented 1 year ago

gin.Stream streams it as it arrives while this sends things in chunks. Might be slightly better if it streams by line

wangjiyang commented 1 year ago

In fact, gin.Stream has an internal buffer: it caches the data read from the remote and then sends it to the client. Below is my strace output. With gin.Stream, multiple reads result in a single actual write operation. This increases throughput but adds latency. If we add a flush operation, however, each read results in an actual write, which definitely sends the buffer to the client immediately and makes the output smooth.

I attached my strace files for your reference.

Streaming strace output:

2192667 epoll_pwait(4,  <unfinished ...>
2192665 epoll_pwait(4,  <unfinished ...>
2192667 <... epoll_pwait resumed>[], 128, 0, NULL, 2) = 0
2192665 <... epoll_pwait resumed>[], 128, 0, NULL, 2) = 0
2192667 epoll_pwait(4,  <unfinished ...>
2192665 futex(0xd04d08, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
2192666 <... nanosleep resumed>NULL)    = 0
2192666 futex(0xd07d38, FUTEX_WAIT_PRIVATE, 0, {tv_sec=60, tv_nsec=0} <unfinished ...>
2192667 <... epoll_pwait resumed>[{events=EPOLLIN|EPOLLOUT, data={u32=2658360824, u64=139756599205368}}], 128, 346616, NULL, 2150252670990740) = 1
2192667 futex(0xd07d38, FUTEX_WAKE_PRIVATE, 1) = 1
2192666 <... futex resumed>)            = 0
2192667 read(8,  <unfinished ...>
2192666 nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
2192667 <... read resumed>"\27\3\3\3\375\236:\275\267vQ\250\324\233\330\7\335\6\35MO\20\2\376\4v|p\304f>h\n\305T\256^xp\0c\224-3~\261l+\231[\251\306\25\331\203\583\17\226\307\202\245\343\214\330k\266\343\252\234-\363\211_^0$>\317\347\325Or\355\306~\270&\220\373O\316L\235\231:\214^\322I\303\353t\32\204\202\204\2006\363aM\333\241K\33\35\334\0167\23C^\205T\244/\356\252\211ZO\2565\347\341\360\367\16\345:\213#\353\351{\327\f\26\23\356\264\257\3319.\247\25\362\t\0300p\226\256\243\25m~\2157B\357S\235\1v\37\2\\\274\365\35\36o5\341\4Y\242\351\2#\5V$Ej\220,\262\177rM\335\323N\253\231\334\24\236\357\355\225\205#\207\252\10\272\233\370\31\357^[\nt\316\225s\324\364GUd\10\327\217\226e{\237\7`\200,[\211\342\353\345D\""..., 3156) = 1026
2192667 futex(0xd04d08, FUTEX_WAKE_PRIVATE, 1) = 1
2192666 <... nanosleep resumed>NULL)    = 0
2192665 <... futex resumed>)            = 0
2192667 read(8,  <unfinished ...>
2192665 epoll_pwait(4,  <unfinished ...>
2192667 <... read resumed>0xc000428000, 3156) = -1 EAGAIN (Resource temporarily unavailable)
2192665 <... epoll_pwait resumed>[], 128, 0, NULL, 0) = 0
2192666 nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
2192665 epoll_pwait(4,  <unfinished ...>
2192667 epoll_pwait(4, [], 128, 0, NULL, 2) = 0
2192667 futex(0xc000022d48, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
2192666 <... nanosleep resumed>NULL)    = 0
2192666 futex(0xd07d38, FUTEX_WAIT_PRIVATE, 0, {tv_sec=60, tv_nsec=0} <unfinished ...>
2192665 <... epoll_pwait resumed>[{events=EPOLLIN|EPOLLOUT, data={u32=2658360824, u64=139756599205368}}], 128, 346540, NULL, 2150252670990740) = 1
2192665 futex(0xd07d38, FUTEX_WAKE_PRIVATE, 1) = 1
2192666 <... futex resumed>)            = 0
2192665 read(8, "\27\3\3\4\3\20\372$+Lu@\21a\333\3623\376\354H\370u-\353W\373e\177\226!|R\372\263H\234\371\301\7.j'U|8\362s\224\262L'\315\300q\236\363,X\36\31\221\260#ib\203\33[(\214\203\322D\335\313\320x\3060\3763\6\311\254\20pw\240T\375l\323(lh\367\27mT\373\334\263#\0\270\341e*\26{(\247\211\2102j\27f%G\265\360Rw\327\264'KG\r&%\177\21\10q#\303\352ZNy\354\346<\37\311\246\266|\330\31\t\267%m\313\2143bi'\33\365}\27X\272'>6b\203\16\375\203\230\\\366\3173\256W\351\311\"\23\366N\242\235\200\331\273\312\276\0052\231\"\327!\345\7\253\202>$\311fq\304\352\344\n\217\335\260\355\2065\367DL\0\360\342\252\36\33cC\276\373G\352Y\357\356\353\353\226\16%5\250\371\32\254\373\334\321\232t\v\276"..., 3156) = 1032
2192666 nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
2192665 futex(0xc000022d48, FUTEX_WAKE_PRIVATE, 1) = 1
2192667 <... futex resumed>)            = 0
2192667 epoll_pwait(4,  <unfinished ...>
2192665 read(8,  <unfinished ...>
2192667 <... epoll_pwait resumed>[], 128, 0, NULL, 0) = 0
2192665 <... read resumed>0xc000428000, 3156) = -1 EAGAIN (Resource temporarily unavailable)
2192667 epoll_pwait(4,  <unfinished ...>
2192666 <... nanosleep resumed>NULL)    = 0
2192665 epoll_pwait(4,  <unfinished ...>
2192667 <... epoll_pwait resumed>[], 128, 0, NULL, 2) = 0
2192665 <... epoll_pwait resumed>[], 128, 0, NULL, 2) = 0
2192667 epoll_pwait(4,  <unfinished ...>
2192666 nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
2192665 futex(0xd04d08, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
2192666 <... nanosleep resumed>NULL)    = 0
2192666 futex(0xd07d38, FUTEX_WAIT_PRIVATE, 0, {tv_sec=60, tv_nsec=0} <unfinished ...>
2192667 <... epoll_pwait resumed>[{events=EPOLLIN|EPOLLOUT, data={u32=2658360824, u64=139756599205368}}], 128, 346461, NULL, 2150252670990740) = 1
2192667 futex(0xd07d38, FUTEX_WAKE_PRIVATE, 1) = 1
2192667 read(8,  <unfinished ...>
2192666 <... futex resumed>)            = 0
2192667 <... read resumed>"\27\3\3\4\tf\312\262\333H4\5\250\323\35\223\261tCGg\363\377\244\342\223o\364\336f\263D\254>.r3rN\311\324\3249\304\305\372\\\276\316q_\325\3 Q\347\254\3\4\370I\7\271b\341\333\37J\226\30\355\250T\302b\226\323\354\227%\252\333\240\272N\336\347\215I\362?\336\214\261\331\17v\215X\244\r\23\261\275?a\227\26\"\356\205\36\374XV\222\240a\237\210\2\211\345JS\320ha/U\360\1|\222\232h\371\243\352I\313_\230g-\363\212\267\360\236\3\347\342,\301\2616\361\351\267\4\325\343\333\323\250\3478WL\377\370(\25\262\306\342\231l\251.\341-\331 \307\t!CAz\367\270\fP#8\367\202pA\355\251c\3+\302w0\0321\231\240\237\200\364\59\264\213O\220\324\326~\364;z\1\v]HR\351s7/\365\225\227\236r\177\224\245`\1u\371T(\252\366\253\226\241"..., 3156) = 1038
2192666 nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
2192667 futex(0xd04d08, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
2192665 <... futex resumed>)            = 0
2192667 <... futex resumed>)            = 1
2192665 epoll_pwait(4,  <unfinished ...>
2192667 read(8,  <unfinished ...>
2192666 <... nanosleep resumed>NULL)    = 0
2192665 <... epoll_pwait resumed>[], 128, 0, NULL, 0) = 0
2192666 nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
2192667 <... read resumed>0xc000428000, 3156) = -1 EAGAIN (Resource temporarily unavailable)
2192665 write(7, "7218774a7e2\", \"error\": null}\n\ndata: {\"message\": {\"id\": \"aad863f4-bef6-4a7a-861d-64438d729511\", \"author\": {\"role\": \"assistant\", \"name\": null, \"metadata\": {}}, \"create_time\": 1683434967.675813, \"update_time\": null, \"content\": {\"content_type\": \"text\", \"parts\""..., 4096 <unfinished ...>

Flush strace output:

2192667 epoll_pwait(4,  <unfinished ...>
2192665 epoll_pwait(4,  <unfinished ...>
2192667 <... epoll_pwait resumed>[], 128, 0, NULL, 2) = 0
2192665 <... epoll_pwait resumed>[], 128, 0, NULL, 2) = 0
2192667 epoll_pwait(4,  <unfinished ...>
2192665 futex(0xd04d08, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
2192666 <... nanosleep resumed>NULL)    = 0
2192666 futex(0xd07d38, FUTEX_WAIT_PRIVATE, 0, {tv_sec=60, tv_nsec=0} <unfinished ...>
2192667 <... epoll_pwait resumed>[{events=EPOLLIN|EPOLLOUT, data={u32=2658360824, u64=139756599205368}}], 128, 346616, NULL, 2150252670990740) = 1
2192667 futex(0xd07d38, FUTEX_WAKE_PRIVATE, 1) = 1
2192666 <... futex resumed>)            = 0
2192667 read(8,  <unfinished ...>
2192666 nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
2192667 <... read resumed>"\27\3\3\3\375\236:\275\267vQ\250\324\233\330\7\335\6\35MO\20\2\376\4v|p\304f>h\n\305T\256^xp\0c\224-3~\261l+\231[\251\306\25\331\203\583\17\226\307\202\245\343\214\330k\266\343\252\234-\363\211_^0$>\317\347\325Or\355\306~\270&\220\373O\316L\235\231:\214^\322I\303\353t\32\204\202\204\2006\363aM\333\241K\33\35\334\0167\23C^\205T\244/\356\252\211ZO\2565\347\341\360\367\16\345:\213#\353\351{\327\f\26\23\356\264\257\3319.\247\25\362\t\0300p\226\256\243\25m~\2157B\357S\235\1v\37\2\\\274\365\35\36o5\341\4Y\242\351\2#\5V$Ej\220,\262\177rM\335\323N\253\231\334\24\236\357\355\225\205#\207\252\10\272\233\370\31\357^[\nt\316\225s\324\364GUd\10\327\217\226e{\237\7`\200,[\211\342\353\345D\""..., 3156) = 1026
2192667 futex(0xd04d08, FUTEX_WAKE_PRIVATE, 1) = 1
2192666 <... nanosleep resumed>NULL)    = 0
2192665 <... futex resumed>)            = 0
2192667 read(8,  <unfinished ...>
2192665 epoll_pwait(4,  <unfinished ...>
2192667 <... read resumed>0xc000428000, 3156) = -1 EAGAIN (Resource temporarily unavailable)
2192665 <... epoll_pwait resumed>[], 128, 0, NULL, 0) = 0
2192666 nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
2192665 epoll_pwait(4,  <unfinished ...>
2192667 epoll_pwait(4, [], 128, 0, NULL, 2) = 0
2192667 futex(0xc000022d48, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
2192666 <... nanosleep resumed>NULL)    = 0
2192666 futex(0xd07d38, FUTEX_WAIT_PRIVATE, 0, {tv_sec=60, tv_nsec=0} <unfinished ...>
2192665 <... epoll_pwait resumed>[{events=EPOLLIN|EPOLLOUT, data={u32=2658360824, u64=139756599205368}}], 128, 346540, NULL, 2150252670990740) = 1
2192665 futex(0xd07d38, FUTEX_WAKE_PRIVATE, 1) = 1
2192666 <... futex resumed>)            = 0
2192665 read(8, "\27\3\3\4\3\20\372$+Lu@\21a\333\3623\376\354H\370u-\353W\373e\177\226!|R\372\263H\234\371\301\7.j'U|8\362s\224\262L'\315\300q\236\363,X\36\31\221\260#ib\203\33[(\214\203\322D\335\313\320x\3060\3763\6\311\254\20pw\240T\375l\323(lh\367\27mT\373\334\263#\0\270\341e*\26{(\247\211\2102j\27f%G\265\360Rw\327\264'KG\r&%\177\21\10q#\303\352ZNy\354\346<\37\311\246\266|\330\31\t\267%m\313\2143bi'\33\365}\27X\272'>6b\203\16\375\203\230\\\366\3173\256W\351\311\"\23\366N\242\235\200\331\273\312\276\0052\231\"\327!\345\7\253\202>$\311fq\304\352\344\n\217\335\260\355\2065\367DL\0\360\342\252\36\33cC\276\373G\352Y\357\356\353\353\226\16%5\250\371\32\254\373\334\321\232t\v\276"..., 3156) = 1032
2192666 nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
2192665 futex(0xc000022d48, FUTEX_WAKE_PRIVATE, 1) = 1
2192667 <... futex resumed>)            = 0
2192667 epoll_pwait(4,  <unfinished ...>
2192665 read(8,  <unfinished ...>
2192667 <... epoll_pwait resumed>[], 128, 0, NULL, 0) = 0
2192665 <... read resumed>0xc000428000, 3156) = -1 EAGAIN (Resource temporarily unavailable)
2192667 epoll_pwait(4,  <unfinished ...>
2192666 <... nanosleep resumed>NULL)    = 0
2192665 epoll_pwait(4,  <unfinished ...>
2192667 <... epoll_pwait resumed>[], 128, 0, NULL, 2) = 0
2192665 <... epoll_pwait resumed>[], 128, 0, NULL, 2) = 0
2192667 epoll_pwait(4,  <unfinished ...>
2192666 nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
2192665 futex(0xd04d08, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
2192666 <... nanosleep resumed>NULL)    = 0
2192666 futex(0xd07d38, FUTEX_WAIT_PRIVATE, 0, {tv_sec=60, tv_nsec=0} <unfinished ...>
2192667 <... epoll_pwait resumed>[{events=EPOLLIN|EPOLLOUT, data={u32=2658360824, u64=139756599205368}}], 128, 346461, NULL, 2150252670990740) = 1
2192667 futex(0xd07d38, FUTEX_WAKE_PRIVATE, 1) = 1
2192667 read(8,  <unfinished ...>
2192666 <... futex resumed>)            = 0
2192667 <... read resumed>"\27\3\3\4\tf\312\262\333H4\5\250\323\35\223\261tCGg\363\377\244\342\223o\364\336f\263D\254>.r3rN\311\324\3249\304\305\372\\\276\316q_\325\3 Q\347\254\3\4\370I\7\271b\341\333\37J\226\30\355\250T\302b\226\323\354\227%\252\333\240\272N\336\347\215I\362?\336\214\261\331\17v\215X\244\r\23\261\275?a\227\26\"\356\205\36\374XV\222\240a\237\210\2\211\345JS\320ha/U\360\1|\222\232h\371\243\352I\313_\230g-\363\212\267\360\236\3\347\342,\301\2616\361\351\267\4\325\343\333\323\250\3478WL\377\370(\25\262\306\342\231l\251.\341-\331 \307\t!CAz\367\270\fP#8\367\202pA\355\251c\3+\302w0\0321\231\240\237\200\364\59\264\213O\220\324\326~\364;z\1\v]HR\351s7/\365\225\227\236r\177\224\245`\1u\371T(\252\366\253\226\241"..., 3156) = 1038
2192666 nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
2192667 futex(0xd04d08, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
2192665 <... futex resumed>)            = 0
2192667 <... futex resumed>)            = 1
2192665 epoll_pwait(4,  <unfinished ...>
2192667 read(8,  <unfinished ...>
2192666 <... nanosleep resumed>NULL)    = 0
2192665 <... epoll_pwait resumed>[], 128, 0, NULL, 0) = 0
2192666 nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
2192667 <... read resumed>0xc000428000, 3156) = -1 EAGAIN (Resource temporarily unavailable)
2192665 write(7, "7218774a7e2\", \"error\": null}\n\ndata: {\"message\": {\"id\": \"aad863f4-bef6-4a7a-861d-64438d729511\", \"author\": {\"role\": \"assistant\", \"name\": null, \"metadata\": {}}, \"create_time\": 1683434967.675813, \"update_time\": null, \"content\": {\"content_type\": \"text\", \"parts\""..., 4096 <unfinished ...>
acheong08 commented 1 year ago

Hmm, OK. The difference is minimal: the latency seems to be only a few ms, whereas the fetch time from the API varies significantly, which may be why my measurements were off. I'll merge again.

wangjiyang commented 1 year ago

flush.trace.txt stream.trace.txt

acheong08 commented 1 year ago

flush.trace.txt stream.trace.txt

thanks